Reading Pills
Deep dives into AI, made simple.
Excited to share our latest podcast episode on Generative AI.
Bhabajeet and I had a fantastic conversation about understanding Gen AI concepts and the impact of this technology on various sectors.
Tune in to learn more!
The AI Podcast for Beginners - Simplifying Artificial Intelligence for Newbies - Amit Sharma Ji #ai
Amit Ji's social media links -
LinkedIn - https://www.linkedin.com/in/amitsharma1511/
Twitter/X - https://twitter.com/amitsharma1514/
Blog - https://readingpil...
Mistral AI launched their first-ever code model, Codestral.
It has been trained on a dataset spanning 80+ programming languages, including the most popular ones, such as Python, Java, C, C++, JavaScript, and Bash.
It's a 22B-parameter model.
It can be used to build advanced AI applications for software developers.
It can complete coding functions, write tests, and fill in any partial code using a fill-in-the-middle (FIM) mechanism.
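To make the FIM idea concrete, here is a minimal Python sketch. The complete_fim helper is hypothetical (stubbed with a canned answer so the script runs); the real Codestral API or client call will differ, so treat this purely as an illustration of the prefix/suffix mechanism.

```python
# Illustration of fill-in-the-middle (FIM): the model receives the code
# before and after a gap and generates only the missing middle.
# complete_fim is a hypothetical helper, stubbed so the sketch runs;
# it is NOT Mistral's actual API.

def complete_fim(prefix: str, suffix: str) -> str:
    """Hypothetical call to a FIM-capable code model. Stubbed with a
    canned completion for demonstration purposes."""
    return "return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)"

prefix = "def fibonacci(n: int) -> int:\n    "
suffix = "\n\nprint(fibonacci(10))"

middle = complete_fim(prefix, suffix)
print(prefix + middle + suffix)  # the stitched-together, complete program
```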
Google added one more member to the Gemini model family: Med-Gemini, focused on the medical domain.
It shows promise for real-world tasks like -
- Medical diagnosis and information retrieval
- Summarization of medical text
- Generating referral letters
- Medical dialogue and education
There are a total of four models —
- Med-Gemini-S 1.0
- Med-Gemini-M 1.0
- Med-Gemini-L 1.0
- Med-Gemini-M 1.5
These models are multimodal, able to provide text, image, and video outputs; they are integrated with web search and have self-reflection training capabilities.
They are not yet available for public use.
The image demonstrates Med-Gemini-M 1.5's ability to analyze a chest X-ray (CXR) and conduct a hypothetical, realistic dialogue with a primary care physician.
Source: arXiv:2404.18416
Let's find out what LLMs think about each other and how unbiased their opinions are!
I asked Gemini, ChatGPT-3.5, Mixtral, Llama 3, and Gemma how each is better than the others; here is how it unfolded -
Gemini and ChatGPT-3.5 were honest that they cannot provide an unbiased opinion on this, and then listed what they are good at.
One interesting thing: Gemini was the only model that highlighted its weaknesses as well!
Gemma was like, let me tell you how I am superior to all the other existing models, contrasting itself with each of them. 😆
Mixtral was the honest kid in the classroom (not hurting anyone's sentiments): I don't know enough about the other models to compare them, but let me tell you what makes a model better than others.
Llama 3 70B boasted about its capabilities and turned diplomatic at the end: you know what, the others may be better too, but I have a unique combination of powers.
Check out the screenshots for the responses received.
The new Qwen1.5-110B model has launched, claiming performance on par with the recently launched Llama 3 70B.
It supports a context length of 32K tokens and is multilingual, covering a large number of languages including English, Chinese, French, Spanish, German, Russian, Korean, Japanese, Vietnamese, and Arabic.
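As a rough sketch, the chat variant can be loaded with Hugging Face transformers like any other causal LM. This assumes the Qwen/Qwen1.5-110B-Chat checkpoint and serious multi-GPU memory (a 110B model will not fit on a single consumer GPU); the same code works with the smaller Qwen1.5 sizes.

```python
# Sketch: chatting with Qwen1.5-110B-Chat via Hugging Face transformers.
# Assumes the Qwen/Qwen1.5-110B-Chat checkpoint and enough GPU memory;
# swap in a smaller Qwen1.5 size (e.g. 7B) for local experiments.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-110B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Which languages do you support?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```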
Microsoft launched the Phi 3 Mini 3.8B.
It has been trained on a mix of synthetic data and filtered, publicly available web data.
Smaller models like Phi 3 are well suited to use cases like -
1) memory/compute constrained environments
2) latency bound scenarios
3) strong reasoning (especially math and logic)
4) long context
It is available on Hugging Face and can be run locally.
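Phi 3 is also hosted on Ollama; assuming the default phi3 tag (which, to my knowledge, points to the Mini 3.8B variant), you can run it locally with --
ollama run phi3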
Llama 3 is smart; great to see its answers to my initial prompts.
I ran the 8B model locally using Ollama.
Below are the 2 questions I tried to get started -
1. How is Llama 2 better than Llama 3?
2. How does Meta come up with the Llama model names?
Results are in the image.
Llama 3 is available on Ollama in both 8B and 70B sizes; try it by running a simple command.
For 8B --
ollama run llama3:8b
For 70B --
ollama run llama3:70b
Meta launched its next-generation LLM, Llama 3, which can be accessed using Meta AI.
Available with 8B and 70B pretrained and instruction-tuned versions.
Not available in India yet. 🙄
Don't Trust AI Model Responses? CRITIC Ensures LLM Trustworthiness
While interacting with Large Language Models (LLMs), one thing you should keep in mind is that the response to your prompt may sometimes be flawed, factually incorrect, completely made up, or in some cases even offensive or harmful.
Unlike these AI models, we humans fact-check information against trusted external sources, like Wikipedia, an internet search, or a debugging tool in the case of code, to make sure we get things right.
Inspired by this human behavior, Self-Correcting with Tool-Interactive Critiquing (CRITIC) is a framework that lets these black-box AI models validate and improve their own output by interacting with external tools such as a search engine, just like humans do.
The CRITIC Process
In the attached figure, for a given input, the LLM first generates an initial output based on its parametric knowledge, then interacts with appropriate external tools through text-to-text APIs to verify that output. The critiques generated by the verification step are concatenated with the initial output and serve as feedback that allows the LLM to correct it. The "Verify ⇒ Correct ⇒ Verify" cycle iterates until a stopping condition is met.
- The language model outputs an initial answer to a question or task given in the prompt.
- CRITIC then interacts with "external tools" to check and verify that answer. For example, it might use a search engine to look up information and see if the answer is accurate and truthful.
- Based on the feedback from the tools, CRITIC identifies any problems or mistakes in the original answer.
- CRITIC then takes that feedback and uses it to help the language model generate an improved, corrected answer.
- This process can be repeated multiple times, with the language model continually refining its answer based on the external feedback until a stopping condition is met.
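Here is a minimal Python sketch of that loop. The llm() and search() helpers are hypothetical stand-ins for a real model call and a real search API (stubbed so the control flow runs); this illustrates the verify-correct cycle, not the paper's actual code.

```python
# Sketch of CRITIC's "Verify -> Correct -> Verify" cycle. llm() and
# search() are hypothetical stand-ins for a real model call and a real
# search API, stubbed so the control flow runs.

def llm(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"   # stub

def search(query: str) -> str:
    return f"[search results for: {query[:40]}...]"  # stub

def critic_loop(task: str, max_rounds: int = 3) -> str:
    answer = llm(task)  # initial output from parametric knowledge alone
    for _ in range(max_rounds):
        evidence = search(f"{task} {answer}")        # consult an external tool
        critique = llm(                              # verify against the evidence
            f"Task: {task}\nAnswer: {answer}\nEvidence: {evidence}\n"
            "List any factual problems, or reply VERIFIED if there are none."
        )
        if "VERIFIED" in critique:                   # stopping condition
            break
        answer = llm(                                # correct using the critique
            f"Task: {task}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite a corrected answer."
        )
    return answer

print(critic_loop("Who discovered penicillin?"))
```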
Benefits of CRITIC
- It helps the language model catch and fix its own mistakes, without relying on expensive human annotations or additional training.
- It allows the model to learn from interactions with the real world, rather than just its own internal knowledge.
- It makes the language model's outputs more reliable and trustworthy, by verifying the information against external sources.
Conclusion
Overall, CRITIC acts like a coach, giving the language model feedback to help it improve its answers rather than just accepting the first output. By giving LLMs a way to check their answers against external sources and fix any errors, it makes them more reliable and trustworthy over time.
Source
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing (arXiv:2305.11738)
"Hey, You Made a Mistake!": Coaching AI Agents with Verbal Feedback
Think of a large language model (LLM) like ChatGPT, Gemini, or Claude being trained to learn a task through trial and error. The traditional approach, reinforcement learning, uses a reward-and-punishment signal as data is processed, and this process can be slow and inefficient.
The Reflexion framework reinforces language models not by updating weights but through verbal feedback. After each iteration, the feedback explains what went wrong and how to improve. The LLM stores this feedback in a memory buffer and uses it when it encounters a similar situation later, to make a better decision.
The Reflexion process
It utilizes three distinct models: an Actor, which generates text and actions; an Evaluator, which scores the outputs produced by the Actor; and a Self-Reflection model, which generates verbal reinforcement cues to assist the Actor in self-improvement.
Actor
An LLM that is specifically prompted to generate the necessary text and actions.
Evaluator
This assesses the output generated by the Actor. It computes a reward score that reflects the Actor's performance within the given task context. This can be done in several ways -
-- Reward functions based on exact match (EM) grading, ensuring that the generated output aligns closely with the expected solution.
-- Pre-defined heuristic functions tailored to specific evaluation criteria, employed in decision-making tasks.
-- A different instantiation of an LLM itself as the Evaluator, generating rewards for decision-making and programming tasks.
This multi-faceted approach to Evaluator design allows examining different strategies for scoring generated outputs, offering insights into their effectiveness and suitability across a range of tasks.
Self-reflection
Generates verbal self-reflections that provide valuable feedback for future trials. It takes the reward signal (such as success or failure) from the Evaluator, the entire path the Actor took, and the past lessons learned from memory, then provides constructive feedback telling the Actor what to do differently. This verbal feedback is stored in the Actor's memory.
Memory
It has two components: short-term memory and long-term memory.
The short-term memory stores the recent history of the actions the Actor took, and the long-term memory stores the lessons learned in past attempts, in the form of the verbal feedback generated by the Self-Reflection model.
The importance of memory in Reflexion
Provides context: The short-term memory gives the Actor all the details about the current situation. This helps it understand what's happening right now.
Informs future actions: The long-term memory (verbal feedback) helps the Actor make better decisions in the future. It reminds the Actor of past mistakes and suggests better choices based on learned experiences.
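Putting the pieces together, here is a minimal Python sketch of one Reflexion run. The actor(), evaluate(), and reflect() helpers are hypothetical stand-ins for prompted LLM calls (stubbed so the loop runs); it illustrates the trial-evaluate-reflect cycle and the two memories, not the paper's actual code.

```python
# Sketch of the Reflexion trial loop: the Actor acts, the Evaluator
# scores, and the Self-Reflection step turns failures into verbal
# lessons stored in long-term memory. actor/evaluate/reflect are
# hypothetical stand-ins for prompted LLM calls, stubbed to run.

def actor(task, trajectory, lessons):
    return f"[attempt at '{task}' informed by {len(lessons)} past lessons]"  # stub

def evaluate(task, attempt):
    return 0.0  # stub reward; real Evaluators use exact match, heuristics, or an LLM

def reflect(task, trajectory, reward):
    return f"[lesson: what went wrong on '{task}' and what to try next]"  # stub

def reflexion(task, max_trials=3, threshold=0.9):
    long_term = []                     # lessons from past trials (verbal feedback)
    attempt = ""
    for _ in range(max_trials):
        short_term = []                # recent trajectory for the current trial
        attempt = actor(task, short_term, long_term)
        short_term.append(attempt)
        reward = evaluate(task, attempt)
        if reward >= threshold:        # success: stop early
            break
        # turn the failure into a verbal lesson for future trials
        long_term.append(reflect(task, short_term, reward))
    return attempt

print(reflexion("write a function that reverses a string"))
```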
Benefits of Reflexion
By getting specific advice after each try, the Actor can improve its decision-making much faster than with just a win/lose signal.
The Actor can use both its recent experience (trajectory) and past lessons (memory) to make better choices in the future.
Conclusion
By using both short-term (current situation) and long-term memory (past lessons), Reflexion agents can make informed decisions that are better than other AI approaches that rely only on the current situation. It's like having both the immediate details and the wisdom of experience to guide your actions.
Overall, the Self-reflection model is like a super-coach that helps the Actor learn from its mistakes and become a better decision-maker.
Source
Reflexion: Language Agents with Verbal Reinforcement Learning (arXiv:2303.11366)
Elevate Your Creativity with AI's Self-Improvement Powers
You may have used various language models like ChatGPT, Claude, or Gemini and encountered less-than-satisfactory responses on your first try. But what if the AI could provide critical feedback on its own output and then improve its response? This is the core idea behind the SELF-REFINE method.
The SELF-REFINE approach aims to automate the task of providing feedback, critiquing its own output, and improving the response, all without requiring any additional training for the model. This is similar to how humans refine their writing using iterative feedback.
The SELF-REFINE Approach:
1. Initial Generation: Given a prompt, the language model generates an initial output.
2. Feedback: The same language model is then asked to provide specific and actionable feedback on its own initial output, identifying what could be improved.
3. Refinement: The language model then uses the feedback it provided to refine and improve the initial output.
4. Iteration: The process can be repeated multiple times until a stopping condition is met.
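A minimal Python sketch of the loop, where a single hypothetical llm() helper (stubbed here so the code runs) plays both the generator and the critic; the real method is driven by carefully designed prompts rather than this toy stopping check.

```python
# Sketch of SELF-REFINE: one model generates, critiques its own draft,
# and refines it iteratively. llm() is a hypothetical stand-in for a
# real model call, stubbed so the loop runs end to end.

def llm(prompt: str) -> str:
    return f"[model output for: {prompt[:50]}...]"  # stub

def self_refine(task: str, max_iters: int = 3) -> str:
    output = llm(task)                     # 1. initial generation
    for _ in range(max_iters):
        feedback = llm(                    # 2. self-feedback on the draft
            f"Task: {task}\nDraft: {output}\n"
            "Give specific, actionable feedback, or say DONE if it needs no changes."
        )
        if "DONE" in feedback:             # 4. stopping condition
            break
        output = llm(                      # 3. refine using that feedback
            f"Task: {task}\nDraft: {output}\nFeedback: {feedback}\nRevise the draft."
        )
    return output

print(self_refine("Write a friendly product announcement for a new app."))
```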
Benefits of Self Refine:
- No additional training or supervision is required.
- Leverages the language model's own capabilities for feedback and refinement.
- Delivers significant performance improvements across diverse tasks.
Conclusion:
The SELF-REFINE approach represents an exciting advancement in how we can leverage the power of large language models to generate high-quality outputs. By enabling these models to provide feedback on their own initial generations and then refine them iteratively, we unlock a new level of performance that goes beyond what a single pass of generation can achieve.
Blog: https://readingpills.com/blog/agentic-design-patterns-self-refine
Source:
Self-Refine: Iterative Refinement with Self-Feedback (arXiv:2303.17651) - https://arxiv.org/abs/2303.17651
Craft Flawless Prompts for Any AI Task: Mastering the Iterative Approach
You've likely encountered many articles or social media posts with titles like "50 prompts everyone should know" or "Top 20 prompts to get the best output from LLMs". When it comes to prompting, though, there's no one-size-fits-all approach: the tasks you're trying to accomplish vary from person to person and situation to situation.
To achieve the desired output, the best approach is to develop your own process for prompting through iteration and experimentation.
Let's look at an example:
Try the first prompt
Help me rewrite this paragraph: [.......]
See how it turned out! If it's not quite right, try tweaking the prompt for better results.
Improved prompt
Correct any spelling and grammatical errors in this: [......]
Evaluate the output once again; you may be able to improve it further.
Further improved prompt
Correct any spelling and grammatical errors in this, and rewrite it in a tone appropriate for a professional cover letter: [....]
The prompting process -
1. Be clear and specific in prompt
2. Evaluate the result and think about why the desired output isn't coming
3. Refine the prompt
4. Repeat
AI Revolution: How Agentic Workflows Will Change Everything You Do
Agentic vs Non-Agentic AI workflow
Suppose I ask you to write an essay on a certain topic, but on the condition that you write it in one go, expressing whatever first comes to your mind, with no backspace allowed, while still expecting a best-quality result.
Most of the LLMs we use at present work like this: in zero-shot mode, the model is prompted to generate the final output without revising its work. Although this is a difficult task, LLMs do it well. This is the non-agentic workflow.
Continuing the example, if I allow you to think it over, iterate on ideas, and rewrite the essay, you are going to do much better and produce higher-quality output. Similarly, with an agentic workflow we can ask the LLM to iterate over the topic many times.
How can this help achieve more efficiency?
Imagine you have a research paper to write. You use a search engine like Google Scholar, which is a non-agentic AI tool: you give it a specific query (like "AI and health care") and it returns results (research papers on the topic). You then have to sort through those results, reading and analyzing them yourself to write your paper. That is entirely human effort. Here is how AI agents can help you excel at this -
The Future of AI: Agentic Workflows
The AI acts more like an assistant or collaborator. Continuing the research paper example, an agentic AI tool might:
- Search for relevant papers like the non-agentic search engine, do a web search and gather more information.
- Analyze the papers and summarize the key findings.
- Identify important arguments and opposing viewpoints.
- Even help you write a draft of your paper.
- Read over the first draft to spot unjustified arguments or extraneous information.
- Revise the draft, taking into account any shortcomings spotted.
- and continue further...
Basically, the AI takes initiative and does more than just respond to your specific commands.
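As a rough Python sketch (every helper below is a hypothetical stand-in for a prompted LLM call or tool API, stubbed so the pipeline runs), the research example above becomes a chained, iterating workflow rather than a single query-response:

```python
# Sketch of an agentic research workflow: search, summarize, draft,
# critique, revise. Every helper is a hypothetical stand-in for a
# prompted LLM call or a tool API, stubbed so the pipeline runs.

def search_papers(topic):
    return [f"[paper 1 about {topic}]", f"[paper 2 about {topic}]"]  # stub tool call

def llm(prompt):
    return f"[model output for: {prompt[:40]}...]"  # stub model call

def research_agent(topic, revisions=2):
    papers = search_papers(topic)                           # gather sources
    notes = [llm(f"Summarize the key findings of: {p}") for p in papers]
    draft = llm(f"Draft a paper on {topic} using these notes: {notes}")
    for _ in range(revisions):                              # iterate, don't one-shot
        critique = llm(f"Spot unjustified arguments or fluff in: {draft}")
        draft = llm(f"Revise the draft.\nDraft: {draft}\nCritique: {critique}")
    return draft

print(research_agent("AI and health care"))
```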
This was presented recently by Andrew Ng at Sequoia Capital.
Unlocking the future of creativity: What are the key terms in generative AI?
Generative AI
AI systems that can generate new content like text, images, etc.
Large language models
AI models trained on massive amounts of text data to understand and generate natural language.
GPT-3, GPT-4, Gemini, Devin
Examples of large language models, created by OpenAI, Google, and Cognition respectively.
Neural networks
Computing systems modeled on the human brain, used in deep learning.
Machine learning
The use of algorithms and neural networks that learn from data.
Natural language processing (NLP)
The ability of AI systems to understand, interpret and generate human language.
Training data
The data used to train machine learning models.