- LLMs such as GPT-3 / ChatGPT are trained on generic web data.
- They can tailor an answer to a specific context if that context is supplied along with the prompt.
- We cannot use the entire documentation as context: it is far too large to fit in a single prompt, and sending it with every request may also lead to bankruptcy from the OpenAI charges.
- So we split the documentation into sections and associate each section with its embedding. An embedding is a vector that distills the meaning of the text; the similarity of two bodies of text can be judged by the closeness of their vectors. Closeness is measured by the cosine of the angle between the vectors, which reduces to a plain dot product when the vectors are normalised to unit length.
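A minimal sketch of the closeness measure above, in pure Python (real embeddings come from a model; the vectors here are just placeholders):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    # Dot product of the two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norms (lengths) of each vector.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Dividing by the norms makes the score length-independent;
    # for unit vectors this is just the dot product.
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0, orthogonal (unrelated) directions score 0.0.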
- Ref : https://youtu.be/gQddtTdmG_8
- When we receive a query, we compute the embedding of the query and use it to find the most relevant sections of the documentation.
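The retrieval step above can be sketched as ranking sections by similarity to the query embedding. The section names and vectors here are hypothetical stand-ins for real model-generated embeddings:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def top_sections(query_embedding, section_embeddings, k=2):
    """Return the titles of the k sections whose embeddings are closest to the query."""
    # section_embeddings: {section_title: embedding_vector}
    ranked = sorted(
        section_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,  # highest similarity first
    )
    return [title for title, _ in ranked[:k]]
```

For a large documentation set, a vector index would replace this linear scan, but the ranking idea is the same.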
- Include those relevant sections in the LLM prompt. Proper instructions to the system are also required to ensure the correctness of the response and to prevent / minimise hallucinations.
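One way to assemble such a prompt, as a sketch; the exact instruction wording and the `build_prompt` helper are assumptions, not a fixed recipe:

```python
def build_prompt(query, context_sections):
    """Combine system instructions, retrieved context, and the user's query."""
    # Instructing the model to answer only from the supplied context
    # helps minimise hallucinations.
    system = (
        "Answer the question using only the context below. "
        "If the answer is not contained in the context, say you don't know."
    )
    context = "\n\n".join(context_sections)
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"
```

With a chat-style API, the instruction part would typically go in a separate system message rather than being concatenated into one string.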