What is the best way to summarize a document in the KB with RAG?
The document contains the transcript of a discussion between two people.
I use the query API to target that document when the user requests a summary.
My reasoning is to send as many chunks of the document as possible to the LLM (GPT 3.5 turbo) to get a complete summary.
Therefore I set the "chunkLimit" parameter (from the query API) to 10. In the query API response, I see that 10 chunks are selected with their corresponding similarity score but the output from the query API is "null".
The same is true if I set the "chunkLimit" parameter to 5.
If I set the "chunkLimit" parameter to 2, 3 or 4, it works, but the summary is not complete: it leaves out important parts of the document.
In the KB settings, the Chunk limit setting is set to 10. The max token setting is set to 1000 tokens.
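To sanity-check whether I might be hitting the model's context limit, here is a rough sketch of the arithmetic. It rests on assumptions I have not confirmed: that the 1000-token setting caps the size of each chunk, and that the model accepts 4096 tokens in total. The function name and its parameters are only for illustration and are not part of the query API.

```python
def fits_in_context(chunk_count, tokens_per_chunk, context_window,
                    prompt_overhead=0, reserved_for_answer=0):
    """Return True if the selected chunks plus overhead fit in the context window.

    All values are token counts. This is a worst-case estimate: it treats
    every chunk as being exactly tokens_per_chunk long.
    """
    needed = chunk_count * tokens_per_chunk + prompt_overhead + reserved_for_answer
    return needed <= context_window


# Assumed values: 1000 tokens per chunk, 4096-token context window.
for limit in (2, 3, 4, 5, 10):
    print(limit, fits_in_context(limit, 1000, 4096))
```

If those assumptions hold, a chunkLimit of 5 or more would not fit, which would be consistent with the "null" output, but I don't know whether this is actually the cause.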
Am I doing something wrong?
Is there another way to get a complete summary of a document?
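For example, one alternative I am considering is to summarize the chunks in small batches and then summarize the summaries. Below is a minimal sketch of that idea. The `summarize` argument is a hypothetical placeholder for whatever call sends text to the LLM and returns its summary; it is not a function from the query API.

```python
from typing import Callable, List


def batch(items: List[str], size: int) -> List[List[str]]:
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def summarize_document(chunks: List[str],
                       summarize: Callable[[str], str],
                       batch_size: int = 3) -> str:
    """Summarize batches of chunks, then summarize the summaries.

    `summarize` is a hypothetical callable standing in for the LLM call.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so each round shrinks the list")
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])

    current = list(chunks)
    # Each round replaces every batch with one summary until a single text is left.
    while len(current) > 1:
        current = [summarize("\n\n".join(group)) for group in batch(current, batch_size)]
    return current[0]
```

This keeps every individual request small, at the cost of more LLM calls.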
Thanks for your help
