adverse-sapphire•2y ago
Issue: KB not finding info that is in the KB
I have noticed that the KB quite often doesn't find the answer to a question, or doesn't retrieve the right chunk. So I have set up a test bot with only one text document. This document is a small list of businesses in a town, with a description and contact details for each. It was structured with ChatGPT so as to be presented in the most optimal way for embeddings. You can access this simple text document directly in my project.
The project ID is 659eb64b59d399715077ad0c
Let's concentrate on two businesses in this list with bad results (there could potentially be plenty of others): AFR performance offers a sport coaching service, and "atelier Desbel" renovates armchairs. For each I am asking two questions, e.g. "Do you know AFR performance?" and "Who can I contact if I need sport coaching?", and the same for the other.
Result: the KB can't find any relevant info for either question about AFR performance, even though I added a second mention at the end of the document saying that "AFR performance is providing sport coaching". As for atelier Desbel, if we ask who renovates armchairs, it names another business that doesn't mention armchair renovation at all.
This is only a small example that should let us test and check where the problem comes from. But this issue happens very regularly: the KB retrieves a chunk that doesn't address the question, while the appropriate one is somewhere else.
Please contact me for more tests to solve this problem.
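For context on where this kind of mismatch can come from, here is a minimal sketch of how embedding-based chunk retrieval typically works (this is an illustration using a crude bag-of-words "embedding", not Voiceflow's actual pipeline; the business texts and the `embed`/`retrieve` helpers are made up for the example):

```python
# Toy sketch of embedding-based retrieval: each chunk and the query are
# turned into vectors, and the chunk with the highest cosine similarity
# to the query wins. If two unrelated chunks score similarly, the wrong
# one can come back on top.
import re
from collections import Counter
from math import sqrt

def embed(text):
    """Crude bag-of-words 'embedding' for illustration only."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks):
    """Return the best-scoring chunk and its similarity score."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return scored[0][1], scored[0][0]

chunks = [
    "AFR performance offers sport coaching. Contact: 06 00 00 00 00.",
    "Atelier Desbel renovates armchairs and antique seats.",
]
best, score = retrieve("who can I contact if I need sport coaching", chunks)
```

With a real embedding model the vectors are dense and semantic, but the ranking logic is the same, which is why a question whose wording overlaps poorly with the right chunk can lose to an unrelated one.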
10 Replies
To help you debug this, we will need some info:
- Can you share the text you've uploaded to your KB
- Your .vf file
- Some steps to repro
If you can't share the .vf file:
- The system prompt, instructions and settings you're using for KB retrieval
- Some utterances to test
adverse-sapphireOP•2y ago
Here are the .vf file and the doc I uploaded. I have explained above the different questions you can use to reproduce, along with the results I got.
It's only one example, so that we can test on something concrete. I have other bots with only plain text containing one question and one answer each: for example, one question about parking in the town, another question in a separate doc about restaurants in the town, etc. And when I ask a question about restaurants, the chunk system retrieves the parking chunk with the higher score.
So it's not only a question of debugging this file I sent you; it's a global issue I notice in the chunk retrieval system, and I am ready to investigate it with you if you want.
Thanks for sharing.
IMO you should optimize your document to help the LLM give better results.
Here is an example using your doc after a quick LLM pass for optimization:
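To illustrate the kind of restructuring that tends to help embeddings, here is a hypothetical sketch (the field labels, activities, and phone numbers are invented placeholders, not the actual optimized document): one self-contained block per business, separated by blank lines, so that every chunk carries the business name, what it does, and its contact details on its own.

```python
# Hypothetical restructured doc: one self-describing block per business.
# Because each block repeats the business name and activity, any chunk
# boundary still leaves a chunk that makes sense in isolation.
doc = """\
Business: AFR performance
Activity: sport coaching (fitness, personal training)
Contact: 06 00 00 00 00

Business: Atelier Desbel
Activity: renovation of armchairs and antique seats
Contact: 06 11 11 11 11
"""

# Split on blank lines so each block can become its own chunk.
chunks = [block.strip() for block in doc.split("\n\n") if block.strip()]
```

The design point is simply that chunking happens on text boundaries, so a document whose natural blocks are each self-contained degrades gracefully no matter where the chunker cuts.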
adverse-sapphireOP•2y ago
OK Niko, thanks for sharing this. I will use it for my other lists to make sure the docs are optimized this way. But it only solves half of the problem. The other half is that the chunk system retrieves the wrong chunk with the highest score in many, many cases, on all my projects. In this case you used 2 chunks on a very small doc and concentrated on the output, so you can't notice this.
What you can try is asking a series of questions about info in the 2nd chunk (my original doc was 2 chunks long, but the one you sent me is shorter, so it might be only 1 chunk long), while limiting the source to only 1 chunk. Then you have a chance to see what I mean.
Strange, I can't repro on my end.
The cleaned version also has 2 chunks, and asking for info only available in the second chunk while limiting usage to 1 chunk gives me a correct answer.
adverse-sapphireOP•2y ago
Can you check the score of the 2nd chunk if you set the limit to 2 chunks? You should have one high score, around 80%, for the right chunk, and a low score for the other chunk that has nothing to do with the question.
If both scores are very similar, that would not be normal.
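The check being described can be sketched as follows (toy scores and a made-up `min_gap` threshold, not Voiceflow's actual retriever or settings): with the chunk limit set to 2, compare the top two similarity scores and treat a near-tie as a sign that the embeddings cannot distinguish the chunks for that question.

```python
# Diagnostic sketch: a healthy retrieval looks like ~0.80 vs ~0.30;
# two near-equal scores suggest ambiguous retrieval for this question.
def score_gap_ok(scores, min_gap=0.2):
    """Return True when the best chunk clearly beats the runner-up."""
    top, runner_up = sorted(scores, reverse=True)[:2]
    return (top - runner_up) >= min_gap

clear = score_gap_ok([0.80, 0.31])      # clear winner
ambiguous = score_gap_ok([0.55, 0.53])  # near-tie, likely a retrieval problem
```

The threshold is arbitrary; the useful signal is the gap between the scores, not their absolute values.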
optimistic-gold•2y ago
Hi @NiKo | Voiceflow, could you please let me know how you got an optimized document? What was the procedure or LLM you used to get that output?