hrubasek · 2mo ago

KB sync

Hello everyone. I have set my Knowledge Base to update daily. Whenever it refreshes, it deletes the metadata that was attached to that source. Is there a way to keep the metadata? Thanks
11 Replies
Tico | Voiceflow Assistant
At the moment, the automatic “Daily Sync” for URL-based Knowledge Base sources works by completely replacing the document each time it crawls the page. Because the document is recreated, any metadata you had previously written to the document or its chunks is also replaced.

What you can do today:
1. Keep the daily sync turned on, but re-apply the metadata right after each refresh with the Management API:
   - PATCH /v1/knowledge-base/docs/{documentID} to update the whole document’s metadata, or
   - PATCH /v1/knowledge-base/docs/{documentID}/chunk/{chunkID} if you only need to tag specific chunks.
2. Turn off the automatic sync and instead run your own scheduled script that
   a) calls PUT /v1/knowledge-base/docs/{documentID}/upload (re-upload the URL), including the same metadata in the request body, or
   b) uploads a fresh document and deletes the old one, again supplying metadata in the same call.

Until metadata-preserving sync is added to the product, one of the two workarounds above is required.

Docs for reference:
• Update Document Metadata – https://developer.voiceflow.com/reference/patch_v1-knowledge-base-docs-documentid
• Update Chunk Metadata – https://developer.voiceflow.com/reference/patch_v1-knowledge-base-docs-documentid-chunk-chunkid
• Replace Document (URL) – https://developer.voiceflow.com/reference/put_v1-knowledge-base-docs-documentid-upload

Feel free to add a feature request in Discord → #feature-requests so the team can track interest in persisting metadata during auto-sync.
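For anyone who wants to try the first workaround, here is a rough sketch of re-applying saved metadata with the document-level PATCH endpoint named above. Only the endpoint path comes from this reply. The base URL, the `Authorization` header format, and the `{"data": {"metadata": ...}}` body shape are assumptions to check against the linked API reference, and `build_metadata_patch` and `reapply_metadata` are made-up helper names.

```python
import json
import urllib.request

# Assumption: base URL of the Voiceflow API; confirm it in the API reference.
API_BASE = "https://api.voiceflow.com"


def build_metadata_patch(document_id, metadata, api_key):
    """Build (but do not send) a PATCH request for one document's metadata."""
    url = f"{API_BASE}/v1/knowledge-base/docs/{document_id}"
    # Assumption: the endpoint accepts metadata under data.metadata.
    body = json.dumps({"data": {"metadata": metadata}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": api_key,  # assumption: the raw API key is the header value
            "Content-Type": "application/json",
        },
    )


def reapply_metadata(saved_metadata, api_key, send=urllib.request.urlopen):
    """Send one PATCH per document ID; return the IDs that were sent.

    saved_metadata maps document ID -> metadata dict that you stored yourself.
    """
    sent = []
    for document_id, metadata in saved_metadata.items():
        response = send(build_metadata_patch(document_id, metadata, api_key))
        if hasattr(response, "close"):
            response.close()
        sent.append(document_id)
    return sent
```

You would run `reapply_metadata({...}, api_key)` from a scheduled job shortly after the daily sync. The `send` parameter exists only so the function can be dry-run with a stub before touching the real Knowledge Base.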
W. Williams (SFT)
There is a known bug with the daily update. I saw somewhere VF is looking into it. Here: https://discord.com/channels/1079548823610871889/1384548571822686329/1392783381120680057
hrubasek (OP) · 2mo ago
@W. Williams (SFT) that is a different kind of issue. My issue is that the refresh also deletes my metadata. Is that part of the bug, or is there a way to fix it on my side?
W. Williams (SFT)
Sorry, I missed that. @NiKo | Voiceflow @Braden (Voiceflow CEO)
NiKo | Voiceflow
Shared with the team so they can investigate 👍🏻 @hrubasek Could you give more details on how you process the docs? Are you uploading URLs and setting a refresh rate from the UI, then adding metadata to the docs using the API? Or do you add URLs + metadata using the API and then set the refresh rate in the UI?
Update: a fix will be released on Monday.
hrubasek (OP) · 2mo ago
@NiKo | Voiceflow thank you!
NiKo | Voiceflow
The fix has been pushed. Can you give it another try on your end when you get the chance, @hrubasek?
hrubasek (OP) · 2mo ago
@NiKo | Voiceflow works perfectly, thanks! Is there a way to restore my metadata? I manually added around 1500 metadata entries and now they are all gone due to this bug. Thanks for the additional help
NiKo | Voiceflow
Sadly not, I'm afraid. Were you not using a custom automation to update the URLs + metadata? If you have a source for the metadata, you could maybe write a quick script to update the docs with it, based on the doc URL.
hrubasek (OP) · 2mo ago
@NiKo | Voiceflow I was not using any automation or scripts. I am not really experienced with this kind of thing just yet. If someone could help me do this, I would really appreciate it, since I really don't want to do all that manually again due to this bug...
NiKo | Voiceflow
Do you have a list of URLs along with their associated metadata?
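If such a list exists, the URL-matching part of the script could look roughly like the sketch below. It assumes the list is a CSV with a `url` column plus one column per metadata field, and that each Knowledge Base document is available as a dict with `documentID` and `url` keys. Both shapes are hypothetical, as are the function names, so adjust them to whatever your export and the API actually return.

```python
import csv
import io


def normalize_url(url):
    """Trim whitespace and a trailing slash so near-identical URLs match."""
    return url.strip().rstrip("/")


def load_metadata_by_url(csv_text):
    """Parse CSV text with a 'url' column; every other non-empty column
    becomes a metadata field for that URL."""
    metadata_by_url = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = normalize_url(row.pop("url"))
        metadata_by_url[url] = {key: value for key, value in row.items() if value}
    return metadata_by_url


def plan_updates(docs, metadata_by_url):
    """Match documents to metadata by URL.

    docs: list of {"documentID": ..., "url": ...} dicts (hypothetical shape).
    Returns (updates, unmatched): updates maps document ID -> metadata,
    unmatched lists document URLs with no row in the CSV.
    """
    updates, unmatched = {}, []
    for doc in docs:
        key = normalize_url(doc["url"])
        if key in metadata_by_url:
            updates[doc["documentID"]] = metadata_by_url[key]
        else:
            unmatched.append(doc["url"])
    return updates, unmatched
```

The `updates` mapping is what a follow-up step would send to the Knowledge Base, one document at a time. Reviewing `unmatched` first is a cheap way to catch URLs that changed since the metadata was written.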
