Any suggestion for getting around a [Moderation error]?
The company I work at deals with video, and sometimes something NSFW will get past our moderation. If a user were to write something along the lines of "There's p**n", I'd like the bot to respond with one of our NSFW policies (which I have in my knowledge base). However, I'm being stopped by a [Moderation error]. Is this implemented by OpenAI or is this a Voiceflow limitation? In either case, I suppose I can hardcode something, but that kinda defeats the purpose of having an AI bot 😅
4 Replies
The [Moderation error] you're encountering is a feature implemented by Voiceflow, not OpenAI. It's designed to prevent the AI from generating inappropriate content. However, it can sometimes interfere with legitimate use cases, such as yours.
To work around this, you could consider a few options:
1. Hardcoding: As you mentioned, you could hardcode a response for specific phrases or words. This would bypass the AI's moderation system, but it might not be the most scalable solution, especially if there are many different ways users could phrase their reports of inappropriate content.
2. Custom Moderation System: You could implement a custom moderation system that checks user inputs for specific phrases or words before they're sent to the AI. If a match is found, you could then trigger a specific response. This would be more flexible than hardcoding, but it would require additional development work.
3. Adjusting AI Settings: In Voiceflow, you can adjust the AI's settings to change how it generates responses. For example, you could lower the 'temperature' setting to make the AI's responses more focused and less likely to generate inappropriate content. However, this wouldn't directly address the issue of the AI's moderation system blocking certain inputs.
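To make option 2 a bit more concrete, here is a minimal sketch of a check that runs before the message ever reaches the AI. It is plain Python, not Voiceflow code; the patterns, the `route_message` name, and the policy text are all placeholders I made up for illustration, so swap in your own terms and the wording from your knowledge base.

```python
import re

# Hypothetical patterns for "a user is reporting NSFW content".
# The first one also catches censored spellings such as "p**n".
NSFW_REPORT_PATTERNS = [
    r"\bp(?:or|0r|[*_\-]{1,3})n\b",
    r"\bnsfw\b",
    r"\bnud(?:e|ity)\b",
]

# Placeholder reply; in practice this would come from your knowledge base.
POLICY_RESPONSE = "Thanks for flagging this. Here is our NSFW content policy: ..."

_COMPILED = [re.compile(p, re.IGNORECASE) for p in NSFW_REPORT_PATTERNS]


def route_message(text: str) -> dict:
    """Decide whether a message gets the canned policy reply or goes to the AI."""
    for pattern in _COMPILED:
        if pattern.search(text):
            # Matched a report-like phrase: answer directly and skip the AI step.
            return {"route": "policy", "reply": POLICY_RESPONSE}
    # Nothing matched: let the normal AI flow handle the message.
    return {"route": "ai", "reply": None}


if __name__ == "__main__":
    print(route_message("There's p**n")["route"])
    print(route_message("How do I upload a video?")["route"])
```

The point of the design is that flagged messages never hit the moderated AI step, so the error cannot fire for them. The trade-off is the one mentioned in option 1: a pattern list only covers the phrasings you thought of.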
Remember, the Generative Global No Match feature in Voiceflow is still experimental and may not be suitable for all use cases, especially those involving sensitive content. It's recommended to use it with caution in a production environment.
For more information on how to adjust the AI's settings in Voiceflow, you can refer to the Global No Match (Generative or Static) documentation. For details on how tokens are calculated and used in Voiceflow, you can refer to the What are Tokens? documentation.
Thanks, this is going through an intent, not a global no match.
Appreciate the quick response!
You can add a pre-pass using Claude Haiku, for example.
Here is one of the prompts they're sharing, which you can tweak to fit your needs.
Another option is to use a dedicated model like Llama Guard.
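A rough sketch of what the pre-pass could look like, assuming you wrap whichever model you pick (Claude Haiku, Llama Guard, or something else) in a function that returns a short label. The actual model call is deliberately left out, and the label `nsfw_report`, the function names, and the stub classifier are hypothetical names for illustration only.

```python
from typing import Callable

# Hypothetical label your classifier prompt would be asked to return.
REPORT_LABEL = "nsfw_report"


def handle_message(
    text: str,
    classify: Callable[[str], str],
    answer_with_ai: Callable[[str], str],
    policy_reply: str,
) -> str:
    """Run the classifier first, then pick the reply path based on its label."""
    label = classify(text).strip().lower()
    if label == REPORT_LABEL:
        # The user is reporting NSFW content: reply with the policy text
        # instead of sending the message through the moderated AI step.
        return policy_reply
    return answer_with_ai(text)


def stub_classifier(text: str) -> str:
    """Stand-in for a real model call, used here only so the sketch runs."""
    return REPORT_LABEL if "p**n" in text.lower() else "other"


def stub_ai(text: str) -> str:
    """Stand-in for the normal AI response step."""
    return f"AI answer to: {text}"


if __name__ == "__main__":
    policy = "Here is our NSFW content policy: ..."
    print(handle_message("There's p**n", stub_classifier, stub_ai, policy))
    print(handle_message("How do I trim a clip?", stub_classifier, stub_ai, policy))
```

Passing the classifier in as a function keeps the routing logic the same whichever model you end up using for the pre-pass.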
interesting, thanks for this