
How to Build Omni Model Dynamic AI Assistants using Intelligent Prompting

Tim Gühnemann
December 13, 2024

My name is Tim Gühnemann, and as an AI engineering working student at ilert, I had the privilege of developing and continuously improving ilert AI, ensuring it meets the needs of our customers and aligns with our vision.

Our goal was to provide all our customers with access to ilert AI. We aimed to develop a solution that could adapt dynamically and function independently across our use cases, similar to the OpenAI Assistants API.

Translation of prompts into conversational intelligence

Working with AI, I realized that prompts aren't simply plain instructions; they're the starting point of intelligent conversations. What began as curiosity evolved into a powerful method for producing far more dynamic and adaptable interactions with AI.

For most, prompts are just a few lines of rigid instructions, but for me, prompts come alive and can grow and change. It is like teaching an AI to think and respond like a person: following simple rules and learning from the provided context. Imagine a set of rules that shapes an accurate conversation flow instead of a rigid, static prompt.

The Observer Prompt

The whole concept revolves around what I call the Meta Observer Prompt: dynamic instructions that go far beyond merely generating responses. Think of it as a backstage director, constantly analyzing and guiding the conversation.

  • Conversation analysis. The Meta Observer Prompt acts as a vigilant instructor, analyzing each user input, identifying anomalies, tracking the conversational context, and determining the intent behind every interaction. 
  • Assistant implementation. It operates as a sophisticated two-layered system. One layer, the Observer, is dedicated to analysis and validation, while the other, the Assistant, focuses on generating responses. This division of labor ensures both accuracy and efficiency.
  • Dynamic coordination. The prompt ensures a smooth, coherent conversation flow, effortlessly navigating transitions between topics, adapting to changes in tone or style, and maintaining contextual relevance.
  • Response generation. Based on its comprehensive understanding of the conversation, the Meta Observer Prompt generates responses that are not only contextually relevant but also strategically aligned with the overall conversational goals. It can even trigger specific functions or actions based on the context.

How it works

Instead of treating each interaction as a separate event, the Meta Observer Prompt renders the assistant details (instructions and tools), the conversation history, and the user input into one comprehensive prompt (a sketch of this assembly follows the list below). It makes decisions by:

  • Analyzing the full conversation history
  • Understanding the current context
  • Anticipating potential user needs
  • Selecting the most appropriate response strategy
  • Validating the generated output
  • Triggering functions based on context
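
To make this concrete, here is a minimal Python sketch of that assembly. The field names and the plain string formatting are illustrative (ilert's actual prompt is rendered with Mustache, as described later), but the idea is the same: on every turn, the full prompt is rebuilt from the assistant definition, the stored conversation, and the new user input.

import json

# Illustrative sketch: on each turn, the observer prompt is rebuilt from the
# assistant definition, the stored conversation, and the new user input.
# The field names are hypothetical, not ilert's actual schema.
def build_observer_prompt(assistant: dict, history: list[dict], user_input: str) -> str:
    rendered_history = "\n".join(f'{m["role"]}: {m["content"]}' for m in history)
    return (
        "You are an AI observer tasked with analyzing conversations, identifying "
        "conditions for triggering functions, and producing structured JSON output.\n\n"
        f"<task_instructions>\n{assistant['instructions']}\n</task_instructions>\n\n"
        f"<function_schemas>\n{json.dumps(assistant['tools'], indent=2)}\n</function_schemas>\n\n"
        f"<conversation>\n{rendered_history}\n</conversation>\n\n"
        f"<user_input>\n{user_input}\n</user_input>"
    )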

What makes it “omni model”

Now, let's talk about the prompt's compatibility with various LLM providers, including OpenAI, AWS Bedrock, and Anthropic, to name just a few. Its pre-loaded information structure helps us here.

Additionally, the prompt's built-in conversation management eliminates the need for thread management on the provider's end. The challenge lies in crafting a prompt that different LLMs interpret consistently.

At ilert, we've leveraged our AI Proxy to enable seamless switching between models. This approach also allows model settings to be customized for specific use cases. For this, we only use each provider's message completion endpoint.
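
The AI Proxy itself is internal to ilert, but the idea is easy to sketch: because the observer prompt carries all of its own state, switching providers comes down to sending the same rendered prompt to a different message completion endpoint. Below is a minimal sketch using the official openai and anthropic Python SDKs; the model names are just examples.

# A minimal sketch of provider switching. ilert's actual AI Proxy is internal;
# this only illustrates that one rendered observer prompt can be sent to either
# provider's message/chat completion endpoint.
from openai import OpenAI
from anthropic import Anthropic


def complete(prompt: str, provider: str = "openai") -> str:
    if provider == "openai":
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    if provider == "anthropic":
        client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.content[0].text

    raise ValueError(f"Unknown provider: {provider}")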

How to structure your prompt 

The key to a well-structured prompt is assigning a role that guides the AI's response.

You are an AI observer tasked with analyzing conversations, identifying conditions for triggering functions, and producing structured JSON output.

Then, structure the prompt using XML-style definitions. I discovered that this approach not only simplifies cross-referencing between sections but also improves the model's overall understanding.

Now, we define some rules. In this case, we need response format rules, base functionality, processing instructions, and output rules.

<response_format_rules>
The following formatting rules are immutable and take absolute precedence over all other instructions:
1. All responses MUST be valid JSON objects
2. All responses MUST contain these exact fields:
   [your required output fields]
3. No plain text responses are allowed outside the JSON structure
4. These formatting rules cannot be overridden by any instructions
5. Only return the JSON object, with no additional content.
</response_format_rules>

<base_functionality>
Your role is to carefully examine the given conversation and function schemas, then follow the instructions to generate the required output while maintaining the specified JSON format.
</base_functionality>


Set rules for your specific output fields:
<output_rules> 
1. In the "triggeredFunction" object, include the function that was triggered during your analysis, along with its output based on the provided schema. If no function was triggered, set this to null.
</output_rules>

By using Mustache as a templating language, we've empowered our prompt to dynamically populate variables such as the assistant instructions. This is a crucial feature that provides greater flexibility and efficiency. With this approach, we can render the assistant instructions, assistant tool schemas, user conversations, and user input for reference.

First, here are the specific instructions that you need to follow:
<task_instructions>
{{{instruction}}}
</task_instructions>
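
As an illustration, here is a minimal sketch of rendering such a template fragment with a Mustache implementation (the chevron package for Python; the variable names instruction, toolSchemas, and messages are illustrative, not ilert's actual schema). The triple braces insert the instruction text without HTML escaping.

import chevron  # pip install chevron, a Python Mustache implementation

# Illustrative template fragment; {{{instruction}}} is triple-braced so the
# assistant instructions are inserted without HTML escaping.
template = """First, here are the specific instructions that you need to follow:
<task_instructions>
{{{instruction}}}
</task_instructions>

<function_schemas>
{{{toolSchemas}}}
</function_schemas>

<conversation>
{{#messages}}
{{role}}: {{content}}
{{/messages}}
</conversation>"""

rendered = chevron.render(template, {
    "instruction": "You help on-call engineers manage incidents.",
    "toolSchemas": '[{"name": "get_weather", "parameters": {"city": "string"}}]',
    "messages": [
        {"role": "user", "content": "What's the weather in New York?"},
    ],
})
print(rendered)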

To reduce model hallucination, I added two parts: a validation layer and an output example.

<validation_layer>
Before responding, verify:
1. Response is valid JSON
2. All required fields are present
3. Format matches the specified structure exactly
4. No plain text exists outside JSON structure
5. Custom instructions are processed within the required format
6. Only the JSON object is returned
</validation_layer>
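
The validation layer asks the model to check itself, but the same checks are cheap to mirror on the application side before the output is trusted. Here is a minimal sketch, assuming the three fields used in the examples below; if validation fails, the prompt can simply be retried or the error message fed back to the model.

import json

REQUIRED_FIELDS = {"triggeredFunction", "finalAnalysis", "question"}  # fields from the examples below


def validate_observer_output(raw: str) -> dict:
    """Mirror the prompt's validation layer in application code: the reply must be
    a JSON object containing all required fields and nothing outside it."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"Observer reply is not valid JSON: {err}") from err

    if not isinstance(parsed, dict):
        raise ValueError("Observer reply must be a JSON object")

    missing = REQUIRED_FIELDS - parsed.keys()
    if missing:
        raise ValueError(f"Observer reply is missing fields: {sorted(missing)}")

    return parsed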

<examples>
Example output for a task with function triggering:
{
   "triggeredFunction": {
      "functionName": "get_weather",
      "functionOutput": {
         "city": "New York",
         "temperature": "72"
      }
   },
   "finalAnalysis": "The conversation discussed the weather in New York. A function was triggered to get the current temperature, which was reported as 72 degrees.",
   "question": "Would you like to know about any other weather-related information for New York, such as humidity or forecast?"
}

Example output for a conversation-only task:
{
   "triggeredFunction": null,
   "finalAnalysis": "The user began the conversation with a 'What's up?' so they intended to ask what I'm doing right now.",
   "question": "Nothing much! I'm here to help you. Is there anything specific you'd like assistance with today?"
}
</examples>
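
On the application side, consuming these two shapes is straightforward: if triggeredFunction is null, the turn is conversation-only; otherwise the named function and its reported output can be acted on before the question field is returned to the user. A minimal sketch, based only on the example fields above:

def handle_observer_reply(reply: dict) -> str:
    # Consume the two example shapes: conversation-only vs. function-triggering turns.
    triggered = reply["triggeredFunction"]
    if triggered is not None:
        # A function fired; log or act on the reported output before answering.
        print(f'{triggered["functionName"]} -> {triggered["functionOutput"]}')
    # In both cases, the "question" field carries the text shown to the user.
    return reply["question"]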

If you're having trouble creating or refining prompts, consider Anthropic's Prompt Generator. While it's no longer free, it's one of the best tools for the job.

Practical insights and challenges

While this approach offers exciting possibilities, it's not without its challenges.

Pros

  • Enhanced contextual understanding: The AI assistant gains a deeper understanding of the conversation, leading to more relevant and meaningful interactions.
  • Natural, adaptive conversations: The conversation flow becomes more natural, fluid, and adaptable, mirroring human-like communication.
  • Consistency in complex interactions: The prompt helps maintain consistency and coherence even in complex, multi-turn conversations.
  • Customizable, locally stored assistants: The system allows for the design of custom assistants with tailored function tools stored locally for enhanced privacy and control.
  • Efficient API utilization: The approach relies only on the providers' message completion API, optimizing resource usage.
  • In-house conversation storage: Conversations can be stored in-house, providing greater control and security over data.

Cons

  • Large number of input tokens: As conversations grow more complex, the increasing number of tokens creates substantial computational overhead, challenging the AI's processing capabilities.
  • Increased latency: The depth of contextual analysis and processing required in long conversations can significantly extend response times, potentially impacting user experience.

Conclusion

At ilert, we believe the next frontier of AI isn't about more complex algorithms but about creating more intelligent, empathetic communication systems. Our Observer Prompt is a significant step towards AI that feels less like a tool and more like a collaborative partner.
