The last two weeks have focused heavily on moving the providers to the new 1.2.0 structure, resolving beta blockers, and adapting the AI Agents system so that it handles specific cases we run into while working on the Canvas AI module.
Progress Service for Agents
When setting up a user interface for agents, complex agents can take several minutes to load. If users are shown only a loading spinner during this time, many will assume the system has failed and abandon the page.
Another challenge arises during development, when we need to debug the agent’s behaviour. A useful debug tool should be able to display the agent’s progress, which tools it has used, the contexts it has added, and other relevant details.
A third use case is logging: you might want a record of what the agent did so that, if something went wrong, you can work out what happened.
The new Progress Service makes all three possible.
The Progress Service can give you fine-grained information about what a specific agent is doing while it is running, and you choose how much of it to use. This means your user interface can report what the agent is doing, what it has done, and what it is planning to do.
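To make the idea concrete, here is a minimal Python sketch of how a progress reporter of this kind could work. It is not the Drupal API: `ProgressEvent`, `ProgressReporter`, `status_line`, and the event kinds are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProgressEvent:
    """One step reported by a running agent (hypothetical structure)."""
    agent_id: str
    kind: str       # e.g. "started", "tool_used", "context_added", "finished"
    detail: str = ""


@dataclass
class ProgressReporter:
    """Collects events and forwards each one to every registered listener."""
    listeners: List[Callable[[ProgressEvent], None]] = field(default_factory=list)
    history: List[ProgressEvent] = field(default_factory=list)

    def subscribe(self, listener: Callable[[ProgressEvent], None]) -> None:
        self.listeners.append(listener)

    def report(self, agent_id: str, kind: str, detail: str = "") -> ProgressEvent:
        event = ProgressEvent(agent_id, kind, detail)
        self.history.append(event)  # kept so the run can be logged or debugged
        for listener in self.listeners:
            listener(event)
        return event


def status_line(event: ProgressEvent) -> str:
    """Turn an event into text a UI could show instead of a bare spinner."""
    text = f"[{event.agent_id}] {event.kind}"
    return f"{text}: {event.detail}" if event.detail else text
```

In this sketch the same stream of events serves all three use cases: a listener updates the user interface, another could feed a debug panel, and `history` is what you would write to a log.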
OpenAI can now read PDFs
OpenAI has been able to read PDF files for some time, but a Drupal solution for the OpenAI provider has been missing.
If you use the latest 1.2.x version of the provider with AI 1.2.x, you can now upload a PDF in the Chat Explorer, or attach one in your custom code, and use it as context for your specific task.
Agents can give back structured output
When setting up an agent, you can also configure it to return a structured JSON output once it has completed its context gathering, reasoning, modifications, or other processing steps.
This means that if you invoke a custom agent, you can rely on receiving its response in a predictable, structured format.
The Agents UI now includes a form where you can provide a JSON Schema describing the output you want.
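As an illustration, the Python sketch below shows what such a schema might look like and how calling code could check an agent's reply against it. The schema fields (`title`, `summary`, `tags`) and the helper `parse_agent_output` are assumptions made for this example, and the check is deliberately small rather than a full JSON Schema validator.

```python
import json

# Hypothetical JSON Schema an agent could be configured with.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "summary"],
}

# Maps the JSON Schema type names used above to Python types.
_TYPES = {"string": str, "array": list, "object": dict}


def parse_agent_output(raw: str, schema: dict = OUTPUT_SCHEMA) -> dict:
    """Parse the agent's reply and check required keys and top-level types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, rule in schema.get("properties", {}).items():
        expected = _TYPES.get(rule.get("type"))
        if key in data and expected and not isinstance(data[key], expected):
            raise ValueError(f"wrong type for key: {key}")
    return data
```

Because the shape is agreed up front, the calling code can read `data["title"]` directly instead of trying to extract values from free text.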
Chat History Form Element added
We have multiple debug and testing modules that require replicating a chat history, capturing the full sequence of actions an agent takes, or the complete conversation history. Examples include the AI API Explorer, AI Agents Test, and AI Agents Explorer. Potential future use cases could include ECA. Currently, each module implements this functionality independently.
To address this, we have created a reusable form element. Instead of rewriting the same logic for every module, you can now set up and manage chat histories with a single line of code, providing a standardised and reusable way to replicate and manipulate interaction histories across all relevant modules.
Pexels Tools - create new media via Chat
You might have already started building agents that can search your local media library using semantic search. This is really cool, but what happens when you simply do not have an image that fits what you are looking for?
The Pexel AI module offers function calls that let you search Pexels, the popular royalty-free stock photo database, and create media entities from the results.
This means that you can ask for an image from your local database, and when the chatbot doesn’t find anything you can follow up with “Could you please check whether Pexels has some images for this?” It will give you back suggestions, and you can then tell it which ones you want to download.
Tool calling is now available with streaming
One issue with streaming output is that once your website starts responding, you do not really know what you are going to get. It could be a textual message, but it could also be a decision to use a tool.
The agent's logic relies heavily on knowing whether a tool is being used, in order to determine whether it should continue looping. However, we cannot wait for the entire message to stream before sending it to the end user, because that would defeat the purpose of streaming.
We have now added the ability for any code that starts an agent to include callbacks, which are triggered once streaming is complete. That way the calling code knows whether to keep the connection alive to stream out more content or to finish the output buffer.
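The pattern can be sketched in Python as follows. This is an illustration of the idea, not the module's implementation: the chunk shape and the names `stream_reply`, `run_agent`, and `on_finished` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical chunk shape: {"type": "text" or "tool_call", "value": "..."}
Chunk = Dict[str, str]


def stream_reply(chunks: Iterable[Chunk],
                 on_finished: Callable[[List[str]], None]) -> Iterator[str]:
    """Forward text as it arrives; report tool calls once the stream ends."""
    tool_calls: List[str] = []
    for chunk in chunks:
        if chunk["type"] == "tool_call":
            tool_calls.append(chunk["value"])  # collected, not shown to the user
        else:
            yield chunk["value"]               # sent to the user immediately
    # Only at this point is it known whether the model asked for a tool.
    on_finished(tool_calls)


def run_agent(turns: List[List[Chunk]]) -> List[str]:
    """Keep looping over model turns for as long as a tool was requested."""
    output: List[str] = []
    state = {"continue": True}

    def finished(tool_calls: List[str]) -> None:
        # The callback decides whether the loop (and connection) stays open.
        state["continue"] = bool(tool_calls)

    for turn in turns:
        if not state["continue"]:
            break
        output.extend(stream_reply(turn, finished))
    return output
```

The user sees text from the first chunk onward, while the decision to loop again is deferred to the callback that fires when each stream completes.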
Other notable fixes:
- Chatbot first message is now translatable
- Common method for suggestions in Field Widget Actions
- Simplified the code for events
- AI core drush command supports copy and paste and multiline input
- Tool descriptions can now be overridden per agent to improve reliability for a specific use case
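The last item can be pictured with a small Python sketch. The tool names, descriptions, and the helper `tools_for_agent` are hypothetical and only illustrate the idea of per-agent overrides; they are not taken from the module.

```python
from typing import Dict

# Hypothetical default descriptions, as a tool might ship them.
DEFAULT_TOOLS: Dict[str, str] = {
    "search_media": "Search the media library.",
    "create_media": "Create a media entity.",
}


def tools_for_agent(defaults: Dict[str, str],
                    overrides: Dict[str, str]) -> Dict[str, str]:
    """Return tool descriptions with per-agent overrides applied.

    Overrides for unknown tools are ignored, so a typo cannot add a tool.
    """
    return {name: overrides.get(name, text) for name, text in defaults.items()}
```

A more specific description, such as "Search the media library for product photos only.", gives the model a clearer signal about when to pick that tool for this particular agent.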