Fix handling of OpenAI-compatible Gemini req/res #5712
Draft
NOTE: This is a draft PR because I'm not sure you want to accept this approach. Also, the `async-openai` fork needs a minimal update for this to work. It would be great if someone from Meilisearch could test this PR together with the fork update.

Fixes #5684
I used `create_stream_byot` when Gemini is used. The first difference is that `index` is not returned inside `tool_calls` elements: Gemini returns `index: None`.

This is the OpenAI response:

And this is the Gemini response:
So I updated the `async_openai` type in the Meilisearch fork (I only have this locally and didn't create a PR for it, since I don't know if you want to take this approach):

I only tested with one tool call, so I'm not sure how all this will behave with multiple tool calls.
Another difference is that Gemini sends `tool_calls: Some(..)` and `finish_reason: Some(tool_calls)` in the same chunk, while OpenAI first streams the tool call chunks and then sends a final chunk with `tool_calls: None` and `finish_reason: Some(tool_calls)`.

So I updated the logic to accumulate tool calls as they arrive and, as soon as a chunk has `finish_reason: Some(tool_calls)`, to process the accumulated tool calls immediately, regardless of the value of `tool_calls`.
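To illustrate the accumulation logic described above, here is a minimal, self-contained sketch. It is not the code from this PR or from `async_openai`: the types `ToolCallChunk`, `StreamChunk`, `ToolCall`, and `Accumulator` are simplified, hypothetical stand-ins, and falling back to the element's position when `index` is `None` is an assumption made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified fragment of a streamed tool call.
#[derive(Debug, Clone, Default)]
struct ToolCallChunk {
    index: Option<u32>, // OpenAI sets this; Gemini sends `None`
    id: Option<String>,
    name: Option<String>,
    arguments: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
enum FinishReason {
    Stop,
    ToolCalls,
}

/// Hypothetical, simplified stream chunk.
#[derive(Debug, Clone, Default)]
struct StreamChunk {
    tool_calls: Option<Vec<ToolCallChunk>>,
    finish_reason: Option<FinishReason>,
}

/// A fully assembled tool call.
#[derive(Debug, Clone, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

#[derive(Default)]
struct Accumulator {
    calls: BTreeMap<u32, ToolCall>,
}

impl Accumulator {
    /// Feeds one chunk. Returns the assembled calls as soon as the chunk
    /// carries `finish_reason: Some(ToolCalls)`, whether or not the same
    /// chunk also carries `tool_calls`.
    fn push(&mut self, chunk: StreamChunk) -> Option<Vec<ToolCall>> {
        // Step 1: merge any fragments in this chunk into the accumulator.
        if let Some(fragments) = chunk.tool_calls {
            for (pos, frag) in fragments.into_iter().enumerate() {
                // Assumption: without an index, use the position in the chunk.
                let key = frag.index.unwrap_or(pos as u32);
                let call = self.calls.entry(key).or_default();
                if let Some(id) = frag.id {
                    call.id = id;
                }
                if let Some(name) = frag.name {
                    call.name = name;
                }
                if let Some(args) = frag.arguments {
                    call.arguments.push_str(&args);
                }
            }
        }
        // Step 2: flush on the finish signal, after merging.
        if chunk.finish_reason == Some(FinishReason::ToolCalls) {
            let done: Vec<ToolCall> =
                std::mem::take(&mut self.calls).into_values().collect();
            return Some(done);
        }
        None
    }
}
```

Merging before checking `finish_reason` is what lets a single code path handle both shapes: the Gemini-style chunk that carries the tool call and the finish reason together, and the OpenAI-style final chunk with `tool_calls: None`. With multiple tool calls that all lack an index and arrive in separate chunks, the positional fallback would merge them into one entry, which matches the open question above about multiple tool calls.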