SDK - Added re-ranking into vector search #1516

Merged 2 commits on Jun 10, 2024

SilasMarvin
Contributor

No description provided.

"boost": 1.0
},
}
},
Contributor Author

@SilasMarvin SilasMarvin Jun 10, 2024

@montanalow How does this "rerank" key look?

query is the text to compare each document against.
model is the reranking model to use.
num_documents_to_rerank is the number of results to fetch from vector search and rerank, before the list is cut down to the limit parameter defined in the next section.
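
To make the interaction between num_documents_to_rerank and limit concrete, here is a minimal Python sketch of the three stages described above. It is not the SDK's implementation: the function names are hypothetical, and `toy_overlap_score` is a simple word-overlap stand-in for a real reranking model such as the one named in the "model" key.

```python
from typing import Callable, List


def vector_search_then_rerank(
    ranked_by_vector: List[str],
    rerank_score: Callable[[str, str], float],
    query: str,
    num_documents_to_rerank: int,
    limit: int,
) -> List[str]:
    """Hypothetical helper: rerank the top vector-search hits, then apply limit."""
    # Stage 1: keep only the top candidates returned by vector search.
    pool = ranked_by_vector[:num_documents_to_rerank]
    # Stage 2: re-score every candidate in the pool against the rerank query.
    reranked = sorted(pool, key=lambda doc: rerank_score(query, doc), reverse=True)
    # Stage 3: only now apply the final limit.
    return reranked[:limit]


def toy_overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a reranking model: fraction of query words found in doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


# Documents in the order a vector search might have returned them (made-up data).
docs = ["Test document 1", "Another note", "Test document 2", "Test document 2 extended"]
top = vector_search_then_rerank(
    docs, toy_overlap_score, "Test document 2", num_documents_to_rerank=3, limit=2
)
print(top)
```

With a pool of 3, the fourth document is never scored, so num_documents_to_rerank bounds the work the reranker does while limit only controls how many results come back.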

"rerank": {
"query": "Test document 2",
"model": "mixedbread-ai/mxbai-rerank-base-v1",
"num_documents_to_rerank": 100
Contributor

What about calling this just limit? Does LlamaIndex or transformers have a similarly named parameter?

Contributor Author

Oh sorry, I missed this before merging. I think it might be a little confusing to call it limit, since we already have a limit key and this isn't actually the limit. We already defined limit with LlamaIndex to mean the final number of items returned, but I'm not sure whether they or LangChain use it elsewhere.

}
},
"rerank": {
"query": "Test document 2",
Contributor

Seems like query is being repeated in a few places in this example, which may be pretty typical. One enhancement would be to move the query string out and reuse it everywhere, and make passing specific sub-clause query strings optional. Not a launch blocker though.
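
One way this enhancement could look on the caller's side is sketched below in Python. Everything here is an assumption for illustration: `fill_default_query` is a hypothetical helper, not part of the SDK, and the nesting under "query" and "fields", the field names "body" and "title", and the limit value are guesses at the shape of the search arguments rather than a confirmed schema.

```python
import copy
from typing import Any, Dict


def fill_default_query(spec: Dict[str, Any], default_query: str) -> Dict[str, Any]:
    """Hypothetical helper: copy spec and fill in any missing sub-clause query."""
    out = copy.deepcopy(spec)  # leave the caller's dict untouched
    # Assumed layout: one clause per searched field under query -> fields.
    for field_clause in out.get("query", {}).get("fields", {}).values():
        field_clause.setdefault("query", default_query)
    # The rerank clause reuses the shared query unless it sets its own.
    if "rerank" in out:
        out["rerank"].setdefault("query", default_query)
    return out


spec = {
    "query": {
        "fields": {
            "body": {"boost": 1.0},                     # no query: gets the default
            "title": {"query": "Custom title query"},   # explicit query: kept as-is
        }
    },
    "rerank": {
        "model": "mixedbread-ai/mxbai-rerank-base-v1",
        "num_documents_to_rerank": 100,
    },
    "limit": 5,
}

filled = fill_default_query(spec, "Test document 2")
print(filled["rerank"]["query"])
```

Sub-clauses that already carry their own query keep it, so the shared string acts only as a default.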

Contributor Author

Got it, I will think more on making that optional and reusing it, but will merge this and get it out in the meantime.

@SilasMarvin SilasMarvin merged commit c3a8514 into master Jun 10, 2024
1 check passed
@SilasMarvin SilasMarvin deleted the silas-sdk-add-reranking branch June 10, 2024 18:04
2 participants