-
Notifications
You must be signed in to change notification settings - Fork 101
Add ToolsRetriever class and Retriever.convert_to_tool() method #332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
||
def __init__( | ||
self, | ||
driver: neo4j.Driver, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider updating the Retriever
interface if this one does not use the driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure tbh. I think the tools retriever is an exception so I think we can keep it for now.
# Execute the tool with the provided arguments | ||
tool_result = selected_tool.execute(**tool_args) | ||
# If the tool result is a RawSearchResult, extract its records | ||
if hasattr(tool_result, "records"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can maybe enforce the fact that Tool.execute
must return a list of something (or list of strings?), or do we really want to create fake records?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll think about this one a bit.
src/neo4j_graphrag/tools/utils.py
Outdated
# arguments like query_text, top_k, etc., passed as keyword arguments. | ||
# The Tool's 'parameters' definition (e.g., ObjectParameter) ensures | ||
# that these arguments are provided in kwargs when Tool.execute is called. | ||
return retriever.get_search_results(**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using the search
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think dealing with the raw output is better in these cases, and let the ToolRetriever handle the formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each retriever is likely to return results in different format, I mean it will "often" be a neo4j Record, but with different keys, especially if using the "*CypherRetriever", so the only formatting we can do here is stringify these records, right? But we can let the formatting question for a later discussion, it's probably part of a wider decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, I guess that's correct. I updated the we call the .search()
method now instead and I do some result type checking. See if that look ok.
I think it 's better than it was at least :)
63f3348
to
3eb99e5
Compare
3eb99e5
to
b4e532e
Compare
src/neo4j_graphrag/tools/tool.py
Outdated
self._parameters = parameters | ||
|
||
self._execute_func = execute_func | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth checking parameters is really None here? The user could provide a different type (a list of something for instance) and silently not using it would lead to unexpected behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, makes it more robust. Updated.
self, | ||
driver: neo4j.Driver, | ||
llm: LLMInterface, | ||
tools: Sequence[Tool], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we checking somewhere that tool's name are unique? Or this is not a requirement for the LLM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, updated
- Add abstract get_parameters() method to Retriever base class - Add convert_to_tool() instance method to Retriever class - Implement get_parameters() for all concrete retriever classes - Remove automatic query_text injection in ToolsRetriever - Update example to use new convert_to_tool() method - Remove unnecessary description from ObjectParameter in example
- Remove redundant convert_retriever_to_tool function from utils.py - Add validation for tool name uniqueness in ToolsRetriever - Add parameter type validation in Tool constructor - Update convert_to_tool() to use search() method instead of get_search_results() - This ensures retriever result_formatter is applied for consistent formatting - Update ToolsRetriever to handle RetrieverResult objects from formatted tools - Create consistent record structure with tool attribution and metadata Fixes the result formatting inconsistency issue identified in PR review. Each tool now returns consistently formatted results while preserving the original retriever's formatting logic.
b4e532e
to
fc53af0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗳️
Description
In addition to the major parts of this PR, I've refactored and moved the
tool.py
file into its own directory to group relevant tool files together. This move touches a lot of files because of updated imports.ToolsRetriever
Now that we have tool support in the LLM interface, let's create a retriever that can use one or more tools to retrieve data.
This retriever follows the Retriever interface so it can be used within the GraphRAG class to get the full e2e experience (see example file).
The ToolsRetriever uses an LLM to decide on what tools to use to find the relevant data.
convert_to_tool ()
This method is a way to convert a
Retriever
to aTool
so it can be used within theToolsRetriever
. This is useful when you might want to have both aVectorRetriever
and aText2CypherRetreiver
as a fallback.See new example files for usage.
Type of Change
How Has This Been Tested?
Checklist
The following requirements should have been met (depending on the changes in the branch):