Skip to content

Add ToolsRetriever class and Retriever.convert_to_tool() method #332

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

oskarhane
Copy link
Member

@oskarhane oskarhane commented May 8, 2025

Description

In addition to the major parts of this PR, I've refactored and moved the tool.py file into its own directory to group relevant tool files together. This move touches a lot of files because of updated imports.

ToolsRetriever

Now that we have tool support in the LLM interface, let's create a retriever that can use one or more tools to retrieve data.

This retriever follows the Retriever interface so it can be used within the GraphRAG class to get the full e2e experience (see example file).

The ToolsRetriever uses an LLM to decide on what tools to use to find the relevant data.

calendar_tool = CalendarTool()
weather_tool = WeatherTool()

llm = OpenAILLM(...)
driver = GraphDatabase.driver(...)

tools_retriever = ToolsRetriever(
    driver=driver,
    llm=llm,
    tools=[calendar_tool, weather_tool],
)

graphrag = GraphRAG(
    llm=llm,
    retriever=tools_retriever,
)

result = graphrag.search(query_text="Tell me about tomorrow", return_context=False)

convert_to_tool ()

This method is a way to convert a Retriever to a Tool so it can be used within the ToolsRetriever. This is useful when you might want to have both a VectorRetriever and a Text2CypherRetreiver as a fallback.
See new example files for usage.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@oskarhane oskarhane requested a review from a team as a code owner May 8, 2025 10:47

def __init__(
self,
driver: neo4j.Driver,
Copy link
Contributor

@stellasia stellasia May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider updating the Retriever interface if this one does not use the driver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure tbh. I think the tools retriever is an exception so I think we can keep it for now.

# Execute the tool with the provided arguments
tool_result = selected_tool.execute(**tool_args)
# If the tool result is a RawSearchResult, extract its records
if hasattr(tool_result, "records"):
Copy link
Contributor

@stellasia stellasia May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can maybe enforce the fact that Tool.execute must return a list of something (or list of strings?), or do we really want to create fake records?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll think about this one a bit.

# arguments like query_text, top_k, etc., passed as keyword arguments.
# The Tool's 'parameters' definition (e.g., ObjectParameter) ensures
# that these arguments are provided in kwargs when Tool.execute is called.
return retriever.get_search_results(**kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using the search method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think dealing with the raw output is better in these cases, and let the ToolRetriever handle the formatting.

Copy link
Contributor

@stellasia stellasia Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each retriever is likely to return results in different format, I mean it will "often" be a neo4j Record, but with different keys, especially if using the "*CypherRetriever", so the only formatting we can do here is stringify these records, right? But we can let the formatting question for a later discussion, it's probably part of a wider decision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, I guess that's correct. I updated the we call the .search() method now instead and I do some result type checking. See if that look ok.
I think it 's better than it was at least :)

@oskarhane oskarhane force-pushed the tools-retriever-3 branch from 63f3348 to 3eb99e5 Compare June 18, 2025 15:45
@oskarhane oskarhane changed the title Add ToolsRetriever class and convert_retriever_to_tool() function Add ToolsRetriever class and Retriever.convert_to_tool() method Jun 18, 2025
@oskarhane oskarhane force-pushed the tools-retriever-3 branch from 3eb99e5 to b4e532e Compare June 18, 2025 20:52
@oskarhane oskarhane requested a review from stellasia June 18, 2025 21:01
self._parameters = parameters

self._execute_func = execute_func
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth checking parameters is really None here? The user could provide a different type (a list of something for instance) and silently not using it would lead to unexpected behavior

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, makes it more robust. Updated.

self,
driver: neo4j.Driver,
llm: LLMInterface,
tools: Sequence[Tool],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we checking somewhere that tool's name are unique? Or this is not a requirement for the LLM?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated

- Add abstract get_parameters() method to Retriever base class
- Add convert_to_tool() instance method to Retriever class
- Implement get_parameters() for all concrete retriever classes
- Remove automatic query_text injection in ToolsRetriever
- Update example to use new convert_to_tool() method
- Remove unnecessary description from ObjectParameter in example
- Remove redundant convert_retriever_to_tool function from utils.py
- Add validation for tool name uniqueness in ToolsRetriever
- Add parameter type validation in Tool constructor
- Update convert_to_tool() to use search() method instead of get_search_results()
- This ensures retriever result_formatter is applied for consistent formatting
- Update ToolsRetriever to handle RetrieverResult objects from formatted tools
- Create consistent record structure with tool attribution and metadata

Fixes the result formatting inconsistency issue identified in PR review.
Each tool now returns consistently formatted results while preserving
the original retriever's formatting logic.
@oskarhane oskarhane force-pushed the tools-retriever-3 branch from b4e532e to fc53af0 Compare June 23, 2025 10:02
@oskarhane oskarhane requested a review from stellasia June 23, 2025 10:08
Copy link
Contributor

@stellasia stellasia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗳️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants