Skip to content

Improve Status performance with many ignored files #1379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: v6-exp
Choose a base branch
from

Conversation

silkeh
Copy link

@silkeh silkeh commented Jan 14, 2025

This PR contains two changes that significantly improve the performance of WorkTree.Status() (from 491 s to 5 s):

  • Skip ignored files when walking through the worktree.
    This signigifantly improves the performance of Status():
    In a repository with 3M ignored files Status now takes 5 s instead of 160 s.
  • Fix loading of .gitignore files in nested ignored directories.
    These were not matched in the current implementation, as that only checks the ignores of the current directory.
    As a side-effect this change also leads to significantly improved performance: in a repository with 3M ignored files Status now takes 160 s instead of 491 s.

I have a backport to master ready if desired.

Fix loading of `.gitignore` files in nested ignored directories.
These were not matched in the current implementation,
as that only checks the ignores of the current directory.

As a side-effect this change also leads to significantly improved performance:
in a repository with 3M ignored files `Status` now takes 160 s instead of 491 s.
Skip ignored files when walking through the worktree.

This signigifantly improves the performance of `Status()`:
In a repository with 3M ignored files `Status` now takes 5 s instead of 160 s.
@pjbgf
Copy link
Member

pjbgf commented Jan 15, 2025

@silkeh thanks for proposing this PR. Although this may improve the use case of multiple ignored files, it seems to cause a regression when that is not the case. I did a quick benchmark testing a call to Status() on a copy of the go-git repo, and these changes seem to have a negative impact on both allocation numbers and time per operation:

BenchmarkStatus-16 (v5.13.0)      	      48	  21514427 ns/op	19511754 B/op	   69999 allocs/op
BenchmarkStatus-16 (this PR)      	      34	  34658273 ns/op	42274701 B/op	  106194 allocs/op

I am struggling for time to pinpoint optimisation areas in the PR, would you be able to optimise it for the standard use case as well?

@pjbgf pjbgf added this to the v6.0.0 milestone Jan 15, 2025
@silkeh
Copy link
Author

silkeh commented Jan 15, 2025

@pjbgf: Thanks for taking a look! I'll take a look later this week and see what I can improve.

@onee-only
Copy link
Contributor

Hello @silkeh, Are you still working on this? This could be a big improvement for #181.

@silkeh
Copy link
Author

silkeh commented Jun 13, 2025

@onee-only: That was the plan, but I've been busy with other things. I'll try to make some time next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants