-
Notifications
You must be signed in to change notification settings - Fork 803
[v6] plumbing: format/packfile, Optimise packfile delta processing #1523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This may also fix #1451. I will test it against my workloads when I have time. Edit: Nevermind, it doesn't. |
@@ -0,0 +1,17 @@ | |||
package ioutil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is very confusingly named, since it's also the name of the deprecated io/ioutil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@runxiyu this is an existing package, renaming it is outside of the scope of this PR.
@@ -135,7 +135,7 @@ func TestResolveExternalRefsInThinPack(t *testing.T) { | |||
|
|||
checksum, err := parser.Parse() | |||
assert.NoError(t, err) | |||
assert.NotEqual(t, plumbing.ZeroHash, checksum) | |||
assert.NotEqual(t, checksum, plumbing.ZeroHash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this make a difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is largely to align with the assertion construct, whereby the second argument is the expected whereas the third is the actual.
The new func abstracts away the use of sync while managing the buffers used for io.CopyBuffer calls. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Ensures that: - Slices are at least of the initial length. - No data is kept between Put and Get operations. - The slice size is increased to 32kb. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Avoid on-demand allocation of buffers to hold parent content by reusing buffers from sync. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The memory churn for processing delta objects is quite high, due to the performance optimisation which keeps in memory all the inflated contents of each object. The new default only keeps in memory the contents of the parents needed to resolve the contents of the delta being processed. The buffers used for all the contents are reused to avoid unnecessary churn. Users can opt-out by using WithHighMemoryUsage() when calling the Parser. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The initial low-memory mode did not work well on specific scenarios which was leading to some tests to fail. The changes resolve that and rename the HighMemoryUsage option to HighMemoryMode instead. It introduces the LowMemoryCapable interface, which enables storage implementations to opt-in/out of this mode. When a storage does not implement that interface, high-memory mode would be the default. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
In filesystem storage, the packfile caching is only used if KeepDescriptors is enabled. This should make overall operations using filesystem storage more efficient. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The memory churn for processing delta objects was quite high, due to the performance optimisation which kept in memory all the inflated contents of each object.
The changes introduced allows for supporting storage implementations (i.e.
storage/filesystem
) to only keep in memory the contents of the parents needed to resolve the contents of the delta being processed. The buffers used for all the contents are reused to avoid unnecessary churn.Users can opt-out by using
WithHighMemoryUsage()
when creating the Parser, orHighMemoryMode
in the storage option.A new
ioutil.Copy
was introduced to simplify the process of reusing buffers during Copy operations.KeepDescriptors
by default, which enables the same PackFile being re-parsed several times during a given git operation. This change is still being tested for its impact.Fixes #1188.
Benchmarks comparing
v6-transport
with this PR while cloningkubernetes/kubernetes
:Filesystem storage (2.3GB less memory churn)
Memory storage (1GB less memory churn)