For most real use cases, repositories with alternates use them as main
object storage. Checking the alternate for objects before the main
repository should result in measurable speedups.
Because of this, we're changing the sorting algorithm to prioritize
alternates *in cases where two backends have the same priority*. This
means that the pack backend for the alternate will be checked before the
pack backend for the main repository *but* both of them will be checked
before any loose backends.
In the current implementation of ODB backends, each backend is tasked
with refreshing itself after a failed lookup. This is standard Git
behavior: we want to e.g. reload the packfiles on disk in case they have
changed and that's the reason we can't find the object we're looking
for.
This behavior, however, becomes pathological in repositories where
multiple alternates have been loaded. Given that each alternate counts
as a separate backend, a miss in the main repository (which can
potentially be very frequent in cases where object storage comes from
the alternate) will result in refreshing all its packfiles before we
move on to the alternate backend where the object will most likely be
found.
To fix this, the code in `odb.c` has been refactored as to perform the
refresh of all the backends externally, once we've verified that the
object is nowhere to be found.
If the refresh is successful, we then perform the lookup sequentially
through all the backends, skipping the ones that we know for sure
weren't refreshed (because they have no refresh API).
The on-disk pack backend has been adjusted accordingly: it no longer
performs refreshes internally.
xdiff craps the bed on large files. Treat very large files as binary,
so that it doesn't even have to try.
Refactor our merge binary handling to better match git.git, which
looks for a NUL in the first 8000 bytes.
As refdb and odb backends can be allocated by client code, libgit2
can’t know whether an alternative memory allocator was used, and thus
should not try to call `git__free` on those objects.
Instead, odb and refdb backend implementations must always provide
their own `free` functions to ensure memory gets freed correctly.
SSL_shutdown() does not like it when we pass an unitialized ssl context
to it. This means that when we fail to connect to a host, we hide the
error message saying so with OpenSSL's indecipherable error message.
git expects an empty line after the binary data:
literal X
...binary data...
<empty_line>
The last literal block of the generated patches were not containing the required empty line. Example:
diff --git a/binary_file b/binary_file
index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 6
Nc${NM%g@i}0ssZ|0lokL
diff --git a/binary_file2 b/binary_file2
index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 13
Sc${NMEKbZyOexL+Qd|HZV+4u-
git apply of that diff results in:
error: corrupt binary patch at line 9: diff --git a/binary_file2 b/binary_file2
fatal: patch with only garbage at line 10
The proper formating is:
diff --git a/binary_file b/binary_file
index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 6
Nc${NM%g@i}0ssZ|0lokL
diff --git a/binary_file2 b/binary_file2
index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 13
Sc${NMEKbZyOexL+Qd|HZV+4u-
If the header doesn't look like a header (e.g. if it doesn't have a ":"
or if it has newlines), report "custom HTTP header '%s' is malformed".
If the header has the same name as a header already set by libgit2 (e.g.
"Host"), report "HTTP header '%s' is already set by libgit2".
Check that the repository directory is beneath the workdir before
adding it to the list of reserved paths. If it is not, then there
is no possibility of checking out files into it, and it should not
be a reserved word.
This is a particular problem with submodules where the repo directory
may be in the super's .git directory.
Test an initial submodule update, where we are trying to checkout
the submodule for the first time, and placing a file within the
submodule working directory with the same name as the submodule
(and consequently, the same name as the repository itself).
When there is a comment at the end of a section, git keeps it there,
while we write the new variable right at the end.
Keep comments buffered and dump them when we're going to output a
variable or section, or reach EOF. This puts us in line with the config
files which git produces.
Before:
libdir=/usr//usr/lib64
includedir=/usr//usr/include
After:
libdir=/usr/lib64
includedir=/usr/include
(note the duplication of /usr in the before case)
Don't coalesce all errors into ENOENT. At least identify EACCES.
All callers should be handling this case already, as the POSIX
`lstat` will return this.