We currently use the timestamp in order to decide whether a config file
has changed since we last read it.
This scheme falls down if the file is written twice within the same
second, as we fail to detect the file change after the first read in
that second.
Using calloc instead of malloc because the parse error will lead to an immediate free of committer (and its properties, which can segfault on free if undefined - test_refs_reflog_reflog__reading_a_reflog_with_invalid_format_returns_error segfaulted before the fix).
#3458
Provide a new merge option, GIT_MERGE_TREE_FAIL_ON_CONFLICT, which
will stop on the first conflict and fail the merge operation with
GIT_EMERGECONFLICT.
Test that nanoseconds are round-tripped correctly when we read
an index file that contains them. We should, however, ignore them
because we don't understand them, and any new entries in the index
should contain a `0` nsecs field, while existing preserving entries.
For most real use cases, repositories with alternates use them as main
object storage. Checking the alternate for objects before the main
repository should result in measurable speedups.
Because of this, we're changing the sorting algorithm to prioritize
alternates *in cases where two backends have the same priority*. This
means that the pack backend for the alternate will be checked before the
pack backend for the main repository *but* both of them will be checked
before any loose backends.
We moved the "main" parsing to use 64 bits for the timestamp, but the
quick parsing for the revwalk did not. This means that for large
timestamps we fail to parse the time and thus the walk.
Move this parser to use 64 bits as well.
xdiff craps the bed on large files. Treat very large files as binary,
so that it doesn't even have to try.
Refactor our merge binary handling to better match git.git, which
looks for a NUL in the first 8000 bytes.
As refdb and odb backends can be allocated by client code, libgit2
can’t know whether an alternative memory allocator was used, and thus
should not try to call `git__free` on those objects.
Instead, odb and refdb backend implementations must always provide
their own `free` functions to ensure memory gets freed correctly.
When we do not trust the on-disk mode, we use the mode of an existing
index entry. This allows us to preserve executable bits on platforms
that do not honor them on the filesystem.
If there is no stage 0 index entry, also look at conflicts to attempt
to answer this question: prefer the data from the 'ours' side, then
the 'theirs' side before falling back to the common ancestor.
git expects an empty line after the binary data:
literal X
...binary data...
<empty_line>
The last literal block of the generated patches were not containing the required empty line. Example:
diff --git a/binary_file b/binary_file
index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 6
Nc${NM%g@i}0ssZ|0lokL
diff --git a/binary_file2 b/binary_file2
index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 13
Sc${NMEKbZyOexL+Qd|HZV+4u-
git apply of that diff results in:
error: corrupt binary patch at line 9: diff --git a/binary_file2 b/binary_file2
fatal: patch with only garbage at line 10
The proper formating is:
diff --git a/binary_file b/binary_file
index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 6
Nc${NM%g@i}0ssZ|0lokL
diff --git a/binary_file2 b/binary_file2
index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
GIT binary patch
literal 8
Pc${NM&PdElPvrst3ey5{
literal 13
Sc${NMEKbZyOexL+Qd|HZV+4u-
This allows us to remove OS checks from source code, instead relying
on CMake to detect whether or not `struct stat` has the nanoseconds
members we rely on.
Test an initial submodule update, where we are trying to checkout
the submodule for the first time, and placing a file within the
submodule working directory with the same name as the submodule
(and consequently, the same name as the repository itself).
`git_futils_mkdir` does not blindly call `git_futils_mkdir_relative`.
`git_futils_mkdir_relative` is used when you have some base directory
and want to create some path inside of it, potentially removing blocking
symlinks and files in the process. This is not suitable for a general
recursive mkdir within the filesystem.
Instead, when `mkdir` is being recursive, locate the first existent
parent directory and use that as the base for `mkdir_relative`.
Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter
assumes that we own everything beneath the base, as if it were
being called with a base of the repository or working directory,
and is tailored towards checkout and ensuring that there is no
bogosity beneath the base that must be cleaned up.
This is (at best) slow and (at worst) unsafe in the larger context
of a filesystem where we do not own things and cannot do things like
unlink symlinks that are in our way.
When a file exists on disk and we're checking out a file that differs
in executableness, remove the old file. This allows us to recreate the
new file with p_open, which will take the new mode into account and
handle setting the umask properly.
Remove any notion of chmod'ing existing files, since it is now handled
by the aforementioned removal and was incorrect, as it did not take
umask into account.
Ensure that we can iterate the filesystem root and that paths come
back well-formed, not with an additional '/'. (eg, when iterating
`c:/`, expect that we do not get some path like `c://autoexec.bat`).
The previous commit left the comment referencing the earlier state of
the code, change it to explain the current logic. While here, change the
logic to avoid repeating the copy of the base pattern.
These are small pieces of data, so there is no advantage to allocating
them separately. Include the two ids inline in the struct we use to
check that the expected and actual ids match.
On case insensitive platforms, allow `git_index_add` to provide a new
path for an existing index entry. Previously, we would maintain the
case in an index entry without the ability to change it (except by
removing an entry and re-adding it.)
Higher-level functions (like `git_index_add_bypath` and
`git_index_add_frombuffers`) continue to keep the old path for easier
usage.
On case insensitive systems, when given a user-provided path in the
higher-level index addition functions (eg `git_index_add_bypath` /
`git_index_add_frombuffer`), examine the index to try to match the
given path to an existing directory.
Various mechanisms can cause the on-disk representation of a folder
to not match the representation in HEAD or the index - for example,
a case changing rename of some file `a/file.txt` to `A/file.txt`
will update the paths in the index, but not rename the folder on
disk.
If a user subsequently adds `a/other.txt`, then this should be stored
in the index as `A/other.txt`.
We create a lockfile to update files under GIT_DIR. Sometimes these
files are actually located elsewhere and a symlink takes their place. In
that case we should lock and update the file at its final location
rather than overwrite the symlink.
Some nicer refactoring for index iteration walks.
The index iterator doesn't binary search through the pathlist space,
since it lacks directory entries, and would have to binary search
each index entry and all its parents (eg, when presented with an index
entry of `foo/bar/file.c`, you would have to look in the pathlist for
`foo/bar/file.c`, `foo/bar` and `foo`). Since the index entries and the
pathlist are both nicely sorted, we walk the index entries in lockstep
with the pathlist like we do for other iteration/diff/merge walks.
When using literal pathspecs in diff with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`
turn on the faster iterator pathlist handling.
Updates iterator pathspecs to include directory prefixes (eg, `foo/`)
for compatibility with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`.
Document that `GIT_DIFF_PATHSPEC_DISABLE` is not necessarily about
explicit path matching, but also includes matching of directory
names. Enforce this in a test.
When given a pathlist, don't assume that directories sort before
files. Walk through any list of entries sorting before us to make
sure that we've exhausted all entries that *aren't* directories.
Eg, if we're searching for 'foo/bar', and we have a 'foo.c', keep
advancing the pathlist to keep looking for an entry prefixed with
'foo/'.
When parsing user-provided regex patterns for functions, we must not
fail to provide a diff just because a pattern is not well
formed. Ignore it instead.
This lock/unlock pair allows for the cller to lock a configuration file
to avoid concurrent operations.
It also allows for a transactional approach to updating a configuration
file. If multiple updates must be made atomically, they can be done
while the config is locked.
When a configuration file is locked, any updates made to it will be done
to the in-memory copy of the file. This allows for multiple updates to
happen while we hold the lock, preventing races during complex
config-file manipulation.
While we download the remote's remote-tracking branches, we don't
download the tag. This points to the tag auto-follow rules interfering
with the refspec.
When we pass the path of a repository to `_bypath()`, we should behave
like git and stage it as a `_COMMIT` regardless of whether it is
registered a a submodule.