Having this cache and giving them out goes against our multithreading
guarantees and it makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
When updating the index during a diff, preserve the original mode,
which prevents us from dropping the mode to what we have interpreted
as on our system (eg, what the working directory claims it to be,
which may be a lie on some systems.)
This is used by the submodule in order to figure out if the index has
changed since it last read it. Using a timestamp is racy, so let's make
it use the checksum, just like we now do for reloading the index itself.
When ticking over one second, it can happen that the actual time ticks
over the same second between the time that we undermine our own race
protections and the time in which we perform the index update. Such
timing would make the time in the entries match the index' timestamp and
we have not gained anything.
Ticking over five seconds makes it so that if real-time rolls over that
second, our index is still ahead. This is still suboptimal as we're
dealing with timing, but five seconds should be long enough for any
reasonable test runner to finish the tests.
We currently use a timetamp to check whether an index file has been
modified since we last read it, but this is racy. If two updates happen
in the same second and we read after the first one, we won't detect the
second one.
Instead read the SHA-1 checksum of the file, which are its last 20 bytes which
gives us a sure-fire way to detect whether the file has changed since we
last read it.
As we're now keeping track of it, expose an accessor to this data.
Credits to @directhex
It is possible for PKG_CHECK_MODULES(LIBSSH2 libssh2) to LIBSSH2_LIBRARIES to a string with more than one library in it - e.g. if your libssh2 was built against libgcrypt, it will be "ssh2;gcrypt"
Quoting the string is needed, or CHECK_LIBRARY_EXISTS will fail.
When checking out some file 'foo' that has been modified in the
working directory, allow the checkout to proceed (do not conflict)
if 'foo' is identical to the target of the checkout.
These tests want to test that we don't recalculate entries which match
the index already. This is however something we force when truncating
racily-clean entries.
Tick the index forward as we know that we don't perform the
modifications which the racily-clean code is trying to avoid.
In order to avoid racy-git, we zero out the file size for entries with
the same timestamp as the index (or during the initial checkout). This
is the case in a couple of crlf tests, as the code is fast enough to do
everything in the same second.
As we know that we do not perform the modification just after writing
out the index, which is what this is designed to work around, tick the
mtime of the index file such that it doesn't agree with the files
anymore, and we do not zero out these entries.
If a file entry has the same timestamp as the index itself, it is
considered racily-clean, as it may have been modified after the index
was written, but during the same second. We take extra steps to check
the contents, but this is just one part of avoiding races.
For files which do have changes but have not been updated in the index,
updating the on-disk index means updating its timestamp, which means we
would no longer recognise these entries as racy and we would trust the
timestamp to tell us whether they have changed.
In order to work around this, git zeroes out the file-size field in
entries with the same timestamp as the index in order to force the next
diff to check the contents. Do so in libgit2 as well.
We update the index and then immediately change the contents of the
file. This makes the diff think there are no changes, as the timestamp
of the file agrees with the cached data. This is however a bug, as the
file has obviously changed contents.
The test is a bit fragile, as it assumes that the index writing and the
following modification of the file happen in the same second, but it's
enough to show the issue.
Arguably all uses of readdir_r are unnecessary, but in this case
especially so, as the directory handle only exists within this function,
so we don't race with anybody.
Introduce a new binary diff callback to provide the actual binary
delta contents to callers. Create this data from the diff contents
(instead of directly from the ODB) to support binary diffs including
the workdir, not just things coming out of the ODB.