Having this cache and giving them out goes against our multithreading
guarantees and it makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
As we attempt to replicate a situation in which an older checkout has
put a file on disk with different filtering settings from us, set the
timestamp on the entry and file to a second before we're performing the
operation so the entry in the index counts as old.
This way we can test that we're not looking at the on-disk file when the
index has the entry and we detect it as clean.
This allows the user to look up fields which we don't parse in libgit2,
and allows them to access gpgsig or mergetag fields if they wish to
check the signature.
When a file on the workdir has the same or a newer timestamp than the
index, we need to perform a full check of the contents, as the update of
the file may have happened just after we wrote the index.
The iterator changes are such that we can reach inside the workdir
iterator from the diff, though it may be better to have an accessor
instead of moving these structs into the header.
When ticking over one second, it can happen that the actual time ticks
over the same second between the time that we undermine our own race
protections and the time in which we perform the index update. Such
timing would make the time in the entries match the index' timestamp and
we have not gained anything.
Ticking over five seconds makes it so that if real-time rolls over that
second, our index is still ahead. This is still suboptimal as we're
dealing with timing, but five seconds should be long enough for any
reasonable test runner to finish the tests.
These tests want to test that we don't recalculate entries which match
the index already. This is however something we force when truncating
racily-clean entries.
Tick the index forward as we know that we don't perform the
modifications which the racily-clean code is trying to avoid.
In order to avoid racy-git, we zero out the file size for entries with
the same timestamp as the index (or during the initial checkout). This
is the case in a couple of crlf tests, as the code is fast enough to do
everything in the same second.
As we know that we do not perform the modification just after writing
out the index, which is what this is designed to work around, tick the
mtime of the index file such that it doesn't agree with the files
anymore, and we do not zero out these entries.
We update the index and then immediately change the contents of the
file. This makes the diff think there are no changes, as the timestamp
of the file agrees with the cached data. This is however a bug, as the
file has obviously changed contents.
The test is a bit fragile, as it assumes that the index writing and the
following modification of the file happen in the same second, but it's
enough to show the issue.
Introduce a new binary diff callback to provide the actual binary
delta contents to callers. Create this data from the diff contents
(instead of directly from the ODB) to support binary diffs including
the workdir, not just things coming out of the ODB.
Some tools create multiple author fields. git is rather lax when parsing
them, although fsck does complain about them. This means that they exist
in the wild.
As it's not too taxing to check for them, and there shouldn't be a
noticeable slowdown when dealing with correct commits, add logic to skip
over these extra fields when parsing the commit.
When the callback returns an error, we should stop immediately. This
broke when trying to make sure we pass specific errors up the chain.
This broke cancelling out of the loose backend's foreach.
A remote's URLs are now modified according to the url.*.insteadOf
and url.*.pushInsteadOf configurations. This allows a user to
replace URL prefixes by setting the corresponding keys. E.g.
"url.foo.insteadOf = bar" would replace the prefix "bar" with the
new prefix "foo".
Treat input bytes as unsigned before doing arithmetic on them,
lest we look at some non-ASCII byte (like a UTF-8 character) as a
negative value and perform the comparison incorrectly.
We do not error on "merge conflicts"; on the contrary, merge conflicts
are a normal part of merging. We only error on "checkout conflicts",
where a change exists in the index or the working directory that would
otherwise be overwritten by performing the checkout.
This *may* happen during merge (after the production of the new index
that we're going to checkout) but it could happen during any checkout.
When confronted with a conflict in the index, `git_index_add_all`
should stage the working directory copy. If there is no file in the
working directory, the conflict should simply be removed.
It's not always obvious the mapping between stage level and
conflict-ness. More importantly, this can lead otherwise sane
people to write constructs like `if (!git_index_entry_stage(entry))`,
which (while technically correct) is unreadable.
Provide a nice method to help avoid such messy thinking.
Since a diff entry only concerns a single entry, zero the information
for the index side of a conflict. (The index entry would otherwise
erroneously include the lowest-stage index entry - generally the
ancestor of a conflict.)
Test that during status, the index side of the conflict is empty.
When diffing against an index, return a new `GIT_DELTA_CONFLICTED`
delta type for items that are conflicted. For a single file path,
only one delta will be produced (despite the fact that there are
multiple entries in the index).
Index iterators now have the (optional) ability to return conflicts
in the index. Prior to this change, they would be omitted, and callers
(like diff) would omit conflicted index entries entirely.
When we moved from acting on the instance to acting on the
configuration, we dropped the validation of the passed refspec, which
can lead to writing an invalid refspec to the configuration. Bring that
validation back.
When we look for which remote corresponds to a remote-tracking branch,
we look in the refspecs to see which ones matches. If none do, we should
abort. We currently ignore the error message from this operation, so
let's not do that anymore.
As part of the test we're writing, let's test for the expected behaviour
if we cannot find a refspec which tells us what the remote-tracking
branch for a remote would look like.
When we find out that we're dealing with a matching refspec, we set the
flag and return immediately. This leaves the strings as NULL, which
breaks the contract.
Assign these pointers to a string with the correct values.
When we discover that we want to keep a negative rule, make sure to
clear the error variable, as it we otherwise return whatever was left by
the previous loop iteration.
The code used to rely on the clone code calling the remote's save, which
does not happen anymore, meaning that the configuration settings the
remote expected were not being written to disk.
The run-time configuration was still being affected, so the right branch
was being cloned. The tests continued to pass as we did not check for
the configuration entires. Fix this by creating the remote with the
single-branch refspec we want and checking for its existence in the
configuration.
A couple of tests use the wrong remote to push to. We did not notice up
to now because the local push would copy individual objects, and those
already existed, so it became a no-op.
Once we made local push create the packfile, it became noticeable that
there was a new packfile where it didn't belong.
The base refspecs changing can be a cause of confusion as to what is the
current base refspec set and complicate saving the remote's
configuration.
Change `git_remote_add_{fetch,push}()` to update the configuration
instead of an instance.
This finally makes `git_remote_save()` a no-op, it will be removed in a
later commit.
While this will rarely be different from the default, having it in the
remote adds yet another setting it has to keep around and can affect its
behaviour. Move it to the options.
Instead of having it set in a different place from every other callback,
put it the main structure. This removes some state from the remote and
makes it behave more like clone, where the constructors are passed via
the options.
As a first step in removing the repository-saving logic, don't allow
chaning the url or push url from a remote object, but change the
configuration on the configuration immediately.
Having the setting be different from calling its actions was not a great
idea and made for the sake of the wrong convenience.
Instead of that, accept either fetch options, push options or the
callbacks when dealing with the remote. The fetch options are currently
only the callbacks, but more options will be moved from setters and
getters on the remote to the options.
This does mean passing the same struct along the different functions but
the typical use-case will only call git_remote_fetch() or
git_remote_push() and so won't notice much difference.
Ensure that when examining a .gitignore in a subdirectory, we do not
erroneously apply the paths contained therein to the root of the
repository. (Fixed in c02a0e4).
We have a few tests checking each step, but we do not yet have a test
which tests the documented workflow for creating a submodule, namely
`setup_add` followed by cloning into it, followed by `add_finalize`.
Add such a test to protect against regressions in this workflow.
Add a test that exposes a bug in config_write.
It is valid to have multiple separate headers for the same config section, but
config_write will exit after finding the first matching section in certain
situations.
This test proves that config_write will duplicate a variable that already
exists instead of overwriting it if the variable is defined under a duplicate
section header.
Previously we would try to be clever when writing the configuration
file and try to stop parsing (and simply copy the rest of the old
file) when we either found the value we were trying to write,
or when we left the section that value was in, the assumption being
that there was no more work to do.
Regrettably, you can have another section with the same name later
in the file, and we must cope with that gracefully, thus we read the
whole file in order to write a new file.
Now, writing a file looks even more than reading. Pull the config
parsing out into its own function that can be used by both reading
and writing the configuration.
On Mac OS, `realpath` is deficient in determining the actual filename
on-disk as it will simply provide the string you gave it if that file
exists, instead of returning the filename as it exists. Instead we
must read the directory entries for the parent directory to get the
canonical filename.
Ensure that on a case insensitive filesystem that we can checkout
into some folder 'FOLDER' that exists on disk, even if the target
of the checkout is a different case (eg 'folder').
On Windows, you might sloppily rewrite a file (or have a sloppy
text editor that does it for you) and accidentally change its
case. (eg, "README" -> "readme"). Git ignores this accidental
case changing rename during checkout and will happily write the
new content to the file despite the name change. We should, too.
The _next method shouldn't take a path pointer (and a path_len
pointer) as 100% of current users use the full path and ignore
the filename.
Plus let's add some docs and a unit test.
Moved offending tests from network to online so they will get skipped
when there is a lack of network connectivity:
-test_online_remotes__single_branch
-test_online_remotes__restricted_refspecs
Add a unittest to validate bug #3043, where a duplicate empty config header
could cause deletion of a config entry to fail silently. The bug is currently
unresolved and this test will fail.
The idea...sometimes, a filemode is user-specified via an
explicit git_index_entry. In this case, believe the user, always.
Sometimes, it is instead built up by statting the file system. In
those cases, go with the existing logic we have to determine
whether the file system supports all filemodes and symlinks, and
make the best guess.
On file systems which have full filemode and symlink support, this
commit should make no difference. On others (most notably Windows),
this will fix problems things like:
* git_index_add and git_index_add_frombuffer() should be believed.
* As a consequence, git_checkout_tree should make the filemodes in
the index match the ones in the tree.
* And diffs with GIT_DIFF_UPDATE_INDEX don't write the wrong filemodes.
* And merges, and probably other downstream stuff now fixed, too.
This makes my previous changes to checkout.c unnecessary,
so they are now reverted.
Also, added a test for index_entry permissions from git_index_add
and git_index_add_frombuffer, both of which failed before these changes.
In `git_rebase_operation_current()`, indicate when a rebase has not
started (with `GIT_REBASE_NO_OPERATION`) rather than conflating that
with the first operation being in-progress.
It can be useful for the caller to know which update commands will be
sent to the server before the packfile is pushed up. git does this via
the pre-push hook.
We don't have hooks, but as it adds introspection into what is
happening, we can add a callback which performs the same function.
The regcomp function returns a non-zero value if compilation of
a regular expression fails. In most places we only check for
negative values, but positive values indicate an error, as well.
Fix this tree-wide, fixing a segmentation fault when calling
git_config_iterator_glob_new with an invalid regexp.
When no reference names could be found we did error out when trying to describe
a commit. This is wrong, though, when the option to fall back to a commit's
object ID is set.
git_checkout_tree() has some fallback behaviors for file systems
which don't have full support of filemodes. Generally works fine,
but if a given file had a change of type from a 0644 to 0755 (i.e.,
you add executable permissions), the fallback behavior incorrectly
triggers when writing hte updated index.
This would cause a git_checkout_tree() command, even with the
GIT_CHECKOUT_FORCE option set, to leave a dirty index on Windows.
Also added checks to an existing test to catch this case.
When there is a tag, we must make sure that we get all referenced
objects from this tag as well. This failing test shows that e.g. when
there is a tagged tree, we insert the top tree but do not descend, thus
causing the clone to have broken links.
Since the Linux platform has a case sensitive file system, the header name should be lower case for cross compiling purposes. (On Linux, the mingw header is called ```windows.h```).
This was but down to 5 when GitHub made a change to their server which
made them stop honouring the include-tag request.
This has recently been corrected, so we can bring it back up to six.
Currently git_submodule_sync writes the submodule's URL to the
key 'branch.<REMOTE_NAME>.remote' while the reference
implementation of `git submodule sync` writes to
'remote.<REMOTE_NAME>.url', which is the intended behavior
according to git-submodule(1).
When we rename a reference, we want the old and new ids to be the same
one (as we did not change it). The normal code path looks up the old id
from the current value of the brtanch, but by the time we look it up, it
does not exist anymore and thus we write a zero id.
Pass the old id explicitly instead.
Clear the error message on git_libgit2_shutdown for all versions of
the library (no threads and Win32 threads). Drop the giterr_clear
in clar, as that shouldn't be necessary.
This changes the get_entry() method to return a refcounted version of
the config entry, which you have to free when you're done.
This allows us to avoid freeing the memory in which the entry is stored
on a refresh, which may happen at any time for a live config.
For this reason, get_string() has been forbidden on live configs and a
new function get_string_buf() has been added, which stores the string in
a git_buf which the user then owns.
The functions which parse the string value takea advantage of the
borrowing to parse safely and then release the entry.