We want a predictable number of initializations in our multithreaded
init test, but we also want to make sure that we have _actually_
initialized `git_libgit2_init` before calling `git_thread_create` (since
it now has a sanity check that `git_libgit2_init` has been called).
Since `git_thread_create` is internal-only, keep this sanity check.
Flip the invocation so that we `git_libgit2_init` before our thread
tests and `git_libgit2_shutdown` again after.
Introduce `git_thread_exit`, which will allow threads to terminate at an
arbitrary time, returning a `void *`. On Windows, this means that we
need to store the current `git_thread` in TLS, so that we can set its
`return` value when terminating.
We cannot simply use `ExitThread`, since Win32 returns `DWORD`s from
threads; we return `void *`.
On Windows we can find locked files even when reading a reference or the
packed-refs file. Bubble up the error in this case as well to allow
callers on Windows to retry more intelligently.
We say it's going to work if you use a different repository in each
thread. Let's do precisely that in our code instead of hoping re-using
the refdb is going to work.
This test does fail currently, surfacing existing bugs.
When trying to find a discovery, we walk up the directory
structure checking if there is a ".git" file or directory and, if
so, check its validity. But in the case that we've got a ".git"
file, we do not want to unconditionally assume that the file is
in fact a ".git" file and treat it as such, as we would error out
if it is not.
Fix the issue by only treating a file as a gitlink file if it
ends with "/.git". This allows users of the function to discover
a repository by handing in any path contained inside of a git
repository.
The code correctly detects that forced creation of a branch on a
nonbare repo should not be able to overwrite a branch which is
the HEAD reference. But there's no reason to prevent this on
a bare repo, and in fact, git allows this. I.e.,
git branch -f master new_sha
works on a bare repo with HEAD set to master. This change fixes
that problem, and updates tests so that, for this case, both the
bare and nonbare cases are checked for correct behavior.
Exercise the logic surrounding deinitialization of the libgit2
library as well as repeated concurrent de- and reinitialization.
This tries to catch races and makes sure that it is possible to
reinitialize libgit2 multiple times.
After deinitializing libgit2, we have to make sure to setup
options required for testing. Currently, this only includes
setting up the configuration search path again. Before, this has
been set up once in `tests/main.c`.
The `git_pqueue` struct allows being fixed in its total number of
entries. In this case, we simply throw away items that are
inserted into the priority queue by examining wether the new item
to be inserted has a higher priority than the previous smallest
one.
This feature somewhat contradicts our pqueue implementation in
that it is allowed to not have a comparison function. In fact, we
also fail to check if the comparison function is actually set in
the case where we add a new item into a fully filled fixed-size
pqueue.
As we cannot determine which item is the smallest item in absence
of a comparison function, we fix the `NULL` pointer dereference
by simply dropping all new items which are about to be inserted
into a full fixed-size pqueue.
`xlocale.h` only defines `regcomp_l` if `regex.h` was included as well.
Also change the test cases to actually test `p_regcomp` works with
a multibyte locale.
After porting over the commit hiding and selection we were still left
with mistmaching output due to the topologial sort.
This ports the topological sorting code to make us match with our
equivalent of `--date-order` and `--topo-order` against the output
from `rev-list`.
This is a convenience function to reverse the contents of a vector and a pqueue
in-place.
The pqueue function is useful in the case where we're treating it as a
LIFO queue.
We had some home-grown logic to figure out which objects to show during
the revision walk, but it was rather inefficient, looking over the same
list multiple times to figure out when we had run out of interesting
commits. We now use the lists in a smarter way.
We also introduce the slop mechanism to determine when to stpo
looking. When we run out of interesting objects, we continue preparing
the walk for another 5 rounds in order to make it less likely that we
miss objects in situations with complex graphs.
When creating and printing diffs, deal with binary deltas that have
binary data specially, versus diffs that have a binary file but lack the
actual binary data.
According to the reference the git_checkout_tree and git_checkout_head
functions should accept NULL in the opts field
This was broken since the opts field was dereferenced and thus lead to a
crash.
The .gitignore file allows for patterns which unignore previous
ignore patterns. When unignoring a previous pattern, there are
basically three cases how this is matched when no globbing is
used:
1. when a previous file has been ignored, it can be unignored by
using its exact name, e.g.
foo/bar
!foo/bar
2. when a file in a subdirectory has been ignored, it can be
unignored by using its basename, e.g.
foo/bar
!bar
3. when all files with a basename are ignored, a specific file
can be unignored again by specifying its path in a
subdirectory, e.g.
bar
!foo/bar
The first problem in libgit2 is that we did not correctly treat
the second case. While we verified that the negative pattern
matches the tail of the positive one, we did not verify if it
only matches the basename of the positive pattern. So e.g. we
would have also negated a pattern like
foo/fruz_bar
!bar
Furthermore, we did not check for the third case, where a
basename is being unignored in a certain subdirectory again.
Both issues are fixed with this commit.
Support reading and writing index v4. Index v4 uses a very simple
compression scheme for pathnames, but is otherwise similar to index v3.
Signed-off-by: David Turner <dturner@twitter.com>
Only provide the empty tree internally, which matches git's behavior.
If we provide the empty blob then any users trying to write it with
libgit2 would omit it from actually landing in the odb, which appear
to git proper as a broken repository (missing that object).
Since writing multiple objects may all already exist in a single
packfile, avoid freshening that packfile repeatedly in a tight loop.
Instead, only freshen pack files every 2 seconds.
According to git-fetch(1), "[t]he colon can be omitted when <dst>
is empty." So according to git, the refspec "refs/heads/master"
is the same as the refspec "refs/heads/master:" when fetching
changes. When trying to fetch from a remote with a trailing
colon with libgit2, though, the fetch actually fails while it
works when the trailing colon is left out. So obviously, libgit2
does _not_ treat these two refspec formats the same for fetches.
The problem results from parsing refspecs, where the resulting
refspec has its destination set to an empty string in the case of
a trailing colon and to a `NULL` pointer in the case of no
trailing colon. When passing this to our DWIM machinery, the
empty string gets translated to "refs/heads/", which is simply
wrong.
Fix the problem by having the parsing machinery treat both cases
the same for fetch refspecs.
When passing in a specific suite which should be executed by clar
via `-stest::suite`, we try to parse this string and then include
all tests contained in this suite. This also includes all tests
in sub-suites, e.g. 'test::suite::foo'.
In the case where multiple suites start with the same _string_,
for example 'test::foo' and 'test::foobar', we fail to
distinguish this correctly. When passing in `-stest::foobar`,
we wrongly determine that 'test::foo' is a prefix and try to
execute all of its matching functions. But as no function
will now match 'test::foobar', we simply execute nothing.
To fix this, we instead have to check if the prefix is an actual
suite prefix as opposed to a simple string prefix. We do so by by
inspecting if the first two characters trailing the prefix are
our suite delimiters '::', and only consider the filter as
matching in this case.
Ensure that we include conflicts when calling `git_index_read_index`,
which will remove conflicts in the index that do not exist in the new
target, and will add conflicts from the new target.
When showing copy information because we are duplicating contents,
for example, when performing a `diff --find-copies-harder -M100 -B100`,
then show copy from/to lines in a patch, and do not show context.
Ensure that we can also parse such patches.
git_repository_open_ext provides parameters for the start path, whether
to search across filesystems, and what ceiling directories to stop at.
git commands have standard environment variables and defaults for each
of those, as well as various other parameters of the repository. To
avoid duplicate environment variable handling in users of libgit2, add a
GIT_REPOSITORY_OPEN_FROM_ENV flag, which makes git_repository_open_ext
automatically handle the appropriate environment variables. Commands
that intend to act just like those built into git itself can use this
flag to get the expected default behavior.
git_repository_open_ext with the GIT_REPOSITORY_OPEN_FROM_ENV flag
respects $GIT_DIR, $GIT_DISCOVERY_ACROSS_FILESYSTEM,
$GIT_CEILING_DIRECTORIES, $GIT_INDEX_FILE, $GIT_NAMESPACE,
$GIT_OBJECT_DIRECTORY, and $GIT_ALTERNATE_OBJECT_DIRECTORIES. In the
future, when libgit2 gets worktree support, git_repository_open_env will
also respect $GIT_WORK_TREE and $GIT_COMMON_DIR; until then,
git_repository_open_ext with this flag will error out if either
$GIT_WORK_TREE or $GIT_COMMON_DIR is set.
GIT_REPOSITORY_OPEN_NO_SEARCH does not search up through parent
directories, but still tries the specified path both directly and with
/.git appended. GIT_REPOSITORY_OPEN_BARE avoids appending /.git, but
opens the repository in bare mode even if it has a working directory.
To support the semantics git uses when given $GIT_DIR in the
environment, provide a new GIT_REPOSITORY_OPEN_NO_DOTGIT flag to not try
appending /.git.
git only checks ceiling directories when its search ascends to a parent
directory. A ceiling directory matching the starting directory will not
prevent git from finding a repository in the starting directory or a
parent directory. libgit2 handled the former case correctly, but
differed from git in the latter case: given a ceiling directory matching
the starting directory, but no repository at the starting directory,
libgit2 would stop the search at that point rather than finding a
repository in a parent directory.
Test case using git command-line tools:
/tmp$ git init x
Initialized empty Git repository in /tmp/x/.git/
/tmp$ cd x/
/tmp/x$ mkdir subdir
/tmp/x$ cd subdir/
/tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x git rev-parse --git-dir
fatal: Not a git repository (or any of the parent directories): .git
/tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x/subdir git rev-parse --git-dir
/tmp/x/.git
Fix the testsuite to test this case (in one case fixing a test that
depended on the current behavior), and then fix find_repo to handle this
case correctly.
In the process, simplify and document the logic in find_repo():
- Separate the concepts of "currently checking a .git directory" and
"number of iterations left before going further counts as a search"
into two separate variables, in_dot_git and min_iterations.
- Move the logic to handle in_dot_git and append /.git to the top of the
loop.
- Only search ceiling_dirs and find ceiling_offset after running out of
min_iterations; since ceiling_offset only tracks the longest matching
ceiling directory, if ceiling_dirs contained both the current
directory and a parent directory, this change makes find_repo stop the
search at the parent directory.
Avoid declaring old-style functions without any parameters.
Functions not accepting any parameters should be declared with
`void fn(void)`. See ISO C89 $3.5.4.3.
Read a tree into an index using `git_index_read_index` (by reading
a tree into a new index, then reading that index into the current
index), then write the index back out, ensuring that our new index
is treesame to the tree that we read.
Test with some postimages that actually grow/shrink from the
original, adding new lines or removing them. (Also do so without
context to ensure that we can add/remove from a non-zero part of
the line vector.)
Parse values up to and including `\377` (`0xff`) when unquoting.
Print octal values as an unsigned char when quoting, lest `printf`
think we're talking about negatives.
Patches can now come from a variety of sources - either internally
generated (from diffing two commits) or as the results of parsing
some external data.
When we are provided some input buffer (with a length) to inflate,
and it contains more data than simply the deflated data, fail.
zlib will helpfully tell us when it is done reading (via Z_STREAM_END),
so if there is data leftover in the input buffer, fail lest we
continually try to inflate it.
Handle the application of binary patches. Include tests that
produce a binary patch (an in-memory `git_patch` object),
then enusre that the patch applies correctly.
Instead of going through the usual steps of reading a tree recursively
into an index, modifying it and writing it back out as a tree, introduce
a function to perform simple updates more efficiently.
`git_tree_create_updated` avoids reading trees which are not modified
and supports upsert and delete operations. It is not as versatile as
modifying the index, but it makes some common operations much more
efficient.
`test_commit_commit__create_initial_commit_parent_not_current` was not correctly
testing that `HEAD` was not changed. Now we grab the oid that it was pointing to
before the call to `git_commit_create` and the oid that it's pointing to afterwards
and compare those.
While no extra header fields are defined for tags, git accepts them by
ignoring them and continuing the search for the message. There are a few
tags like this in the wild which git parses just fine, so we should do
the same.
In order to match the star-star, we disable the flag that's looking for
a single path element, but that leads to searching for the pattern in
the middle of elements in the input string.
Mark when we're handing a star-star so we jump over the elements in our
attempt to match the part of the pattern that comes after the star-star.
While here, tighten up the check so we don't allow invalid rules
through.
When we're dealing with proxy addresses, we only want a hostname and
port, and the user would not provide a path, so make it optional so we
can use this same function to parse git as well as proxy URLs.
When running as root, skip the unreadable file tests, because, well,
they're probably _not_ unreadable to root unless you've got some
crazy NSA clearance-level honoring operating system shit going on.
When we turned strict object creation validation on by default, we
forgot to inform the refs::create tests of this. They, in fact,
believed that strict object creation was off by default. As a result,
their cleanup function went and turned strict object creation off for
the remaining tests.
If we cannot dwim the input, set the error message to be explicit about
that. Otherwise we leave the error for the last failed lookup, which
can be rather unexpected as it mentions a remote when the user thought
they were trying to look up a branch.
When passing -DUSE_OPENSSL:BOOL=OFF to cmake the testsuite will
fail with the following error:
core::stream::register_tls [/tmp/libgit2/tests/core/stream.c:40]
Function call failed: (error)
error -1 - <no message>
Fix test to assume failure for tls when built without openssl.
While at it also fix GIT_WIN32 cpp to check if it's defined
or not.
Allow callers to specify a start path with a trailing slash to match
a submodule, instead of just a directory. This is for some legacy
behavior that's sort of dumb, but there it is.
If we're looking for a symlink, realpath will give us the resolved path,
which is not what we're after, but a canonicalized version of the path
the user asked for.
Iterator tests were split over repo::iterator and diff::iterator,
with duplication between the two. Move them to iterator::index,
iterator::tree, and iterator::workdir.
Prior iterator implementations returned `GIT_ENOTFOUND` when
trying to advance into empty directories. Ensure that we no longer
do that and simply handle them gracefully.
tree_iterator was only working properly for a pathlist containing
file paths. In case of directory paths, it didn't match children
which contradicts GIT_DIFF_DISABLE_PATHSPEC_MATCH and
is different from index_iterator and fs_iterator.
As a consequence head-to-index status reporting for a specific
directory did not work properly -- all files have been reported
as added.
Include additional tests.
In the workdir iterator we do some tricky things to step down into
directories to look for things that are in our pathlist. Make sure
that we don't confuse between folders that we're definitely going to
return everything in and folders that we're only stepping down into
to keep looking for matches.
Ensure that we have hit the end of iteration; previously we tested
that we saw all the values that we expected to see. We did not
then ensure that we were at the end of the iteration (and that there
were subsequently values in the iteration that we did *not* expect.)
Refactored the tree iterator to never recurse; simply process the
next entry in order in `advance`. Additionally, reduce the number of
allocations and sorting as much as possible to provide a ~30% speedup
on case-sensitive iteration. (The gains for case-insensitive iteration
are less majestic.)
Disambiguate the reset and reset_range functions. Now reset_range
with a NULL path will clear the start or end; reset will leave the
existing start and end unchanged.
The callback mechanism makes it awkward to write data from an IO
source; move to `_fromstream()` which lets the caller remain in control,
in the same vein as we prefer iterators over foreach callbacks.
By returning when the count goes to zero rather than below it, setting
`howmany` to 7 in fact writes out the string 6 times.
Correct the termination condition to write out the string the amount of
times we specify.
The pair of `git_blob_create_frombuffer()` and
`git_blob_create_frombuffer_commit()` is meant to replace
`git_blob_create_fromchunks()` by providing a way for a user to write a
new blob when they want filtering or they do not know the size.
This approach allows the caller to retain control over when to add data
to this buffer and a more natural fit into higher-level language's own
stream abstractions instead of having to handle IO wait in the callback.
The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a
round multiple of usual page sizes and a value where most blobs seem
likely to be either going to be way below or way over that size. It's
also a round number of pages.
This implementation re-uses the helper we have from `_fromchunks()` so
we end up writing everything to disk, but hopefully more efficiently
than with a default filebuf. A later optimisation can be to avoid
writing the in-memory contents to disk, with some extra complexity.
This special-casing ignores that we might have a locked file, so the
hashtable does not represent the contents of the file we want to
write. This causes multivar writes to overwrite entries instead of add
to them when under lock.
There is no need for this as the normal code-path will write to the file
just fine, so simply get rid of it.
Since the `apply` callback can defer, the `check` callback is not
necessary. Removing the `check` callback further makes the `payload`
unnecessary along with the `cleanup` callback.
Ensure that setting the merge attribute forces the built-in default
`text` driver and does *not* honor the `merge.default` configuration
option. Further ensure that unsetting the merge attribute forces
a conflict (the `binary` driver).
Consumers can now register custom merged drivers with
`git_merge_driver_register`. This allows consumers to support the
merge drivers, as configured in `.gitattributes`. Consumers will be
asked to perform the file-level merge when a custom driver is
configured.
The function to extract signatures suffers from a similar bug to the
header field finding one by having an unecessary line feed check as a
break condition of its loop.
Fix that and add a test for this single-line signature situation.
This ensures that when using OpenSSL a safe default set of ciphers
is selected. This is done so that the client communicates securely
and we don't accidentally enable unsafe ciphers like RC4, or even
worse some old export ciphers.
Implements the first part of https://github.com/libgit2/libgit2/issues/3682
git_buf_clear does not free allocated memory associated with a
git_buf. Use `git_buf_free` instead to correctly free its memory
and plug the memory leak.
The old implementation had two issues:
1. OIDs that were too short as to be ambiguous were not being handled
properly.
2. If the last OID to expand in the array was missing from the ODB, we
would leak a `GIT_ENOTFOUND` error code from the function.