This restores a behavior that was accidentally lost during some
diff refactoring where an untracked directory that contains a .git
item should be treated as IGNORED, not as UNTRACKED. The submodule
code already detects this, but the diff code was not handling the
scenario right.
This also updates a number of existing tests that were actually
exercising the behavior but did not have the right expectations in
place. It actually makes the new
`test_diff_submodules__diff_ignore_options` test feel much better
because the "not-a-submodule" entries are now ignored instead of
showing up as untracked items.
Fixes#1697
Add test for bug fixed in 852ded9698
Sorry, I wrote that bug fix and forgot to check in a test at the
same time. Here is one that fails on the old version of the code
and now works.
When the last item in a diff was an untracked directory that only
contained ignored items, the loop to scan the contents would run
off the end of the iterator and dereference a NULL pointer. This
includes a test that reproduces the problem and a fix.
This started out trying to look at the problems from issue #1425
and gradually grew to a broader set of fixes. There are two core
things fixed here:
1. When you had an ignore like "/bin" which is rooted at the top
of your tree, instead of immediately adding the "bin/" entry
as an ignored item in the diff, we were returning all of the
direct descendants of the directory as ignored items. This
changes things to immediately ignore the directory. Note that
this effects the behavior in test_status_ignore__subdirectories
so that we no longer exactly match core gits ignore behavior,
but the new behavior probably makes more sense (i.e. we now
will include an ignored directory inside an untracked directory
that we previously would have left off).
2. When a submodule only contained working directory changes, the
diff code was always considering it unmodified which was just
an outright bug. The HEAD SHA of the submodule matches the SHA
in the parent repo index, and since the SHAs matches, the diff
code was overwriting the actual status with UNMODIFIED.
These fixes broke existing tests test_diff_workdir__submodules and
test_status_ignore__subdirectories but looking it over, I actually
think the new results are correct and the old results were wrong.
@nulltoken had actually commented on the subdirectory ignore issue
previously.
I also included in the tests some debugging versions of the
shared iteration callback routines that print status or diff
information. These aren't used actively in the tests, but can be
quickly swapped in to test code to give a better picture of what
is being scanned in some of the complex test scenarios.
This fixes some places where the new tests were leaving the test
area in a bad state or were freeing data they should not free.
It also removes code that is extraneous to the core issue and
fixes an invalid SHA being looked up in one of the tests (which
was failing, but for the wrong reason).
This adds a helper function for the cases where you want to
quickly set a single boolean config value for a repository.
This allowed me to remove a lot of code.
1. Fix sort order problem with submodules where "mod" was sorting
after "mod-plus" because they were being sorted as "mod/" and
"mod-plus/". This involved pushing the "contains a .git entry"
test significantly lower in the stack.
2. Reinstate behavior that a directory which contains a .git entry
will be treated as a submodule during iteration even if it is
not yet added to the .gitmodules.
3. Now that any directory containing .git is reported as submodule,
we have to be more careful checking for GIT_EEXISTS when we
do a submodule lookup, because that is the error code that is
returned by git_submodule_lookup when you try to look up a
directory containing .git that has no record in gitmodules or
the index.
The callback will be called for each file, just before the `git_delta_t` gets inserted into the diff list.
When the callback:
- returns < 0, the diff process will be aborted
- returns > 0, the delta will not be inserted into the diff list, but the diff process continues
- returns 0, the delta is inserted into the diff list, and the diff process continues
It is not legal inside our `p_mmap` function to mmap a zero length
file. This adds a test that exercises that case inside diff and
fixes the code path where we would try to do that.
The fix turns out not to be a lot of code since our default file
content is already initialized to "" which works in this case.
Fixes#1210
The diff constructor functions had some confusing names, where the
"old" side of the diff was coming after the "new" side. This
reverses the order in the function name to make it less confusing.
Specifically...
* git_diff_index_to_tree becomes git_diff_tree_to_index
* git_diff_workdir_to_index becomes git_diff_index_to_workdir
* git_diff_workdir_to_tree becomes git_diff_tree_to_workdir
This makes the diff functions that take callbacks both take
the payload parameter after the callback function pointers and
pass the payload as the last argument to the callback function
instead of the first. This should make them consistent with
other callbacks across the API.
A number of diff APIs and the `git_checkout_index` API take a
`git_repository` object an operate on the index. This updates
them to take a `git_index` pointer explicitly and only fall back
on the `git_repository` index if the index input is NULL. This
makes it easier to operate on a temporary index.
This implements the basis for diff rename and copy detection,
although it is based on simple SHA comparison right now instead
of using a matching algortihm. Just as `git_diff_merge` can be
used as a post-pass on diffs to emulate certain command line
behaviors, there is a new API `git_diff_detect` which will
update a diff list in-place, adjusting some deltas to RENAMED
or COPIED state (and also, eventually, splitting MODIFIED deltas
where the change is too large into DELETED/ADDED pairs).
This also adds a new test repo that will hold rename/copy/split
scenarios. Right now, it just has exact-match rename and copy,
but the tests are written to use tree diffs, so we should be able
to add new test scenarios easily without breaking tests.
The adds a test for the submodule diff capabilities and then
fixes a few bugs with how the output is generated. It improves
the accuracy of OIDs in the diff delta object and makes the
submodule output more closely mirror the OIDs that will be used
by core git.
There are a lot of places where the diff API gives the user access
to internal data structures and many of these were being exposed
through non-const pointers. This replaces them all with const
pointers for any object that the user can access but is still
owned internally to the git_diff_list or git_diff_patch objects.
This will probably break some bindings... Sorry!
Replacing the `git_iterator` object, this creates a simple API
for accessing the "patch" for any file pair in a diff list and
then gives indexed access to the hunks in the patch and the lines
in the hunk. This is the initial implementation of this revised
API - it is still broken, but at least builds cleanly.
There is a bug in building the linked list of line records in the
diff iterator and also an off by one element error in the hunk
counts. This fixes both of these, adds some test data with more
complex sets of hunk and line diffs to exercise this code better.
This refactors the diff output code so that an iterator object
can be used to traverse and generate the diffs, instead of just
the `foreach()` style with callbacks. The code has been rearranged
so that the two styles can still share most functions.
This also replaces `GIT_REVWALKOVER` with `GIT_ITEROVER` and uses
that as a common error code for marking the end of iteration when
using a iterator style of object.
There are three actual changes in this commit:
1. When the trailing newline of a file is removed in a diff, the
change will now be reported with `GIT_DIFF_LINE_DEL_EOFNL` passed
to the callback. Previously, the `ADD_EOFNL` constant was given
which was just an error in my understanding of when the various
circumstances arose. `GIT_DIFF_LINE_ADD_EOFNL` is deprecated and
should never be generated. A new newline is simply an `ADD`.
2. Rewrote the `diff_delta__merge_like_cgit` function that contains
the core logic of the `git_diff_merge` implementation. The new
version doesn't actually have significantly different behavior,
but the logic should be much more obvious, I think.
3. Fixed a bug in `git_diff_merge` where it freed a string pool
while some of the string data was still in use. This led to
`git_diff_print_patch` accessing memory that had been freed.
The rest of this commit contains improved documentation in `diff.h`
to make the behavior and the equivalencies with core git clearer,
and a bunch of new tests to cover the various cases, oh and a minor
simplification of `examples/diff.c`.
File modes were both not being ignored properly on platforms
where they should be ignored, nor be diffed consistently on
platforms where they are supported.
This change adds a number of diff and status filemode change
tests. This also makes sure that filemode-only changes are
included in the diff output when they occur and that filemode
changes are ignored successfully when core.filemode is false.
There is no code that automatically toggles core.filemode
based on the capabilities of the current platform, so the user
still needs to be careful in their .git/config file.
git_status_file would always return GIT_ENOTFOUND for these files.
The underlying bug was that git__strcmp_cb, which is used by
git_path_with_stat_cmp to sort entries in the working directory,
compares strings based on unsigned chars (this is confirmed by the
strcmp(3) manpage), while git__prefixcmp, which is used by
workdir_iterator__entry_cmp to search for a path in the working
directory, compares strings based on char. So the sort puts this path at
the end of the list, while the search expects it to be at the beginning.
The fix was simply to make git__prefixcmp compare using unsigned chars,
just like strcmp(3). The rest of the change is just adding/updating
tests.
This updates to the latest clar which includes the helpers
`cl_assert_equal_s` and `cl_assert_equal_i`. Convert the code
over to use those and remove the old libgit2-only helpers.
This adds preliminary support for pathspecs to diff and status.
The implementation is not very optimized (it still looks at
every single file and evaluated the the pathspec match against
them), but it works.
It turns out that commit 31e9cfc4cbcaf1b38cdd3dbe3282a8f57e5366a5
did not fix the GIT_USUSED behavior on all platforms. This commit
walks through and really cleans things up more thoroughly, getting
rid of the unnecessary stuff.
To remove the use of some GIT_UNUSED, I ended up adding a couple
of new iterators for hashtables that allow you to iterator just
over keys or just over values.
In making this change, I found a bug in the clar tests (where we
were doing *count++ but meant to do (*count)++ to increment the
value). I fixed that but then found the test failing because it
was not really using an empty repo. So, I took some of the code
that I wrote for iterator testing and moved it to clar_helpers.c,
then made use of that to make it easier to open fixtures on a
per test basis even within a single test file.
This is a major reorganization of the diff code. This changes
the diff functions to use the iterators for traversing the
content. This allowed a lot of code to be simplified. Also,
this moved the functions relating to outputting a diff into a
new file (diff_output.c).
This includes a number of other changes - adding utility
functions, extending iterators, etc. plus more tests for the
diff code. This also takes the example diff.c program much
further in terms of emulating git-diff command line options.