With the new code to make tree iterators support ignore_case,
there is a bug in setting the start entry for range bounded
iterators where memcmp was being used instead of strncasecmp.
This fixes that and expands the tree iterator test to cover
the cases that were broken.
This makes tree iterators directly support case insensitivity by
using a secondary index that can be sorted by icase. Also, this
fixes the ambiguity check in the git_status_file API to also be
case insensitive. Lastly, this adds new test cases for case
insensitive range boundary checking for all types of iterators.
With this change, it should be possible to deprecate the spool
and sort iterator, but I haven't done that yet.
This adds a test that confirms that the working directory iterator
can actually correctly process ranges of files case insensitively
with proper sorting and proper boundaries.
This changes the iterator API so that flags can be passed in to
the constructor functions to control the ignore_case behavior.
At this point, the flags are not supported on tree iterators (i.e.
there is no functional change over the old API), but the API
changes are all made to accomodate this.
By the way, I went with a flags parameter because in the future
I have a couple of other ideas for iterator flags that will make
it easier to fix some diff/status/checkout bugs.
In preparation for further iterator changes, this cleans up a few
small things in the iterator API:
* removed the git_iterator_for_repo_index_range API
* made git_iterator_free not be inlined
* minor param name and test function name tweaks
This was just wrong. Added a test that verifying patch line
numbers even for hunks further into a file and then fixed the
algorithm. I needed to add a little extra state into the patch
so that I could track old and new file numbers independently,
but it should be okay.
It is not legal inside our `p_mmap` function to mmap a zero length
file. This adds a test that exercises that case inside diff and
fixes the code path where we would try to do that.
The fix turns out not to be a lot of code since our default file
content is already initialized to "" which works in this case.
Fixes#1210
This moves the implementation of these two APIs into common code
that will be shared between the two. Also, this adds tests for
the `git_diff_blob_to_buffer` API. Lastly, this adds some extra
`const` to a few places that can use it.
The diff constructor functions had some confusing names, where the
"old" side of the diff was coming after the "new" side. This
reverses the order in the function name to make it less confusing.
Specifically...
* git_diff_index_to_tree becomes git_diff_tree_to_index
* git_diff_workdir_to_index becomes git_diff_index_to_workdir
* git_diff_workdir_to_tree becomes git_diff_tree_to_workdir
The `git_iterator_reset` command has not been working in all cases
particularly when there is a start and end range. This fixes it
and adds tests for it, and also extends it with the ability to
update the start/end range strings when an iterator is reset.
This removes the need to explicitly pass the repo into iterators
where the repo is implied by the other parameters. This moves
the repo to be owned by the parent struct. Also, this has some
iterator related updates to the internal diff API to lay the
groundwork for checkout improvements.
This makes the diff functions that take callbacks both take
the payload parameter after the callback function pointers and
pass the payload as the last argument to the callback function
instead of the first. This should make them consistent with
other callbacks across the API.
Without this change, any failed assertion in the second (or a later) test
inside a test suite has a chance of double deleting memory, resulting in
a heap corruption. See #1096 for details.
This leaves alone the test cases where we "just" use cl_git_sandbox_init()
and cl_git_sandbox_cleanup(). These methods already take good care to not
double delete a repository.
Fixes#1096
The workdir iterator has always tried to ignore .git files, but
it turns out there were some bugs. This makes it more robust at
ignoring .git files.
This also makes iterators always check ".git" case insensitively
regardless of the properties of the system. This will make libgit2
skip ".GIT" and the like. This is different from core git, but on
systems with case insensitive but case preserving file systems,
allowing ".GIT" to be added is problematic.
A number of diff APIs and the `git_checkout_index` API take a
`git_repository` object an operate on the index. This updates
them to take a `git_index` pointer explicitly and only fall back
on the `git_repository` index if the index input is NULL. This
makes it easier to operate on a temporary index.
The index iterator could previously only be created from a repo
object, but this allows creating an iterator from a `git_index`
object instead (while keeping, though renaming, the old function).
This improves the naming for the rename related functionality
moving it to be called `git_diff_find_similar()` and renaming
all the associated constants, etc. to make more sense.
I also moved the new code (plus the existing `git_diff_merge`)
into a new file `diff_tform.c` where I can put new functions
related to manipulating git diff lists.
This also updates the implementation significantly from the
last revision fixing some ordering issues (where break-rewrite
needs to be handled prior to copy and rename detection) and
improving config option handling.
This implements the basis for diff rename and copy detection,
although it is based on simple SHA comparison right now instead
of using a matching algortihm. Just as `git_diff_merge` can be
used as a post-pass on diffs to emulate certain command line
behaviors, there is a new API `git_diff_detect` which will
update a diff list in-place, adjusting some deltas to RENAMED
or COPIED state (and also, eventually, splitting MODIFIED deltas
where the change is too large into DELETED/ADDED pairs).
This also adds a new test repo that will hold rename/copy/split
scenarios. Right now, it just has exact-match rename and copy,
but the tests are written to use tree diffs, so we should be able
to add new test scenarios easily without breaking tests.
To answer if a single given file should be ignored, the path to
that file has to be processed progressively checking that there
are no intermediate ignored directories in getting to the file
in question. This enables that, fixing the broken old behavior,
and adds tests to exercise various ignore situations.
This started as a complex new test for checkout going through the
"typechanges" test repository, but that revealed numerous issues
with checkout, including:
* complete failure with submodules
* failure to create blobs with exec bits
* problems when replacing a tree with a blob because the tree
"example/" sorts after the blob "example" so the delete was
being processed after the single file blob was created
This fixes most of those problems and includes a number of other
minor changes that made it easier to do that, including improving
the TYPECHANGE support in diff/status, etc.
The adds a test for the submodule diff capabilities and then
fixes a few bugs with how the output is generated. It improves
the accuracy of OIDs in the diff delta object and makes the
submodule output more closely mirror the OIDs that will be used
by core git.
There are a lot of places where the diff API gives the user access
to internal data structures and many of these were being exposed
through non-const pointers. This replaces them all with const
pointers for any object that the user can access but is still
owned internally to the git_diff_list or git_diff_patch objects.
This will probably break some bindings... Sorry!
This fixes all the bugs in the new diff patch code. The only
really interesting one is that when we merge two diffs, we now
have to actually exclude diff delta records that are not supposed
to be tracked, as opposed to before where they could be included
because they would be skipped silently by `git_diff_foreach()`.
Other than that, there are just minor errors.
Replacing the `git_iterator` object, this creates a simple API
for accessing the "patch" for any file pair in a diff list and
then gives indexed access to the hunks in the patch and the lines
in the hunk. This is the initial implementation of this revised
API - it is still broken, but at least builds cleanly.
There is a bug in building the linked list of line records in the
diff iterator and also an off by one element error in the hunk
counts. This fixes both of these, adds some test data with more
complex sets of hunk and line diffs to exercise this code better.
In the process of adding tests for the max file size threshold
(which treats files over a certain size as binary) there seem to
be a number of problems in the new code with detecting binaries.
This should fix those up, as well as add a test for the file
size threshold stuff.
Also, this un-deprecates `GIT_DIFF_LINE_ADD_EOFNL`, since I
finally found a legitimate situation where it would be returned.
The `git_diff_iterator_num_files` API was problematic, since we
don't actually know the exact number of files to be iterated over
until we load those files into memory. This replaces it with a
new `git_diff_iterator_progress` API that goes from 0 to 1, and
moves and renamed the old API for the internal places that can
tolerate a max value instead of an exact value.
This refactors the diff output code so that an iterator object
can be used to traverse and generate the diffs, instead of just
the `foreach()` style with callbacks. The code has been rearranged
so that the two styles can still share most functions.
This also replaces `GIT_REVWALKOVER` with `GIT_ITEROVER` and uses
that as a common error code for marking the end of iteration when
using a iterator style of object.
In looking at PR #878, I found a few small bugs in the diff code,
mostly related to work that can be avoided when processing tree-
to-tree diffs that was always being carried out. This commit has
some small fixes in it.
This updates all the `foreach()` type functions across the library
that take callbacks from the user to have a consistent behavior.
The rules are:
* A callback terminates the loop by returning any non-zero value
* Once the callback returns non-zero, it will not be called again
(i.e. the loop stops all iteration regardless of state)
* If the callback returns non-zero, the parent fn returns GIT_EUSER
* Although the parent returns GIT_EUSER, no error will be set in
the library and `giterr_last()` will return NULL if called.
This commit makes those changes across the library and adds tests
for most of the iteration APIs to make sure that they follow the
above rules.