When diff encounters an untracked directory, there was a shortcut
that it took which is not compatible with core git. This makes
the default behavior no longer take that shortcut and instead look
inside the untracked directory to see if there are any untracked
files within it. If there are not, then the directory is treated
as an ignore directory instead of an untracked directory. This
has implications for the git_status APIs.
This adds tests for diffs with submodules in them and (perhaps
unsurprisingly) requires further fixes to be made. Specifically,
this fixes:
- when considering if a submodule is dirty in the workdir, it was
being treated as dirty even if only the index was dirty.
- git_diff_patch_to_str (and git_diff_patch_print) were "printing"
the headers for files (and submodules) that were unmodified or
had no meaningful content.
- added comment to previous fix and removed unneeded parens.
This started out trying to look at the problems from issue #1425
and gradually grew to a broader set of fixes. There are two core
things fixed here:
1. When you had an ignore like "/bin" which is rooted at the top
of your tree, instead of immediately adding the "bin/" entry
as an ignored item in the diff, we were returning all of the
direct descendants of the directory as ignored items. This
changes things to immediately ignore the directory. Note that
this effects the behavior in test_status_ignore__subdirectories
so that we no longer exactly match core gits ignore behavior,
but the new behavior probably makes more sense (i.e. we now
will include an ignored directory inside an untracked directory
that we previously would have left off).
2. When a submodule only contained working directory changes, the
diff code was always considering it unmodified which was just
an outright bug. The HEAD SHA of the submodule matches the SHA
in the parent repo index, and since the SHAs matches, the diff
code was overwriting the actual status with UNMODIFIED.
These fixes broke existing tests test_diff_workdir__submodules and
test_status_ignore__subdirectories but looking it over, I actually
think the new results are correct and the old results were wrong.
@nulltoken had actually commented on the subdirectory ignore issue
previously.
I also included in the tests some debugging versions of the
shared iteration callback routines that print status or diff
information. These aren't used actively in the tests, but can be
quickly swapped in to test code to give a better picture of what
is being scanned in some of the complex test scenarios.
This option has been sitting unimplemented for a while, so I
finally went through and implemented it along with some tests.
As part of this, I improved the implementation of
GIT_DIFF_IGNORE_SUBMODULES so it be more diligent about avoiding
extra work and about leaving off delta records for submodules to
the greatest extent possible (though it may include them still
if you are request TYPECHANGE records).
This implements working versions of GIT_DIFF_RECURSE_IGNORED_DIRS
and GIT_STATUS_OPT_RECURSE_IGNORED_DIRS along with some tests for
the newly available behaviors. This is not turned on by default
for status, but can be accessed via the options to the extended
version of the command.
This fixes some places where the new tests were leaving the test
area in a bad state or were freeing data they should not free.
It also removes code that is extraneous to the core issue and
fixes an invalid SHA being looked up in one of the tests (which
was failing, but for the wrong reason).
This adds a helper function for the cases where you want to
quickly set a single boolean config value for a repository.
This allowed me to remove a lot of code.
The iterator APIs are not currently consistent with the parameter
ordering of the rest of the codebase. This rearranges the order
of parameters, simplifies the naming of a number of functions, and
makes somewhat better use of macros internally to clean up the
iterator code.
This also expands the test coverage of iterator functionality,
making sure that case sensitive range-limited iteration works
correctly.
In preparation for further iterator changes, this cleans up a few
small things in the iterator API:
* removed the git_iterator_for_repo_index_range API
* made git_iterator_free not be inlined
* minor param name and test function name tweaks
* Rework GIT_DIRREMOVAL values to GIT_RMDIR flags, allowing
combinations of flags
* Add GIT_RMDIR_EMPTY_PARENTS flag to remove parent dirs that
are left empty after removal
* Add GIT_MKDIR_VERIFY_DIR to give an error if item is a file,
not a dir (previously an EEXISTS error was ignored, even for
files) and enable this flag for git_futils_mkpath2file call
* Improve accuracy of error messages from git_futils_mkdir
git_index_read_tree() was exposing a parameter to provide the user with
a progress indicator. Unfortunately, due to the recursive nature of the
tree walk, the maximum number of items to process was unknown. Thus,
the indicator was only counting processed entries, without providing
any information how the number of remaining items.
To answer if a single given file should be ignored, the path to
that file has to be processed progressively checking that there
are no intermediate ignored directories in getting to the file
in question. This enables that, fixing the broken old behavior,
and adds tests to exercise various ignore situations.
This started as a complex new test for checkout going through the
"typechanges" test repository, but that revealed numerous issues
with checkout, including:
* complete failure with submodules
* failure to create blobs with exec bits
* problems when replacing a tree with a blob because the tree
"example/" sorts after the blob "example" so the delete was
being processed after the single file blob was created
This fixes most of those problems and includes a number of other
minor changes that made it easier to do that, including improving
the TYPECHANGE support in diff/status, etc.
This adds support to diff and status for running filters (a la crlf)
on blobs in the workdir before computing SHAs and before generating
text diffs. This ended up being a bit more code change than I had
thought since I had to reorganize some of the diff logic to minimize
peak memory use when filtering blobs in a diff.
This also adds a cap on the maximum size of data that will be loaded
to diff. I set it at 512Mb which should match core git. Right now
it is a #define in src/diff.h but it could be moved into the public
API if desired.
This is a big redesign of the git_submodule_status API and the
implementation of the redesigned API. It also fixes a number of
bugs that I found in other parts of the submodule API while
writing the tests for the status part.
This also fixes a couple of bugs in the iterators that had not
been noticed before - one with iterating when there is a gitlink
(i.e. separate-work-dir) and one where I was treating anything
even vaguely submodule-like as a submodule, more aggressively
than core git does.