42 Commits

Author SHA1 Message Date
Carlos Martín Nieto
9950bb4e8d diff: rename the file's 'oid' to 'id'
In the same vein as the previous commits in this series.
2014-01-25 08:15:44 +01:00
Russell Belfer
3e57069e82 Fix --assume-unchanged support
This was never really working right because we were checking the
wrong flag and not checking it in all the places that we need to
be checking it.  I finally got around to writing a test and adding
actual support for it.
2013-11-01 13:49:43 -07:00
Russell Belfer
3ff1d12373 Rename diff objects and split patch.h
This makes no functional change to diff but renames a couple of
the objects and splits the new git_patch (formerly git_diff_patch)
into a new header file.
2013-10-11 14:51:54 -07:00
Ben Straub
5e96f31638 Merge pull request #1738 from libgit2/diff-patch-content-size
Add API for getting at git_diff_patch->content_size
2013-08-08 08:54:38 -07:00
Russell Belfer
effdbeb323 Make rename detection file size fix better
The previous fix for checking file sizes with rename detection
always loads the blob.  In this version, if the odb backend can
get the object header without loading the whole thing into memory,
then we'll just use that, so that we can eliminate possible rename
sources & targets without loading them.
2013-07-24 17:48:37 -07:00
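The size pre-check described in effdbeb323 can be sketched roughly as follows. This is a standalone illustration, not libgit2 code: the `ex_candidate` type, the function names, and the rule that the smaller-to-larger size ratio bounds the achievable similarity are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object whose size came from its header only. */
typedef struct {
    const char *path;
    size_t size;
} ex_candidate;

/*
 * Return 1 if two sizes are close enough that a rename is still possible
 * at the given similarity threshold (0-100), 0 otherwise.
 */
static int ex_size_ratio_ok(size_t a, size_t b, unsigned threshold)
{
    size_t lo = a < b ? a : b;
    size_t hi = a < b ? b : a;

    assert(threshold <= 100);
    if (hi == 0)
        return 1; /* two empty files trivially pass */

    /* assumed bound: similarity cannot exceed lo/hi; avoid division */
    return lo * 100 >= hi * threshold;
}

/* Count the targets that survive the size check for one source. */
static size_t ex_count_plausible(const ex_candidate *src,
                                 const ex_candidate *targets, size_t n,
                                 unsigned threshold)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++)
        if (ex_size_ratio_ok(src->size, targets[i].size, threshold))
            kept++;
    return kept;
}
```

Only the candidates that pass this check would need their contents loaded for a full comparison.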
Russell Belfer
197b8966db Add hunk/file headers to git_diff_patch_size
This allows git_diff_patch_size to account for hunk headers and
file headers in the returned size.  This required some refactoring
of the code that is used to print file headers so that it could be
invoked by the git_diff_patch_size API.

Also this increases the test coverage and fixes an off-by-one bug
in the size calculation when newline changes happen at the end of
the file.
2013-07-23 14:34:31 -07:00
Russell Belfer
2b672d5b64 Add git_pathspec_match_diff API
This adds an additional pathspec API that will match a pathspec
against a diff object.  This is convenient if you want to handle
renames (so you need the whole diff and can't use the pathspec
constraint built into the diff API) but still want to tell if the
diff had any files that matched the pathspec.

When the pathspec is matched against a diff, instead of keeping
a list of filenames that matched, the API keeps the list
of git_diff_deltas that matched, and they can be retrieved via a
new API, git_pathspec_match_list_diff_entry.

There are a couple of other minor API extensions here that were
mostly for the sake of convenience and to reduce dependencies
on knowing the internal data structure between files inside the
library.
2013-07-10 20:50:33 +02:00
Russell Belfer
a1683f28ce More tests and bug fixes for status with rename
This changes the behavior of the status RENAMED flags so that they
will be combined with the MODIFIED flags if appropriate.  If a file
is modified in the index and also renamed, then the status code
will have both the GIT_STATUS_INDEX_MODIFIED and INDEX_RENAMED bits
set.  If it is renamed but the OID has not changed, then just the
GIT_STATUS_INDEX_RENAMED bit will be set.  Similarly, the flags
GIT_STATUS_WT_MODIFIED and GIT_STATUS_WT_RENAMED can both be set
independently of one another.

This fixes a serious bug where the check for unmodified files that
was done at data load time could end up erasing the RENAMED state
of a file that was renamed with no changes.

Lastly, this contains a bunch of new tests for status with renames,
including tests where the only rename changes are case changes.
The expected results of these tests have to vary by whether the
platform uses a case sensitive filesystem or not, so the expected
data covers those platform differences separately.
2013-06-17 10:03:49 -07:00
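The flag combination described in a1683f28ce can be sketched as below. The `EX_` constants and the `ex_entry` type are hypothetical and chosen for illustration; they are not libgit2's actual values or structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit values; not the real GIT_STATUS_* constants. */
#define EX_STATUS_INDEX_MODIFIED (1u << 0)
#define EX_STATUS_INDEX_RENAMED  (1u << 1)

/* Minimal stand-in for one side of a HEAD-to-index comparison. */
typedef struct {
    const char *path;
    const char *oid; /* hex object id as a string */
} ex_entry;

/* Set RENAMED and MODIFIED independently of one another. */
static unsigned ex_index_status(const ex_entry *head, const ex_entry *index)
{
    unsigned status = 0;

    assert(head != NULL && index != NULL);
    if (strcmp(head->path, index->path) != 0)
        status |= EX_STATUS_INDEX_RENAMED;
    if (strcmp(head->oid, index->oid) != 0)
        status |= EX_STATUS_INDEX_MODIFIED;
    return status;
}
```

A pure rename yields only the RENAMED bit, while a rename with edits yields both bits, matching the behavior the commit message describes.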
Russell Belfer
351888cf3d Improve case handling in git_diff__paired_foreach
This commit reinstates some changes to git_diff__paired_foreach
that were discarded during the rebase (because the diff_output.c
file had gone away), and also adjusts the case insensitivity
logic slightly to hopefully deal with both mismatched icase
diffs and other case insensitivity scenarios.
2013-06-17 10:03:49 -07:00
Russell Belfer
114f5a6c41 Reorganize diff and add basic diff driver
This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization.  Hopefully this will make the diff code easier to
understand and to extend.

This adds a new `git_diff_driver` object that looks up diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided.  The full driver
spec is not implemented in this commit - this is focused on the
reorganization of the code and putting the driver hooks in place.

This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.
2013-06-10 10:10:39 -07:00
Russell Belfer
67db583dab More diff rename tests; better split swap handling
This adds a couple more tests of different rename scenarios.

Also, this fixes a problem with the case where you have two
"split" deltas and the left half of one matches the right half of
the other.  That case was already being handled, but in the wrong
order in a way that could result in bad output.  Also, if the swap
also happened to put the other two halves into the correct place
(i.e. two files exchanged places with each other), then the second
delta was left with the SPLIT flag set when it really should be
cleared.
2013-05-23 15:06:07 -07:00
Russell Belfer
a21cbb12db Significant rename detection rewrite
This flips rename detection around so that instead of creating a
forward mapping from deltas to possible rename targets, it
creates a reverse mapping, looking at possible targets and
trying to find a source that they could have been renamed or
copied from.  This is important because each output can only
have a single source, but a given source could map to multiple
outputs (in the form of COPIED records).

Additionally, this makes a couple of tweaks to the public rename
detection APIs, mostly renaming a couple of options that control
the behavior to make more sense and to be more like core Git.

I walked through the tests looking at the exact results and
updated the expectations based on what I saw.  The new code is
different from the old because it cannot give some nonsense
results (like A was renamed to both B and C) which were part of
the outputs previously.
2013-05-22 10:37:12 -07:00
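The reverse mapping described in a21cbb12db can be illustrated with a small standalone sketch. The score matrix, the threshold, and every name here are assumptions made for the example; the actual libgit2 implementation is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NO_SOURCE ((size_t)-1)

/*
 * For each target, pick the source with the highest similarity score at or
 * above `threshold`.  `score` is a row-major n_targets x n_sources matrix of
 * hypothetical similarity values (0-100).  best_src[t] receives the chosen
 * source index, or EX_NO_SOURCE if no source qualifies.
 */
static void ex_map_targets(const unsigned *score, size_t n_targets,
                           size_t n_sources, unsigned threshold,
                           size_t *best_src)
{
    size_t t, s;

    assert(score != NULL && best_src != NULL);
    for (t = 0; t < n_targets; t++) {
        unsigned best = 0;

        best_src[t] = EX_NO_SOURCE;
        for (s = 0; s < n_sources; s++) {
            unsigned v = score[t * n_sources + s];

            if (v >= threshold && v > best) {
                best = v;
                best_src[t] = s;
            }
        }
    }
}

/* Number of targets that chose source `s`; more than one implies copies. */
static size_t ex_uses_of_source(const size_t *best_src, size_t n_targets,
                                size_t s)
{
    size_t t, n = 0;

    for (t = 0; t < n_targets; t++)
        if (best_src[t] == s)
            n++;
    return n;
}
```

Because the loop is driven by targets, each target ends up with at most one source, while a single source may be chosen by several targets.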
Russell Belfer
e35e2684f6 Add GIT_DIFF_LINE_CONTEXT_EOFNL
This adds a new line origin constant for the special line that
is used when both files end without a newline.

In the course of writing the tests for this, I was having problems
with modifying a file but not having diff notice because it was
the same size and modified less than one second from the start of
the test, so I decided to start working on nanosecond timestamp
support.  This commit doesn't contain the nanosecond support, but
it contains the reorganization of maybe_modified and the hooks so
that if the nanosecond data were being read by stat() (or rather
being copied by git_index_entry__init_from_stat), then the nsec
would be taken into account.

This new stuff could probably use some more tests, although there
is some amount of it here.
2013-05-07 04:32:17 -07:00
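The timestamp problem mentioned in e35e2684f6 can be shown with a minimal sketch. The `ex_time` type and the `use_nsec` switch are hypothetical; they only illustrate why a seconds-only comparison can miss a quick edit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timestamp with an optional nanosecond component. */
typedef struct {
    int64_t seconds;
    uint32_t nanoseconds;
} ex_time;

/*
 * Return 1 if the two timestamps differ.  When use_nsec is 0 only whole
 * seconds are compared, so two writes within the same second look equal.
 */
static int ex_time_differs(const ex_time *a, const ex_time *b, int use_nsec)
{
    assert(a != NULL && b != NULL);
    if (a->seconds != b->seconds)
        return 1;
    return use_nsec && a->nanoseconds != b->nanoseconds;
}
```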
Edward Thomson
0462fba538 renames!
2013-04-30 16:01:11 -05:00
Russell Belfer
71a3d27ea6 Replace diff delta binary with flags
Previously the git_diff_delta recorded if the delta was binary.
This replaces that (with no net change in structure size) with
a full set of flags.  The flag values that were already in use
for individual git_diff_file objects are reused for the delta
flags, too (along with renaming those flags to make it clear that
they are used more generally).

This (a) makes things somewhat more consistent (because I was
using a -1 value in the "boolean" binary field to indicate unset,
whereas now I can just use the flags that are easier to understand),
and (b) will make it easier for me to add some additional flags to
the delta object in the future, such as marking the results of a
copy/rename detection or other deltas that might want a special
indicator.

While making this change, I officially moved some of the flags that
were internal only into the private diff header.

This also allowed me to remove a gross hack in rename/copy detect
code where I was overwriting the status field with an internal
value.
2013-02-20 15:10:21 -08:00
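The move from a tri-state "boolean" to flags in 71a3d27ea6 can be sketched as follows. The `EX_FLAG_` constants and the `ex_delta` type are illustrative assumptions, not the real libgit2 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; "unset" is simply neither bit being present. */
#define EX_FLAG_BINARY     (1u << 0)
#define EX_FLAG_NOT_BINARY (1u << 1)

typedef struct {
    unsigned flags;
} ex_delta;

/* Record the result of a binary check, replacing any earlier result. */
static void ex_delta_set_binary(ex_delta *d, int is_binary)
{
    assert(d != NULL);
    d->flags &= ~(EX_FLAG_BINARY | EX_FLAG_NOT_BINARY);
    d->flags |= is_binary ? EX_FLAG_BINARY : EX_FLAG_NOT_BINARY;
}

/* Return 1 for binary, 0 for text, -1 if not yet determined. */
static int ex_delta_binary_state(const ex_delta *d)
{
    assert(d != NULL);
    if (d->flags & EX_FLAG_BINARY)
        return 1;
    if (d->flags & EX_FLAG_NOT_BINARY)
        return 0;
    return -1;
}
```

The remaining bits of `flags` stay free for other markers, which is the extensibility the commit message points to.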
Edward Thomson
359fc2d241 update copyrights
2013-01-08 17:31:27 -06:00
Russell Belfer
9950d27ab6 Clean up iterator APIs
This removes the need to explicitly pass the repo into iterators
where the repo is implied by the other parameters.  This moves
the repo to be owned by the parent struct.  Also, this has some
iterator related updates to the internal diff API to lay the
groundwork for checkout improvements.
2012-12-10 15:38:28 -08:00
Ben Straub
c7231c45fe Deploy GITERR_CHECK_VERSION
2012-11-30 16:31:42 -08:00
Ben Straub
2f8d30becb Deploy GIT_DIFF_OPTIONS_INIT
2012-11-30 13:12:14 -08:00
Russell Belfer
0f3def715d Fix various cross-platform build issues
This fixes a number of warnings and problems with cross-platform
builds.  Among other things, it's not safe to name a member of a
structure "strcmp" because that may be #defined.
2012-11-09 13:52:07 -08:00
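The naming hazard noted in 0f3def715d comes from the C standard allowing library functions to also be provided as function-like macros. A minimal sketch of the safe pattern, with a hypothetical member name `strcomp` and a hypothetical `ex_comparators` type:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If a platform header provides `#define strcmp(a, b) ...`, a call through
 * a member such as `c->strcmp(a, b)` is rewritten by the preprocessor too.
 * A member name that no header defines, like `strcomp`, avoids the clash.
 */
typedef struct {
    int (*strcomp)(const char *, const char *);
} ex_comparators;

/* Return 1 if the two strings compare equal under the stored comparator. */
static int ex_equal(const ex_comparators *c, const char *a, const char *b)
{
    assert(c != NULL && c->strcomp != NULL);
    return c->strcomp(a, b) == 0;
}
```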
Russell Belfer
55cbd05b18 Some diff refactorings to help code reuse
There are some diff functions that are useful in a rewritten
checkout and this lays some groundwork for that.  This contains
three main things:

1. Share the function diff uses to calculate the OID for a file
   in the working directory (now named `git_diff__oid_for_file`).
2. Add a `git_diff__paired_foreach` function to iterate over
   two diff lists concurrently.  Convert status to use it.
3. Move all the string/prefix/index entry comparisons into
   function pointers inside the `git_diff_list` object so they
   can be switched between case sensitive and insensitive
   versions.  This makes them easier to reuse in various
   functions without replicating logic.  As part of this, move
   a couple of index functions out of diff.c and into index.c.
2012-11-09 13:52:07 -08:00
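Item 3 of 55cbd05b18 (comparison functions stored as function pointers) might look roughly like this. The `ex_diff_list` type and its members are hypothetical and far simpler than the real `git_diff_list`; the case-insensitive comparison is hand-written to keep the sketch portable.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Portable ASCII case-insensitive comparison. */
static int ex_strcasecmp(const char *a, const char *b)
{
    while (*a && tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}

/* Hypothetical diff list that carries its own comparison function. */
typedef struct {
    int ignore_case;
    int (*strcomp)(const char *, const char *);
} ex_diff_list;

/* Switch every later comparison between sensitive and insensitive. */
static void ex_diff_list_set_case(ex_diff_list *d, int ignore_case)
{
    assert(d != NULL);
    d->ignore_case = ignore_case;
    d->strcomp = ignore_case ? ex_strcasecmp : strcmp;
}
```

Callers then write `d->strcomp(a, b)` everywhere and never branch on the case setting themselves.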
Russell Belfer
db106d01f0 Move rename detection into new file
This improves the naming for the rename related functionality
moving it to be called `git_diff_find_similar()` and renaming
all the associated constants, etc. to make more sense.

I also moved the new code (plus the existing `git_diff_merge`)
into a new file `diff_tform.c` where I can put new functions
related to manipulating git diff lists.

This also updates the implementation significantly from the
last revision fixing some ordering issues (where break-rewrite
needs to be handled prior to copy and rename detection) and
improving config option handling.
2012-10-30 09:40:50 -07:00
Russell Belfer
b4f5bb0747 Initial implementation of diff rename detection
This implements the basis for diff rename and copy detection,
although it is based on simple SHA comparison right now instead
of using a matching algorithm.  Just as `git_diff_merge` can be
used as a post-pass on diffs to emulate certain command line
behaviors, there is a new API `git_diff_detect` which will
update a diff list in-place, adjusting some deltas to RENAMED
or COPIED state (and also, eventually, splitting MODIFIED deltas
where the change is too large into DELETED/ADDED pairs).

This also adds a new test repo that will hold rename/copy/split
scenarios.  Right now, it just has exact-match rename and copy,
but the tests are written to use tree diffs, so we should be able
to add new test scenarios easily without breaking tests.
2012-10-23 16:40:51 -07:00
Russell Belfer
52a61bb804 Fix minor bugs
Fixed no-submodule speedup of new checkout code.  Fixed missing
final update to progress (which may go away, I realize).  Fixed
unused structure in header and incorrect comment.
2012-10-17 14:10:23 -07:00
Russell Belfer
bae957b95d Add const to all shared pointers in diff API
There are a lot of places where the diff API gives the user access
to internal data structures and many of these were being exposed
through non-const pointers.  This replaces them all with const
pointers for any object that the user can access but is still
owned internally to the git_diff_list or git_diff_patch objects.

This will probably break some bindings...  Sorry!
2012-09-25 16:35:05 -07:00
Russell Belfer
6428630865 Fix bugs in new diff patch code
This fixes all the bugs in the new diff patch code.  The only
really interesting one is that when we merge two diffs, we now
have to actually exclude diff delta records that are not supposed
to be tracked, as opposed to before where they could be included
because they would be skipped silently by `git_diff_foreach()`.
Other than that, there are just minor errors.
2012-09-25 16:35:05 -07:00
Russell Belfer
5f69a31f7d Initial implementation of new diff patch API
Replacing the `git_iterator` object, this creates a simple API
for accessing the "patch" for any file pair in a diff list and
then gives indexed access to the hunks in the patch and the lines
in the hunk.  This is the initial implementation of this revised
API - it is still broken, but at least builds cleanly.
2012-09-25 16:35:05 -07:00
Russell Belfer
b36effa22e Replace git_diff_iterator_num_files with progress
The `git_diff_iterator_num_files` API was problematic, since we
don't actually know the exact number of files to be iterated over
until we load those files into memory.  This replaces it with a
new `git_diff_iterator_progress` API that goes from 0 to 1, and
moves and renames the old API for the internal places that can
tolerate a max value instead of an exact value.
2012-09-10 09:59:14 -07:00
Russell Belfer
60b9d3fcef Implement filters for status/diff blobs
This adds support to diff and status for running filters (a la crlf)
on blobs in the workdir before computing SHAs and before generating
text diffs.  This ended up being a bit more code change than I had
thought since I had to reorganize some of the diff logic to minimize
peak memory use when filtering blobs in a diff.

This also adds a cap on the maximum size of data that will be loaded
to diff.  I set it at 512MB which should match core git.  Right now
it is a #define in src/diff.h but it could be moved into the public
API if desired.
2012-09-06 15:34:02 -07:00
Russell Belfer
f335ecd6e1 Diff iterators
This refactors the diff output code so that an iterator object
can be used to traverse and generate the diffs, instead of just
the `foreach()` style with callbacks.  The code has been rearranged
so that the two styles can still share most functions.

This also replaces `GIT_REVWALKOVER` with `GIT_ITEROVER` and uses
that as a common error code for marking the end of iteration when
using an iterator style of object.
2012-09-05 15:17:24 -07:00
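The two traversal styles described in f335ecd6e1 can be sketched side by side. Everything below is hypothetical: `EX_ITEROVER` stands in for the end-of-iteration code, and `ex_iter` walks a plain array of paths rather than a real diff.

```c
#include <assert.h>
#include <stddef.h>

#define EX_ITEROVER (-31) /* hypothetical end-of-iteration code */

typedef struct {
    const char **paths;
    size_t count;
    size_t pos;
} ex_iter;

/* Iterator style: return 0 with *out set, or EX_ITEROVER at the end. */
static int ex_iter_next(const char **out, ex_iter *it)
{
    assert(out != NULL && it != NULL);
    if (it->pos >= it->count) {
        *out = NULL;
        return EX_ITEROVER;
    }
    *out = it->paths[it->pos++];
    return 0;
}

typedef int (*ex_file_cb)(const char *path, void *payload);

/* Callback style, built on the same iterator so both share one code path. */
static int ex_foreach(ex_iter *it, ex_file_cb cb, void *payload)
{
    const char *path;
    int error;

    while ((error = ex_iter_next(&path, it)) == 0) {
        if ((error = cb(path, payload)) != 0)
            return error; /* callback asked to stop */
    }
    return error == EX_ITEROVER ? 0 : error;
}

/* Example callback: count the visited files in a size_t payload. */
static int ex_count_cb(const char *path, void *payload)
{
    (void)path;
    ++*(size_t *)payload;
    return 0;
}
```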
Russell Belfer
0abd724454 Fix filemode comparison in diffs
File modes were neither being ignored properly on platforms
where they should be ignored, nor being diffed consistently on
platforms where they are supported.

This change adds a number of diff and status filemode change
tests.  This also makes sure that filemode-only changes are
included in the diff output when they occur and that filemode
changes are ignored successfully when core.filemode is false.

There is no code that automatically toggles core.filemode
based on the capabilities of the current platform, so the user
still needs to be careful in their .git/config file.
2012-06-08 12:09:10 -07:00
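The filemode rule from 0abd724454 can be summarized in a short sketch. The function name and the `filemode_enabled` parameter (standing in for the `core.filemode` setting) are assumptions; the octal values are the usual Git mode numbers.

```c
#include <assert.h>

#define EX_MODE_TYPE_MASK 0170000u /* file-type bits of a mode */

/*
 * Return 1 if a mode difference should be reported.  With filemode
 * disabled, permission bits are ignored and only a change of file type
 * (for example regular file to symlink) counts.
 */
static int ex_mode_changed(unsigned old_mode, unsigned new_mode,
                           int filemode_enabled)
{
    /* Git modes fit in 16 bits */
    assert((old_mode & ~0177777u) == 0 && (new_mode & ~0177777u) == 0);

    if (!filemode_enabled) {
        old_mode &= EX_MODE_TYPE_MASK;
        new_mode &= EX_MODE_TYPE_MASK;
    }
    return old_mode != new_mode;
}
```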
Russell Belfer
16b83019af Fix usage of "new" for fieldname in public header
This should restore the ability to include libgit2 headers
in C++ projects.

Cherry picked 2de60205df from
development into new-error-handling.
2012-05-02 15:34:58 -07:00
nulltoken
eb3d71a5bc diff: fix generation of the header of a removal patch
2012-04-25 15:37:17 -07:00
Russell Belfer
19fa2bc111 Convert attrs and diffs to use string pools
This converts the git attr related code (including ignores) and
the git diff related code (and implicitly the status code) to use
`git_pools` for storing strings.  This reduces the number of small
blocks allocated dramatically.
2012-04-25 10:42:37 -07:00
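The allocation saving described in 19fa2bc111 comes from carving many small strings out of a few large blocks. The sketch below is a deliberately simple pool written for illustration; the names, the fixed block size, and the refusal of oversized strings are assumptions and do not reflect how `git_pool` is actually implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EX_POOL_BLOCK 4096

typedef struct ex_block {
    struct ex_block *next;
    size_t used;
    char data[EX_POOL_BLOCK];
} ex_block;

typedef struct {
    ex_block *head;
    size_t block_count;
} ex_pool;

/* Copy `str` into the pool; one malloc serves many short strings. */
static char *ex_pool_strdup(ex_pool *pool, const char *str)
{
    size_t len;
    char *out;

    assert(pool != NULL && str != NULL);
    len = strlen(str) + 1;
    if (len > EX_POOL_BLOCK)
        return NULL; /* oversized strings are out of scope for this sketch */

    if (pool->head == NULL || pool->head->used + len > EX_POOL_BLOCK) {
        ex_block *b = malloc(sizeof(*b));

        if (b == NULL)
            return NULL;
        b->next = pool->head;
        b->used = 0;
        pool->head = b;
        pool->block_count++;
    }

    out = pool->head->data + pool->head->used;
    memcpy(out, str, len);
    pool->head->used += len;
    return out;
}

/* Release every block at once; individual strings are never freed. */
static void ex_pool_free(ex_pool *pool)
{
    assert(pool != NULL);
    while (pool->head != NULL) {
        ex_block *next = pool->head->next;

        free(pool->head);
        pool->head = next;
    }
    pool->block_count = 0;
}
```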
Russell Belfer
14a513e058 Add support for pathspec to diff and status
This adds preliminary support for pathspecs to diff and status.
The implementation is not very optimized (it still looks at
every single file and evaluates the pathspec match against
each one), but it works.
2012-04-13 15:00:29 -07:00
Russell Belfer
95dfb031f7 Improve config handling for diff,submodules,attrs
This adds support for a bunch of core.* settings that affect
diff and status, plus fixes up some incorrect implementations
of those settings from before.  Also, this cleans up the
handling of config settings in the new submodules code and
in the old attrs/ignore code.
2012-03-30 14:40:50 -07:00
Russell Belfer
1db12b0053 Eliminate hairy COITERATE macro
I decided that the COITERATE macro was, in the end, causing
more confusion than it would save and decided just to write
out the loops that I needed for parallel diff list iteration.
It is not that much code and this just feels less obfuscated.
2012-03-25 23:04:26 -07:00
Russell Belfer
a48ea31d69 Reimplement git_status_foreach using git diff
This is an initial reimplementation of status using diff a la
the way that core git does it.
2012-03-21 12:33:09 -07:00
Russell Belfer
74fa4bfae3 Update diff to use iterators
This is a major reorganization of the diff code.  This changes
the diff functions to use the iterators for traversing the
content.  This allowed a lot of code to be simplified.  Also,
this moved the functions relating to outputting a diff into a
new file (diff_output.c).

This includes a number of other changes - adding utility
functions, extending iterators, etc. plus more tests for the
diff code.  This also takes the example diff.c program much
further in terms of emulating git-diff command line options.
2012-03-02 15:49:29 -08:00
Russell Belfer
e47329b6d8 First pass of diff index to workdir implementation
This is an initial version of git_diff_workdir_to_index.  It
also includes renaming some structures and some refactoring
of the existing code so that it could be shared better with
the new function.

This is not complete since it needs a rebase to get some
new odb functions from the upstream branch.
2012-03-02 15:49:29 -08:00
Russell Belfer
a2e895be82 Continue implementation of git-diff
* Implemented git_diff_index_to_tree
* Reworked git_diff_options structure to handle more options
* Made most of the options in git_diff_options actually work
* Reorganized code a bit to remove some redundancy
* Added option parsing to examples/diff.c to test most options
2012-03-02 15:49:29 -08:00
Russell Belfer
65b09b1ded Implement diff lists and formatters
This reworks the diff API to separate the steps of producing
a diff description from formatting the diff.  This will allow
us to share diff output code with the various diff creation
scenarios and will allow us to implement rename detection as
an optional pass that can be run on a diff list.
2012-03-02 15:49:28 -08:00