70 Commits

Author SHA1 Message Date
Ben Straub
2f8d30becb Deploy GIT_DIFF_OPTIONS_INIT 2012-11-30 13:12:14 -08:00
Russell Belfer
7bf87ab698 Consolidate text buffer functions
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc).  This groups all those functions together into a
new file and converts the code to use that.

This has two enhancements to existing functionality.  The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).
2012-11-28 09:58:48 -08:00
Russell Belfer
a8122b5d4a Fix warnings on Win64 build 2012-11-27 13:18:29 -08:00
Russell Belfer
9cd423583f API updates for submodule.h 2012-11-27 13:18:28 -08:00
Russell Belfer
793c438559 Update diff callback param order
This makes the diff functions that take callbacks both take
the payload parameter after the callback function pointers and
pass the payload as the last argument to the callback function
instead of the first.  This should make them consistent with
other callbacks across the API.
2012-11-27 13:18:28 -08:00
Vicent Marti
cfbe4be3fb More external API cleanup
Conflicts:
	src/branch.c
	tests-clar/refs/branches/create.c
2012-11-27 13:18:27 -08:00
Russell Belfer
0f3def715d Fix various cross-platform build issues
This fixes a number of warnings and problems with cross-platform
builds.  Among other things, it's not safe to name a member of a
structure "strcmp" because that may be #defined.
2012-11-09 13:52:07 -08:00
Russell Belfer
55cbd05b18 Some diff refactorings to help code reuse
There are some diff functions that are useful in a rewritten
checkout and this lays some groundwork for that.  This contains
three main things:

1. Share the function diff uses to calculate the OID for a file
   in the working directory (now named `git_diff__oid_for_file`).
2. Add a `git_diff__paired_foreach` function to iterate over
   two diff lists concurrently.  Convert status to use it.
3. Move all the string/prefix/index entry comparisons into
   function pointers inside the `git_diff_list` object so they
   can be switched between case sensitive and insensitive
   versions.  This makes them easier to reuse in various
   functions without replicating logic.  As part of this, move
   a couple of index functions out of diff.c and into index.c.
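
A minimal sketch of the idea in item 3, not the library's actual code: the comparison lives in a function pointer on the list object and is selected once, so callers never branch on case sensitivity themselves. The names `demo_diff_list`, `demo_strcasecmp`, and `demo_diff_list_set_icase` are hypothetical. The member is called `strcomp` rather than `strcmp` because some C libraries define `strcmp` as a macro.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the comparison hooks kept on a diff list. */
typedef struct {
    /* Not named "strcmp": that identifier may be #defined as a macro. */
    int (*strcomp)(const char *a, const char *b);
} demo_diff_list;

/* Portable ASCII case-insensitive comparison (illustrative helper). */
static int demo_strcasecmp(const char *a, const char *b)
{
    while (*a && tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}

/* Select the comparison once; callers then go through the pointer. */
static void demo_diff_list_set_icase(demo_diff_list *list, int ignore_case)
{
    assert(list != NULL);
    list->strcomp = ignore_case ? demo_strcasecmp : strcmp;
}
```

Code that sorts or matches paths would then call `list->strcomp(a, b)` and pick up whichever variant was configured.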
2012-11-09 13:52:07 -08:00
Russell Belfer
cb7180a6e2 Add git_diff_patch_print
This adds a `git_diff_patch_print()` API which is more like the
existing API to "print" a patch from an entire `git_diff_list`
but operates on a single `git_diff_patch` object.

Also, it rewrites the `git_diff_patch_to_str()` API to use that
function (making it very small).
2012-10-25 11:48:39 -07:00
Russell Belfer
3943dc78a5 Check errors while generating diff patch string 2012-10-25 11:12:56 -07:00
Russell Belfer
93cf7bb8e2 Add git_diff_patch_to_str API
This adds an API to generate a complete single-file patch text
from a git_diff_patch object.
2012-10-24 20:56:32 -07:00
Russell Belfer
0d64bef941 Add complex checkout test and then fix checkout
This started as a complex new test for checkout going through the
"typechanges" test repository, but that revealed numerous issues
with checkout, including:

* complete failure with submodules
* failure to create blobs with exec bits
* problems when replacing a tree with a blob because the tree
  "example/" sorts after the blob "example" so the delete was
  being processed after the single file blob was created

This fixes most of those problems and includes a number of other
minor changes that made it easier to do that, including improving
the TYPECHANGE support in diff/status, etc.
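
The ordering problem in the third bullet can be illustrated with a small sketch. It assumes the usual git rule that a tree entry's name compares as if it ended in `/`; the function name `demo_entry_cmp` is hypothetical and this is not the checkout code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two entry names, treating a tree (directory) name as if it
 * had a trailing '/'. Returns <0, 0, or >0 like strcmp. */
static int demo_entry_cmp(const char *a, int a_is_tree,
                          const char *b, int b_is_tree)
{
    size_t alen, blen, len;
    unsigned char ca, cb;
    int cmp;

    assert(a != NULL && b != NULL);
    alen = strlen(a);
    blen = strlen(b);
    len = alen < blen ? alen : blen;

    /* Compare the shared prefix first. */
    cmp = memcmp(a, b, len);
    if (cmp != 0)
        return cmp;

    /* At the end of a name, a tree contributes '/', a blob contributes NUL. */
    ca = alen > len ? (unsigned char)a[len]
                    : (unsigned char)(a_is_tree ? '/' : '\0');
    cb = blen > len ? (unsigned char)b[len]
                    : (unsigned char)(b_is_tree ? '/' : '\0');
    return (int)ca - (int)cb;
}
```

Under this rule the blob `example` sorts before the tree `example/`, so a pass that walks entries in order would reach the blob creation before the tree deletion unless deletes are handled separately.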
2012-10-09 11:59:34 -07:00
Russell Belfer
5d1308f25f Add test for diffs with submodules and bug fixes
This adds a test for the submodule diff capabilities and then
fixes a few bugs with how the output is generated.  It improves
the accuracy of OIDs in the diff delta object and makes the
submodule output more closely mirror the OIDs that will be used
by core git.
2012-10-08 15:22:40 -07:00
Russell Belfer
dfbff793b8 Fix a few diff bugs with directory content
There are a few cases where diff should leave directories in
the diff list if we want to match core git, such as when the
directory contains a .git dir.  That feature was lost when I
introduced some of the new submodule handling.

This restores that and then fixes a couple of bugs related to diff
output that are triggered by having diffs with directories in
them.

Also, this adds a new flag that can be passed to diff if you
want diff output to actually include the file content of any
untracked files.
2012-10-08 15:22:40 -07:00
Sascha Cunz
1686641f18 Extract submodule logic out of diff_output.c:get_workdir_content 2012-10-05 13:03:38 +02:00
Sascha Cunz
7e57d2506a Diff: teach get_workdir_content to show a submodule as text
1. teach diff.c:maybe_modified to query git_submodule_status for the
   modification state of a submodule. According to the
   git_submodule_status docs, it will filter for to-ignore states
   already.

2. teach diff_output.c:get_workdir_content to check the submodule status
   again and create a line like:

      Subproject commit <SHA-1>\n
   or
      Subproject commit <SHA-1>-dirty\n

   like git.git does.
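
As a rough illustration of the two line formats above, here is a sketch of a formatter. The function name `demo_submodule_text` and its signature are hypothetical; only the output text follows the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the one-line text shown for a submodule into `out`.
 * `sha_hex` is the commit id in hex; `dirty` is nonzero when the
 * submodule has local modifications.
 * Returns 0 on success, -1 if `out` is too small. */
static int demo_submodule_text(char *out, size_t outlen,
                               const char *sha_hex, int dirty)
{
    const char *suffix = dirty ? "-dirty" : "";
    size_t need;

    assert(out != NULL && sha_hex != NULL);

    /* prefix + id + suffix + newline + terminating NUL */
    need = strlen("Subproject commit ") + strlen(sha_hex) + strlen(suffix) + 2;
    if (need > outlen)
        return -1;

    snprintf(out, outlen, "Subproject commit %s%s\n", sha_hex, suffix);
    return 0;
}
```

The diff machinery can then treat this short text as the "content" of the submodule entry on each side.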
2012-10-05 13:03:38 +02:00
Sascha Cunz
9ce44f1ae5 Diff: teach get_blob_content to show a submodule as text
diff_output.c:get_blob_content used to try to read the submodule commit
as a blob in the superproject's odb. Of course it cannot find it and
errors out with GIT_ENOTFOUND, implicitly terminating the whole diff
output.

This patch teaches it to create a text that describes the submodule
instead. The text looks like:

	Subproject commit <SHA1>\n

which is what git.git does, too.
2012-10-05 13:03:38 +02:00
Sascha Cunz
1a5cd26b8c Fix minor whitespace issue 2012-10-05 13:03:38 +02:00
Russell Belfer
cc5bf359a6 Clean up Win64 warnings 2012-09-28 14:34:08 -07:00
Russell Belfer
bae957b95d Add const to all shared pointers in diff API
There are a lot of places where the diff API gives the user access
to internal data structures and many of these were being exposed
through non-const pointers.  This replaces them all with const
pointers for any object that the user can access but is still
owned internally to the git_diff_list or git_diff_patch objects.

This will probably break some bindings...  Sorry!
2012-09-25 16:35:05 -07:00
Russell Belfer
6428630865 Fix bugs in new diff patch code
This fixes all the bugs in the new diff patch code.  The only
really interesting one is that when we merge two diffs, we now
have to actually exclude diff delta records that are not supposed
to be tracked, as opposed to before where they could be included
because they would be skipped silently by `git_diff_foreach()`.
Other than that, there are just minor errors.
2012-09-25 16:35:05 -07:00
Russell Belfer
5f69a31f7d Initial implementation of new diff patch API
Replacing the `git_iterator` object, this creates a simple API
for accessing the "patch" for any file pair in a diff list and
then gives indexed access to the hunks in the patch and the lines
in the hunk.  This is the initial implementation of this revised
API - it is still broken, but at least builds cleanly.
2012-09-25 16:35:05 -07:00
nulltoken
9ac8b113b1 Fix MSVC amd64 compilation warnings 2012-09-20 14:10:05 +02:00
Russell Belfer
12b6af1718 Forgot to reset hunk & line between files
The last change tweaked the way we use the hunk_curr pointer
during iteration, but failed to reset the value back to NULL
when switching files.
2012-09-13 14:15:07 -07:00
Russell Belfer
49d34c1c0c Fix problems in diff iterator record chaining
There is a bug in building the linked list of line records in the
diff iterator and also an off by one element error in the hunk
counts.  This fixes both of these, adds some test data with more
complex sets of hunk and line diffs to exercise this code better.
2012-09-13 13:17:38 -07:00
Russell Belfer
1f35e89dbf Fix diff binary file detection
In the process of adding tests for the max file size threshold
(which treats files over a certain size as binary) there seem to
be a number of problems in the new code with detecting binaries.
This should fix those up, as well as add a test for the file
size threshold stuff.

Also, this un-deprecates `GIT_DIFF_LINE_ADD_EOFNL`, since I
finally found a legitimate situation where it would be returned.
2012-09-11 12:03:33 -07:00
Russell Belfer
c6ac28fdc5 Reorg internal odb read header and object lookup
Often `git_odb_read_header` will "fail" and have to read the
entire object into memory instead of just the header.  When this
happens, the object is loaded and then disposed of immediately,
which makes it difficult to efficiently use the header information
to decide if the object should be loaded (since attempting to do
so will often result in loading the object twice).

This commit takes the existing code and reorganizes it to have
two new functions:

- `git_odb__read_header_or_object` which acts just like the old
  read header function except that it returns the object, too, if
  it was forced to load the whole thing.  It then becomes the
  caller's responsibility to free the `git_odb_object`.
- `git_object__from_odb_object` which was extracted from the old
  `git_object_lookup` and creates a subclass of `git_object` from
  an existing `git_odb_object` (separating the ODB lookup from the
  `git_object` creation).  This allows you to use the first header
  reading function efficiently without instantiating the
  `git_odb_object` twice.

There is no net change to the behavior of any of the existing
functions, but this allows internal code to tap into the ODB
lookup and object creation to be more efficient.
2012-09-10 12:24:05 -07:00
Russell Belfer
e597b1890e Move diff max_size to public API
This commit adds a max_size value in the public `git_diff_options`
structure so that the user can automatically flag blobs over a
certain size as binary regardless of other properties.

Also, and perhaps more importantly, this moves binary detection
to be as early as possible in the diff traversal inner loop and
makes sure that we stop loading objects as soon as we decide that
they are binary.
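
A sketch of the "decide early, stop loading" idea, under stated assumptions: the size check runs before any content is read, and the content check is a simple NUL-byte heuristic chosen for illustration rather than the library's actual detection. All `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Loader callback: returns the blob content on demand (payload last). */
typedef const void *(*demo_load_fn)(void *payload);

/* Example source that counts how often its content was requested. */
typedef struct {
    const char *bytes;
    int loads;
} demo_source;

static const void *demo_load(void *payload)
{
    demo_source *src = payload;
    src->loads++;
    return src->bytes;
}

/* Returns 1 if the blob should be treated as binary, 0 otherwise. */
static int demo_is_binary(size_t blob_size, size_t max_size,
                          demo_load_fn load, void *payload)
{
    const void *data;

    assert(load != NULL);

    /* Oversized blobs are flagged without loading their content. */
    if (blob_size > max_size)
        return 1;

    /* Assumed heuristic for this sketch: any NUL byte means binary. */
    data = load(payload);
    return data != NULL && blob_size > 0 &&
           memchr(data, '\0', blob_size) != NULL;
}
```

The point of the ordering is that a blob over the threshold never triggers the loader at all.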
2012-09-10 11:49:12 -07:00
Russell Belfer
b36effa22e Replace git_diff_iterator_num_files with progress
The `git_diff_iterator_num_files` API was problematic, since we
don't actually know the exact number of files to be iterated over
until we load those files into memory.  This replaces it with a
new `git_diff_iterator_progress` API that goes from 0 to 1, and
moves and renames the old API for the internal places that can
tolerate a max value instead of an exact value.
2012-09-10 09:59:14 -07:00
Russell Belfer
3a3deea80b Clean up blob diff path
Previously when diffing blobs, the diff code just ran with a NULL
repository object. Of course, that's not necessary and the test
for a NULL repo was confusing. This makes the blob diff run with
the repo that contains the blobs and clarifies the test that it
is possible to be diffing data where the path is unknown.
2012-09-06 15:45:50 -07:00
Russell Belfer
60b9d3fcef Implement filters for status/diff blobs
This adds support to diff and status for running filters (a la crlf)
on blobs in the workdir before computing SHAs and before generating
text diffs.  This ended up being a bit more code change than I had
thought since I had to reorganize some of the diff logic to minimize
peak memory use when filtering blobs in a diff.

This also adds a cap on the maximum size of data that will be loaded
to diff.  I set it at 512MB which should match core git.  Right now
it is a #define in src/diff.h but it could be moved into the public
API if desired.
2012-09-06 15:34:02 -07:00
Vicent Marti
01ae1909c5 diff: Cleanup documentation and printf compat 2012-09-06 10:13:38 +02:00
Russell Belfer
510f1bac6b Fix comments and a minor bug
This adds better header comments and also fixes a bug in one of
the simple APIs that reports the number of lines in the current hunk.
2012-09-05 15:17:24 -07:00
Russell Belfer
f335ecd6e1 Diff iterators
This refactors the diff output code so that an iterator object
can be used to traverse and generate the diffs, instead of just
the `foreach()` style with callbacks.  The code has been rearranged
so that the two styles can still share most functions.

This also replaces `GIT_REVWALKOVER` with `GIT_ITEROVER` and uses
that as a common error code for marking the end of iteration when
using an iterator style of object.
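
To show what an iterator-style loop with a shared end-of-iteration code might look like, here is a self-contained sketch. `DEMO_ITEROVER` stands in for `GIT_ITEROVER` with a placeholder value, and `demo_iter`, `demo_iter_next`, and `demo_sum` are hypothetical names, not the diff iterator API.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ITEROVER (-31) /* placeholder standing in for GIT_ITEROVER */

/* Toy iterator over an array of ints. */
typedef struct {
    const int *values;
    size_t count;
    size_t pos;
} demo_iter;

/* Store the next value in *out and return 0, or return DEMO_ITEROVER
 * once every value has been handed out. */
static int demo_iter_next(int *out, demo_iter *it)
{
    assert(out != NULL && it != NULL);
    if (it->pos >= it->count)
        return DEMO_ITEROVER;
    *out = it->values[it->pos++];
    return 0;
}

/* Typical caller loop: run until the end marker, treat it as success,
 * and pass any other nonzero code through as an error. */
static int demo_sum(demo_iter *it, int *sum_out)
{
    int value, rc;

    assert(sum_out != NULL);
    *sum_out = 0;
    while ((rc = demo_iter_next(&value, it)) == 0)
        *sum_out += value;
    return rc == DEMO_ITEROVER ? 0 : rc;
}
```

Using one code for "no more items" lets every iterator in a library share the same loop shape.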
2012-09-05 15:17:24 -07:00
nulltoken
b97c169ec0 Fix MSVC compilation warnings 2012-09-04 10:01:18 +02:00
Russell Belfer
5f4a61aea8 Working implementation of git_submodule_status
This is a big redesign of the git_submodule_status API and the
implementation of the redesigned API.  It also fixes a number of
bugs that I found in other parts of the submodule API while
writing the tests for the status part.

This also fixes a couple of bugs in the iterators that had not
been noticed before - one with iterating when there is a gitlink
(i.e. separate-work-dir) and one where I was treating anything
even vaguely submodule-like as a submodule, more aggressively
than core git does.
2012-08-24 11:00:27 -07:00
Russell Belfer
5fdc41e765 Minor bug fixes in diff code
In looking at PR #878, I found a few small bugs in the diff code,
mostly related to work that can be avoided when processing tree-
to-tree diffs that was always being carried out.  This commit has
some small fixes in it.
2012-08-22 13:57:57 -07:00
Vicent Marti
c07d9c95f2 oid: Explicitly include oid.h for the inlined CMP 2012-08-09 15:33:04 -07:00
Russell Belfer
5dca201072 Update iterators for consistency across library
This updates all the `foreach()` type functions across the library
that take callbacks from the user to have a consistent behavior.
The rules are:

* A callback terminates the loop by returning any non-zero value
* Once the callback returns non-zero, it will not be called again
  (i.e. the loop stops all iteration regardless of state)
* If the callback returns non-zero, the parent fn returns GIT_EUSER
* Although the parent returns GIT_EUSER, no error will be set in
  the library and `giterr_last()` will return NULL if called.

This commit makes those changes across the library and adds tests
for most of the iteration APIs to make sure that they follow the
above rules.
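
The first three rules can be sketched with a toy `foreach`. This is an illustration only: `DEMO_EUSER` stands in for `GIT_EUSER` with a placeholder value, the `demo_` functions are hypothetical, and the sketch does not model the library's error state.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EUSER (-7) /* placeholder standing in for GIT_EUSER */

/* User callback: return 0 to continue, non-zero to stop. */
typedef int (*demo_foreach_cb)(const char *item, void *payload);

/* Call cb for each item; stop on the first non-zero return. */
static int demo_foreach(const char *const *items, size_t count,
                        demo_foreach_cb cb, void *payload)
{
    size_t i;

    assert(cb != NULL);
    for (i = 0; i < count; i++) {
        if (cb(items[i], payload) != 0)
            return DEMO_EUSER; /* stop at once; cb is not called again */
    }
    return 0;
}

/* Example callback: counts items and asks to stop after the second one. */
static int demo_stop_after_two(const char *item, void *payload)
{
    int *seen = payload;

    (void)item;
    return ++(*seen) >= 2 ? 1 : 0;
}
```

A caller that sees `DEMO_EUSER` knows the loop ended because its own callback asked for it, not because the library failed.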
2012-08-03 17:08:01 -07:00
yorah
29f9186d1b diff: make inter-hunk-context default value git-compliant
Default in git core is 0, not 3
2012-07-02 17:27:49 +02:00
Russell Belfer
145e696b49 Minor fixes, cleanups, and clarifications
There are three actual changes in this commit:

1. When the trailing newline of a file is removed in a diff, the
   change will now be reported with `GIT_DIFF_LINE_DEL_EOFNL` passed
   to the callback.  Previously, the `ADD_EOFNL` constant was given
   which was just an error in my understanding of when the various
   circumstances arose.  `GIT_DIFF_LINE_ADD_EOFNL` is deprecated and
   should never be generated.  A new newline is simply an `ADD`.
2. Rewrote the `diff_delta__merge_like_cgit` function that contains
   the core logic of the `git_diff_merge` implementation.  The new
   version doesn't actually have significantly different behavior,
   but the logic should be much more obvious, I think.
3. Fixed a bug in `git_diff_merge` where it freed a string pool
   while some of the string data was still in use.  This led to
   `git_diff_print_patch` accessing memory that had been freed.

The rest of this commit contains improved documentation in `diff.h`
to make the behavior and the equivalencies with core git clearer,
and a bunch of new tests to cover the various cases, oh and a minor
simplification of `examples/diff.c`.
2012-06-08 12:11:13 -07:00
Russell Belfer
0abd724454 Fix filemode comparison in diffs
File modes were neither being ignored properly on platforms
where they should be ignored, nor being diffed consistently on
platforms where they are supported.

This change adds a number of diff and status filemode change
tests.  This also makes sure that filemode-only changes are
included in the diff output when they occur and that filemode
changes are ignored successfully when core.filemode is false.

There is no code that automatically toggles core.filemode
based on the capabilities of the current platform, so the user
still needs to be careful in their .git/config file.
2012-06-08 12:09:10 -07:00
Vicent Martí
3f0358604e misc: Fix warnings from PVS Studio trial 2012-06-07 22:43:48 +02:00
Garrett Regier
2ab9dcbd62 Fix checking for the presence of a flag 2012-05-27 16:52:37 -07:00
Vicent Martí
29e948debe global: Change parameter ordering in API
Consistency is good.
2012-05-18 01:25:57 +02:00
Russell Belfer
b59c73d39a Optimize away git_text_gather_stats in diff
GProf shows `git_text_gather_stats` as the most expensive call
in large diffs.  The function calculates a lot of information
that is not actually used and does not do so in an optimal
order.  This introduces a tuned `git_buf_is_binary` function
that executes the same algorithm in a fraction of the time.
2012-05-17 13:06:20 -07:00
nulltoken
9a29f8d56c diff: fix the diffing of two identical blobs 2012-05-07 12:18:33 +02:00
nulltoken
28ef7f9b28 diff: make git_diff_blobs() able to detect binary blobs 2012-05-07 12:18:32 +02:00
nulltoken
4f80676182 diff: fix the diffing of a concrete blob against a null one 2012-05-07 12:18:31 +02:00
nulltoken
245c5eaec5 diff: When diffing two blobs, ensure the delta callback parameter is filled with relevant information 2012-05-07 12:18:31 +02:00