Commit Graph

1023 Commits

Author SHA1 Message Date
Vicent Martí
e68e33f33d Merge pull request #1233 from arrbee/file-similarity-metric
Add file similarity scoring to diff rename/copy detection
2013-02-27 14:50:32 -08:00
Edward Thomson
395509ffcd don't dereference at the end of the workdir iterator 2013-02-27 15:35:52 -06:00
Michael Schubert
8005c6d420 Revert "hash: remove git_hash_init from internal api"
This reverts commit efe7fad6c9, except for
the indentation fixes.
2013-02-26 01:08:34 +01:00
Michael Schubert
efe7fad6c9 hash: remove git_hash_init from internal api
Along with that, fix indentation in tests-clar/object/raw/hash.c
2013-02-26 00:23:00 +01:00
Michael Schubert
be225be785 tests/pack: fixup 6774b10
Initialize the hash ctx with git_hash_ctx_init, not git_hash_init.
2013-02-25 23:36:25 +01:00
Michael Schubert
6774b1071f tests/pack: do strict check of testpack's SHA1 hash 2013-02-25 22:22:15 +01:00
Martin Woodward
fc6c5b5001 Remove sample hook files
Getting rid of sample hook files from test repos as they just take up
space with no value.
2013-02-25 17:03:05 +00:00
Russell Belfer
37d9168608 Do not fail if .gitignore is directory
This is designed to fix libgit2sharp #350 where if .gitignore is
a directory we abort all operations that process ignores instead
of just skipping it as core git does.

Also added test that fails without this change and passes with it.
2013-02-22 12:21:54 -08:00
Russell Belfer
1be4ba9842 More rename detection tests
This includes tests for crlf changes, whitespace changes with the
default comparison and with the ignore whitespace comparison, and
more sensitivity checking for the comparison code.
2013-02-22 11:13:01 -08:00
Philip Kelley
7beeb3f420 Rename 'exp' so it doesn't conflict with exp() 2013-02-22 14:03:44 -05:00
Russell Belfer
6f9d5ce818 Fix tests for find_similar and related
This fixes both a test that I broke in diff::patch where I was
relying on the current state of the working directory for the
renames test data and fixes an unstable test in diff::rename
where the environment setting for the "diff.renames" config was
being allowed to influence the test results.
2013-02-22 10:17:08 -08:00
Vicent Martí
06eaa06f26 Merge pull request #1343 from nulltoken/topic/remote_orphaned_branch
Teach git_branch_remote_name() to work with orphaned heads
2013-02-22 09:48:47 -08:00
nulltoken
bbc53e4f93 branch: refactor git_branch_remote_name() tests 2013-02-22 17:04:25 +01:00
nulltoken
c1b5e8c42b branch: Make git_branch_remote_name() cope with orphaned heads 2013-02-22 17:04:23 +01:00
nulltoken
9ccab8dfb8 stash: Update the reference when dropping the topmost stash 2013-02-22 15:25:59 +01:00
nulltoken
39bcb4deb8 stash: Refactor stash::drop tests 2013-02-22 15:25:58 +01:00
nulltoken
d788499a10 ignore: enhance git_ignore_path_is_ignored() test coverage 2013-02-22 15:25:57 +01:00
Russell Belfer
d4b747c1cb Add diff rename tests with partial similarity
This adds some new tests that actually exercise the similarity
metric between files to detect renames, copies, and split modified
files that are too heavily modified.

There is still more testing to do - these tests are just partially
covering the cases.

There is also one bug fix in this where a change set with only
MODIFY being broken into ADD/DELETE (due to low self-similarity)
without any additional RENAMED entries would end up not processing
the split requests (because the num_rewrites counter got reset).
2013-02-21 16:44:44 -08:00
Russell Belfer
960a04dd56 Initial integration of similarity metric to diff
This is the initial integration of the similarity metric into
the `git_diff_find_similar()` code path.  The existing tests all
pass, but the new functionality isn't currently well tested.  The
integration does go through the pluggable metric interface, so it
should be possible to drop in an alternative to the internal
metric that libgit2 implements.

This comes along with a behavior change for an existing interface;
namely, passing two NULLs to git_diff_blobs (or passing NULLs to
git_diff_blob_to_buffer) will now call the file_cb parameter zero
times instead of one time.  I know it's strange that that change
is paired with this other change, but it emerged from some
initialization changes that I ended up making.
2013-02-21 12:40:33 -08:00
Russell Belfer
71a3d27ea6 Replace diff delta binary with flags
Previously the git_diff_delta recorded if the delta was binary.
This replaces that (with no net change in structure size) with
a full set of flags.  The flag values that were already in use
for individual git_diff_file objects are reused for the delta
flags, too (along with renaming those flags to make it clear that
they are used more generally).

This (a) makes things somewhat more consistent (because I was
using a -1 value in the "boolean" binary field to indicate unset,
whereas now I can just use the flags that are easier to understand),
and (b) will make it easier for me to add some additional flags to
the delta object in the future, such as marking the results of a
copy/rename detection or other deltas that might want a special
indicator.

While making this change, I officially moved some of the flags that
were internal only into the private diff header.

This also allowed me to remove a gross hack in rename/copy detect
code where I was overwriting the status field with an internal
value.
2013-02-20 15:10:21 -08:00
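The move from a tri-state "boolean" field to a flag set, as described in the message above, can be sketched roughly as follows. This is a minimal illustration, not the libgit2 definition: the names `DELTA_FLAG_*`, `struct delta`, and the helper functions are hypothetical, and the flag values are assumptions.

```c
#include <stdint.h>

/* Hypothetical flag bits; "unset" is simply neither binary bit being present. */
enum {
    DELTA_FLAG_BINARY     = 1 << 0,  /* content known to be binary */
    DELTA_FLAG_NOT_BINARY = 1 << 1,  /* content known to be text */
    DELTA_FLAG_RENAMED    = 1 << 2   /* example of a flag added later */
};

/* Hypothetical stand-in for a diff delta record. */
struct delta {
    uint32_t flags;
};

/* True once the binary/text question has been answered either way. */
static int delta_binary_known(const struct delta *d)
{
    return (d->flags & (DELTA_FLAG_BINARY | DELTA_FLAG_NOT_BINARY)) != 0;
}

/* True only if the delta was explicitly marked binary. */
static int delta_is_binary(const struct delta *d)
{
    return (d->flags & DELTA_FLAG_BINARY) != 0;
}

/* Record the answer without disturbing unrelated flags. */
static void delta_set_binary(struct delta *d, int is_binary)
{
    d->flags &= ~(uint32_t)(DELTA_FLAG_BINARY | DELTA_FLAG_NOT_BINARY);
    d->flags |= (uint32_t)(is_binary ? DELTA_FLAG_BINARY : DELTA_FLAG_NOT_BINARY);
}
```

Under these assumptions, a zero-initialized `struct delta` plays the role of the old -1 "unset" value, and further markers such as the illustrative `DELTA_FLAG_RENAMED` fit in the same field without changing its size.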
Russell Belfer
9bc8be3d7e Refine pluggable similarity API
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API.  In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.

Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
2013-02-20 15:09:41 -08:00
Russell Belfer
aa6432604e More tests of file signatures with whitespace opts
Seems to be working pretty well...
2013-02-20 15:09:41 -08:00
Russell Belfer
5e5848eb15 Change similarity metric to sampled hashes
This moves the similarity metric code out of buf_text and into a
new file.  Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes.  In theory, that should be
sufficient samples to give a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
2013-02-20 15:09:40 -08:00
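The sampling idea in the message above (a rolling hash over the file, keeping only the lowest and highest hash values as a fixed-size signature) might look something like the sketch below. It is not the libgit2 implementation: the window size, the hash base, the scoring formula, and every identifier (`hashsig`, `hashsig_compute`, `hashsig_compare`, and so on) are assumptions chosen for illustration. Only the "keep 100 at each extreme" figure comes from the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIG_KEEP   100   /* hashes kept at each extreme (from the message) */
#define SIG_WINDOW 8     /* assumed window size in bytes */
#define SIG_BASE   257u  /* assumed polynomial base */

typedef struct {
    uint32_t lo[SIG_KEEP]; size_t n_lo;  /* smallest hashes, ascending */
    uint32_t hi[SIG_KEEP]; size_t n_hi;  /* largest hashes, ascending */
} hashsig;

/* Insert h into a sorted, duplicate-free array of at most SIG_KEEP values.
 * When full, keep_small decides whether the largest or smallest is evicted. */
static void keep_extreme(uint32_t *a, size_t *n, uint32_t h, int keep_small)
{
    size_t pos = 0;
    while (pos < *n && a[pos] < h)
        pos++;
    if (pos < *n && a[pos] == h)
        return;                               /* already present */
    if (*n < SIG_KEEP) {
        memmove(a + pos + 1, a + pos, (*n - pos) * sizeof(*a));
        a[pos] = h;
        (*n)++;
    } else if (keep_small) {
        if (pos == SIG_KEEP)
            return;                           /* larger than all kept values */
        memmove(a + pos + 1, a + pos, (SIG_KEEP - 1 - pos) * sizeof(*a));
        a[pos] = h;
    } else {
        if (pos == 0)
            return;                           /* smaller than all kept values */
        memmove(a, a + 1, (pos - 1) * sizeof(*a));
        a[pos - 1] = h;
    }
}

/* Slide a Rabin-Karp style window over the data and sample the extremes. */
static void hashsig_compute(hashsig *sig, const void *buf, size_t len)
{
    const unsigned char *data = buf;
    uint32_t h = 0, top = 1;
    size_t i;

    sig->n_lo = sig->n_hi = 0;
    for (i = 1; i < SIG_WINDOW; i++)
        top *= SIG_BASE;                      /* SIG_BASE^(SIG_WINDOW-1) */

    for (i = 0; i < len; i++) {
        if (i >= SIG_WINDOW)
            h -= (uint32_t)data[i - SIG_WINDOW] * top;  /* drop oldest byte */
        h = h * SIG_BASE + data[i];                     /* add newest byte */
        if (i + 1 >= SIG_WINDOW) {
            keep_extreme(sig->lo, &sig->n_lo, h, 1);
            keep_extreme(sig->hi, &sig->n_hi, h, 0);
        }
    }
}

/* Count values present in both sorted arrays. */
static size_t count_common(const uint32_t *a, size_t na,
                           const uint32_t *b, size_t nb)
{
    size_t i = 0, j = 0, shared = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) i++;
        else if (a[i] > b[j]) j++;
        else { shared++; i++; j++; }
    }
    return shared;
}

/* Assumed scoring: 0..100, proportional to the share of sampled hashes
 * that both signatures have in common. */
static int hashsig_compare(const hashsig *a, const hashsig *b)
{
    size_t total = a->n_lo + a->n_hi + b->n_lo + b->n_hi;
    size_t shared = count_common(a->lo, a->n_lo, b->lo, b->n_lo) +
                    count_common(a->hi, a->n_hi, b->hi, b->n_hi);
    return total ? (int)(200 * shared / total) : 100;
}
```

Whatever the file size, the signature in this sketch never holds more than 200 hashes, which is the property the message is after. Inputs shorter than the window produce an empty signature here; how such inputs should really be treated is left open.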
Russell Belfer
9c454b007b Initial implementation of similarity scoring algo
This adds a new `git_buf_text_hashsig` type and functions to
generate these hash signatures and compare them to give a
similarity score.  This can be plugged into diff similarity
scoring.
2013-02-20 15:09:40 -08:00
Vicent Martí
f2e1d06064 Merge pull request #1351 from arrbee/moar-treebuilder-tests
Add more treebuilder tests
2013-02-20 12:00:51 -08:00
Russell Belfer
0cfce06d08 Add more treebuilder tests
The recent changes with git_treebuilder_entrycount point out that
the test coverage for git_treebuilder_remove and
git_treebuilder_entrycount is completely absent.  This adds tests.
2013-02-20 11:58:21 -08:00
Russell Belfer
f7511c2c69 Merge pull request #1348 from libgit2/signatures-2
Simplify signature parsing
2013-02-20 10:19:58 -08:00
Vicent Marti
63964c891b Disable caching in Clar 2013-02-20 18:49:00 +01:00
Vicent Marti
c51880eeaf Simplify signature parsing 2013-02-20 17:03:18 +01:00
Russell Belfer
56543a609a Clear up warnings from cppcheck
The cppcheck static analyzer generates warnings for a bunch of
places in the libgit2 code base.  All the ones fixed in this
commit are actually false positives, but I've reorganized the
code to hopefully make it easier for static analysis tools to
correctly understand the structure.  I wouldn't do this if I
felt like it was making the code harder to read or worse for
humans, but in this case, these fixes don't seem too bad and will
hopefully make it easier for better analysis tools to get at any
real issues.
2013-02-15 16:02:45 -08:00
Vicent Martí
fcd7733ded Merge pull request #1318 from nulltoken/topic/diff-tree-coverage
Topic/diff tree coverage
2013-02-14 12:49:46 -08:00
Ben Straub
6a0ffe84a7 Merge pull request #1333 from phkelley/push_options
Add git_push_options, to set packbuilder parallelism
2013-02-12 10:50:55 -08:00
Russell Belfer
fbe67de997 Merge pull request #1246 from arrbee/fix-force-text-for-diff-blobs
Add FORCE_TEXT check into git_diff_blobs code path
2013-02-12 10:16:30 -08:00
Russell Belfer
9c258af094 Merge pull request #1316 from ben/clone-cancel
Allow network operations to cancel
2013-02-12 10:13:56 -08:00
Russell Belfer
c2c0874de2 More diff tests with binary data 2013-02-11 14:45:46 -08:00
nulltoken
2bca5b679b remote: Introduce git_remote_is_valid_name()
Fix libgit2/libgit2sharp#318
2013-02-11 23:19:41 +01:00
nulltoken
4d811c3b77 refs: No component of a refname can end with '.lock' 2013-02-11 23:19:40 +01:00
nulltoken
624924e876 remote: reorganize tests 2013-02-11 23:19:39 +01:00
Russell Belfer
390a3c8141 Merge pull request #1190 from nulltoken/topic/reset-paths
reset: Allow the selective reset of pathspecs
2013-02-11 11:44:00 -08:00
Philip Kelley
e026cfee00 Merge pull request #1323 from jamill/resolve_remote
Resolve a remote branch's remote
2013-02-11 09:12:39 -08:00
Jameson Miller
db4bb4158f Teach refspec to transform destination reference to source reference 2013-02-11 11:36:28 -05:00
Jameson Miller
2e3e8c889b Teach remote branch to return its remote 2013-02-11 11:36:22 -05:00
Philip Kelley
b8b897bbc5 Add git_push_options, to set packbuilder parallelism 2013-02-11 09:35:26 -05:00
Philip Kelley
8c29dca6c3 Fix some incorrect MSVC #ifdef's. Fixes #1305 2013-02-11 09:25:57 -05:00
Scott J. Goldman
6ce61a0bf6 tests: fix whitespace in refs/rename.c 2013-02-08 14:25:41 -08:00
yorah
0d64ba4837 diff: add a notify callback to git_diff__from_iterators
The callback will be called for each file, just before the `git_delta_t` gets inserted into the diff list.

When the callback:
- returns < 0, the diff process will be aborted
- returns > 0, the delta will not be inserted into the diff list, but the diff process continues
- returns 0, the delta is inserted into the diff list, and the diff process continues
2013-02-07 20:44:35 +01:00
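The three-way return contract listed in the message above can be sketched with a small stand-alone loop. This does not call libgit2: `notify_cb`, `collect_deltas`, and `skip_objects` are hypothetical names, and plain path strings stand in for the real delta records.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: consulted once per candidate entry. */
typedef int (*notify_cb)(const char *path, void *payload);

/*
 * Offer each path to the callback before "inserting" it into `out`.
 * Returns the number of entries inserted, or the callback's negative
 * value if it asked for the whole process to be aborted.
 */
static int collect_deltas(const char **paths, size_t n,
                          notify_cb cb, void *payload, const char **out)
{
    int count = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int rc = cb ? cb(paths[i], payload) : 0;
        if (rc < 0)
            return rc;        /* < 0: abort the diff process */
        if (rc > 0)
            continue;         /* > 0: skip this entry, keep going */
        out[count++] = paths[i];  /* 0: insert and keep going */
    }
    return count;
}

/* Example callback: skip "*.o" files and abort on the sentinel "STOP". */
static int skip_objects(const char *path, void *payload)
{
    size_t len = strlen(path);
    (void)payload;

    if (strcmp(path, "STOP") == 0)
        return -1;
    if (len >= 2 && strcmp(path + len - 2, ".o") == 0)
        return 1;
    return 0;
}
```

In this sketch, passing `{"a.c", "a.o", "b.c"}` with `skip_objects` leaves only `a.c` and `b.c` in the output, while any list containing `"STOP"` makes `collect_deltas` return the negative value immediately.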
Scott J. Goldman
c9459abb61 tests: fix indentation in repo/message.c 2013-02-07 03:12:39 -08:00
Scott J. Goldman
f7b060188a tests: fix indentation in repo/init.c 2013-02-07 03:04:50 -08:00
Scott J. Goldman
1ca163ff13 tests: fix code style in threads/basic.c 2013-02-07 02:04:17 -08:00
Ben Straub
beede4321f Fetchhead: don't expect a tag that isn't there 2013-02-06 13:25:43 -08:00