Commit Graph

113 Commits

Author SHA1 Message Date
Russell Belfer
3b4c401a38 Decouple index iterator sort from index
This makes the index iterator honor the GIT_ITERATOR_IGNORE_CASE
and GIT_ITERATOR_DONT_IGNORE_CASE flags without modifying the
index data itself.  To take advantage of this, I had to export a
number of the internal index entry comparison functions.  I also
wrote some new tests to exercise the capability.
2014-04-17 14:43:45 -07:00
Jacques Germishuys
8e14b47fd7 Introduce git__date_rfc2822_fmt. Allows for RFC2822 date headers 2014-04-11 21:55:35 +02:00
Carlos Martín Nieto
24f3024f36 Split p_strlen into its own header
We need this from util.h and posix.h, but the latter includes common.h
which includes util.h, which means p_strlen is not defined by the time
we get to git__strndup().

Split the definition on p_strlen() off into its own header so we can use
it in util.h.
2014-02-05 14:34:15 +01:00
Carlos Martín Nieto
1e6f0ac436 utils: don't reimplement strnlen
The standard library provides a very nice strnlen function, which knows
to use SSE, let's not reimplement it ourselves.
2014-02-05 14:31:13 +01:00
Ben Straub
816d28e7bc Mark git__timer as inline on OSX 2013-10-01 12:56:47 -07:00
Jameson Miller
b176ededb7 Initial Implementation of progress reports during push
This adds the basics of progress reporting during push. While progress
for all aspects of a push operation are not reported with this change,
it lays the foundation to add these later. Push progress reporting
can be improved in the future - and consumers of the API should
just get more accurate information at that point.

The main areas where this is lacking are:

1) packbuilding progress: does not report progress during deltafication,
   as this involves coordinating progress from multiple threads.

2) network progress: reports progress as objects and bytes are going
   to be written to the subtransport (instead of as client gets
   confirmation that they have been received by the server) and leaves
   out some of the bytes that are transfered as part of the push protocol.
   Basically, this reports the pack bytes that are written to the
   subtransport. It does not report the bytes sent on the wire that
   are received by the server. This should be a good estimate of
   progress (and an improvement over no progress).
2013-09-30 13:22:28 -04:00
nulltoken
191adce875 vector: Teach git_vector_uniq() to free while deduplicating 2013-08-27 20:14:07 +02:00
Edward Thomson
a1f69452a2 git_strndup fix when OOM 2013-08-08 12:36:11 -05:00
Russell Belfer
d730d3f4f0 Major rename detection changes
After doing further profiling, I found that a lot of time was
being spent attempting to insert hashes into the file hash
signature when using the rolling hash because the rolling hash
approach generates a hash per byte of the file instead of one
per run/line of data.

To optimize this, I decided to convert back to a run-based file
signature algorithm which would be more like core Git.

After changing this, a number of the existing tests started to
fail.  In some cases, this appears to have been because the test
was coded to be too specific to the particular results of the file
similarity metric and in some cases there appear to have been bugs
in the core rename detection code where only by the coincidence
of the file similarity scoring were the expected results being
generated.

This renames all the variables in the core rename detection code
to be more consistent and hopefully easier to follow which made it
a bit easier to reason about the behavior of that code and fix the
problems that I was seeing.  I think it's in better shape now.

There are a couple of tests now that attempt to stress test the
rename detection code and they are quite slow.  Most of the time
is spent setting up the test data on disk and in the index.  When
we roll out performance improvements for index insertion, it
should also speed up these tests I hope.
2013-07-31 16:40:42 -07:00
Russell Belfer
302a04b09c Add accessors for refcount value 2013-07-10 12:14:13 -07:00
Edward Thomson
e3b4a47c1e git__strcasesort_cmp: strcasecmp sorting rules but requires strict equality 2013-06-17 10:03:15 -07:00
yorah
3425fee637 util: git__memzero() tweaks
On Linux: fix a warning message related to the volatile qualifier (cast)
On Windows: use SecureZeroMemory()

On both, inline the call, so that no entry point can lead back to this "secure" memory zeroing.
2013-06-17 15:42:33 +02:00
Vicent Marti
6de9b2ee14 util: It's called memzero 2013-06-12 21:10:33 +02:00
Vicent Marti
eb58e2d0be Merge remote-tracking branch 'arrbee/minor-paranoia' into development 2013-06-12 21:05:48 +02:00
Russell Belfer
114f5a6c41 Reorganize diff and add basic diff driver
This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization.  Hopefully this will make the diff code easier to
understand and to extend.

This adds a new `git_diff_driver` object that looks of diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided.  The full driver
spec is not implemented in the commit - this is focused on the
reorganization of the code and putting the driver hooks in place.

This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.
2013-06-10 10:10:39 -07:00
Russell Belfer
3e9e6cdaff Add safe memset and use it
This adds a `git__memset` routine that will not be optimized away
and updates the places where I memset() right before a free() call
to use it.
2013-06-07 09:54:33 -07:00
Russell Belfer
d958e37a48 Fix issues with git_diff_find_similar
There are a number of bugs in the rename code that only were
obvious when I started testing it against large old repos with
more complex patterns.  (The code to do that testing is not ready
to merge with libgit2, but I do plan to add more thorough tests.)

This contains a significant number of changes and also tweaks the
public API slightly to make emulating core git easier.

Most notably, this separates the GIT_DIFF_FIND_AND_BREAK_REWRITES
flag into FIND_REWRITES (which adds a self-similarity score to
every modified file) and BREAK_REWRITES (which splits the modified
deltas into add/remove pairs in the diff list).  When you do a raw
output of core git, rewrites show up as M090 or such, not at A and
D output, so I wanted to be able to emulate that.

Publicly, this also changes the flags to be uint16_t since we
don't need values out of that range.

Internally, this contains significant changes from a number of
small bug fixes (like using the wrong side of the diff to decide
if the object could be found in the ODB vs the workdir) to larger
issues about which files can and should be compared and how the
various edge cases of similarity scores should be treated.

Honestly, I don't think this is the last update that will have to
be made to this code, but I think this moves us closer to correct
behavior and I tried to document the code so it would be easier
to follow..
2013-05-17 17:21:45 -07:00
Linquize
0cb16fe924 Unify whitespaces to tabs 2013-05-15 20:26:55 +08:00
Carlos Martín Nieto
05b179648a Make refcounting atomic 2013-04-22 17:12:11 +02:00
Russell Belfer
e976b56dda Add git__compare_and_swap and use it
This removes the lock from the repository object and changes the
internals to use the new atomic git__compare_and_swap to update
the _odb, _config, _index, and _refdb variables in a threadsafe
manner.
2013-04-22 16:52:07 +02:00
Russell Belfer
5360786885 Further threading fixes
This builds on the earlier thread safety work to make it so that
setting the odb, index, refdb, or config for a repository is done
in a threadsafe manner with minimized locking time.  This is done
by adding a lock to the repository object and using it to guard
the assignment of the above listed pointers.  The lock is only
held to assign the pointer value.

This also contains some minor fixes to the other work with pack
files to reduce the time that locks are being held to and fix an
apparently memory leak.
2013-04-22 16:52:07 +02:00
Russell Belfer
4dcd878019 Move refdb_backend to include/git2/sys
This moves most of the refdb stuff over to the include/git2/sys
directory, with some minor shifts in function organization.

While I was making the necessary updates, I also removed the
trailing whitespace in a few files that I modified just because I
was there and it was bugging me.
2013-04-21 11:57:21 -07:00
Russell Belfer
62beacd300 Sorting function cleanup and MinGW fix
Clean up some sorting function stuff including fixing qsort_r
on MinGW, common function pointer type for comparison, and basic
insertion sort implementation (which we, regrettably, fall back
on for MinGW).
2013-03-11 16:43:58 -07:00
Russell Belfer
e40f1c2d23 Make tree iterator handle icase equivalence
There is a serious bug in the previous tree iterator implementation.
If case insensitivity resulted in member elements being equivalent
to one another, and those member elements were trees, then the
children of the colliding elements would be processed in sequence
instead of in a single flattened list.  This meant that the tree
iterator was not truly acting like a case-insensitive list.

This completely reworks the tree iterator to manage lists with
case insensitive equivalence classes and advance through the items
in a unified manner in a single sorted frame.

It is possible that at a future date we might want to update this
to separate the case insensitive and case sensitive tree iterators
so that the case sensitive one could be a minimal amount of code
and the insensitive one would always know what it needed to do
without checking flags.

But there would be so much shared code between the two, that I'm
not sure it that's a win.  For now, this gets what we need.

More tests are needed, though.
2013-03-08 16:39:57 -08:00
Vicent Marti
41051e3fe1 signature: Shut up MSVC, you silly goose 2013-02-20 17:09:51 +01:00
Ben Straub
15760c598d Use malloc rather than calloc 2013-02-01 19:21:55 -08:00
Ben Straub
c4beee7681 Introduce git__substrdup 2013-02-01 10:00:55 -08:00
Philip Kelley
11d9f6b304 Vector improvements and their fallout 2013-01-27 14:17:07 -05:00
Russell Belfer
851ad65081 Add payload "_r" versions of bsearch and tsort
git__bsearch and git__tsort did not pass a payload through to the
comparison function.  This makes it impossible to implement sorted
lists where the sort order depends on external data (e.g. building
a secondary sort order for the entries in a tree).  This commit
adds git__bsearch_r and git__tsort_r versions that pass a third
parameter to the cmp function of a user payload.
2013-01-15 09:51:35 -08:00
Edward Thomson
359fc2d241 update copyrights 2013-01-08 17:31:27 -06:00
Edward Thomson
7fcec83428 fetchhead reading/iterating 2012-12-19 16:57:30 -06:00
Russell Belfer
a277345e05 Create internal strcmp variants for function ptrs
Using the builtin strcmp and strcasecmp as function pointers is
problematic on win32.  This adds internal implementations and
divorces us from the platform linkage.
2012-11-14 22:37:13 -08:00
Russell Belfer
55cbd05b18 Some diff refactorings to help code reuse
There are some diff functions that are useful in a rewritten
checkout and this lays some groundwork for that.  This contains
three main things:

1. Share the function diff uses to calculate the OID for a file
   in the working directory (now named `git_diff__oid_for_file`
2. Add a `git_diff__paired_foreach` function to iterator over
   two diff lists concurrently.  Convert status to use it.
3. Move all the string/prefix/index entry comparisons into
   function pointers inside the `git_diff_list` object so they
   can be switched between case sensitive and insensitive
   versions.  This makes them easier to reuse in various
   functions without replicating logic.  As part of this, move
   a couple of index functions out of diff.c and into index.c.
2012-11-09 13:52:07 -08:00
Philip Kelley
2364735c8f Fix implementation of strndup to not overrun 2012-11-09 15:39:10 -05:00
Philip Kelley
ec40b7f99f Support for core.ignorecase 2012-09-17 15:42:41 -04:00
yorah
02a0d651d7 Add git_buf_unescape and git__unescape to unescape all characters in a string (in-place) 2012-07-24 14:03:07 +02:00
Vicent Martí
48bcf81dd2 Merge pull request #812 from arrbee/assorted-tweaks
Assorted goodies
2012-07-12 09:32:44 -07:00
Russell Belfer
c3a875c975 Adding unicode space to match crlf patterns
Adding 0x85 to `git__isspace` since we also look for that in filter.c
as a whitespace character.
2012-07-10 23:19:47 -07:00
nulltoken
98d6a1fdda util: add git__isdigit() 2012-07-07 12:16:09 +02:00
nulltoken
494ae940a0 revparse: fix parsing of date specifiers 2012-07-02 19:56:41 +02:00
Ben Straub
8a385c0482 Move git__date_parse declaration to util.h. 2012-06-06 12:25:22 -07:00
Ben Straub
56a5000d58 Merge branch 'development' into rev-parse
Conflicts:
	src/util.h
	tests-clar/refs/branches/listall.c
2012-06-05 12:52:44 -07:00
Vicent Martí
29e948debe global: Change parameter ordering in API
Consistency is good.
2012-05-18 01:25:57 +02:00
Ben Straub
1ce4cc0164 Fix date.c build in msvc.
Ported the win32 implementations of gmtime_r,
localtime_r, and gettimeofday to be part of the 
posix compatibility layer, and fixed
git_signature_now to use them.
2012-05-15 15:41:05 -07:00
Russell Belfer
41a82592ef Ranged iterators and rewritten git_status_file
The goal of this work is to rewrite git_status_file to use the
same underlying code as git_status_foreach.

This is done in 3 phases:

1. Extend iterators to allow ranged iteration with start and
   end prefixes for the range of file names to be covered.
2. Improve diff so that when there is a pathspec and there is
   a common non-wildcard prefix of the pathspec, it will use
   ranged iterators to minimize excess iteration.
3. Rewrite git_status_file to call git_status_foreach_ext
   with a pathspec that covers just the one file being checked.

Since ranged iterators underlie the status & diff implementation,
this is actually fairly efficient.  The workdir iterator does
end up loading the contents of all the directories down to the
single file, which should ideally be avoided, but it is pretty
good.
2012-05-15 14:34:15 -07:00
Ben Straub
a346992f7e Rev-parse: @{time} syntax.
Ported date.c (for approxidate_careful) from git.git
revision aa39b85. Trimmed out the parts we're not
using.
2012-05-11 11:35:50 -07:00
nulltoken
9cd25d0003 util: Fix git__isspace() implementation
The characters <space>, <form-feed>, <newline>, <carriage-return>, <tab>, and <vertical-tab> are part of the "space" definition.

cf. http://www.kernel.org/doc/man-pages/online/pages/man5/locale.5.html
2012-05-09 13:21:21 +02:00
Vicent Martí
0f49200c9a msvc: Do not use isspace
Locale-aware bullshit bitting my ass again yo
2012-05-09 04:37:02 +02:00
Vicent Martí
3fbcac89c4 Remove old and unused error codes 2012-05-02 19:56:38 -07:00
Russell Belfer
25f258e735 Moving power-of-two bit utilities into util.h 2012-04-25 11:14:34 -07:00