Commit Graph

128 Commits

Author SHA1 Message Date
Jeff Hostetler
d06c589f48 Add MSVC CRTDBG memory leak reporting. 2015-04-15 10:25:09 -04:00
Edward Thomson
f1453c59b2 Make our overflow check look more like gcc/clang's
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler instrinsics on platforms
that support it.  This means dropping the ability to pass `NULL` as
an out parameter.

As a result, the macros also get updated to reflect this as well.
2015-02-13 09:27:33 -05:00
Edward Thomson
190b76a698 Introduce git__add_sizet_overflow and friends
Add some helper functions to check for overflow in a type-specific
manner.
2015-02-12 22:54:47 -05:00
Edward Thomson
8d534b4758 p_read: ensure requested len is ssize_t
Ensure that the given length to `p_read` is of ssize_t and ensure
that callers test the return as if it were an `ssize_t`.
2015-02-12 22:54:47 -05:00
Edward Thomson
2884cc42de overflow checking: don't make callers set oom
Have the ALLOC_OVERFLOW testing macros also simply set_oom in the
case where a computation would overflow, so that callers don't
need to.
2015-02-12 22:54:47 -05:00
Edward Thomson
3603cb0978 git__*allocarray: safer realloc and malloc
Introduce git__reallocarray that checks the product of the number
of elements and element size for overflow before allocation.  Also
introduce git__mallocarray that behaves like calloc, but without the
`c`.  (It does not zero memory, for those truly worried about every
cycle.)
2015-02-12 22:54:47 -05:00
Edward Thomson
392702ee2c allocations: test for overflow of requested size
Introduce some helper macros to test integer overflow from arithmetic
and set error message appropriately.
2015-02-12 22:54:46 -05:00
Maks Naumov
d8b5c8c329 Remove strlen() calls from loop condition
Avoid str length recalculation every iteration
2015-01-15 15:16:19 +02:00
Sebastian Bauer
7cf86f923a Added AmigaOS-specific implementation of git__timer().
The clock_gettime() function is normally not available under
AmigaOS, hence another solution is required. We are using now
GetUpTime() that is present in current versions of this
operating system.
2014-12-28 10:35:47 +01:00
Edward Thomson
fe5f7722f5 don't treat 0x85 as whitespace
A byte value of 0x85 is not whitespace, we were conflating that with
U+0085 (UTF8: 0xc2 0x85).  This caused us to incorrectly treat valid
multibyte characters like U+88C5 (UTF8: 0xe8 0xa3 0x85) as whitespace.
2014-12-23 11:27:01 -06:00
Vicent Marti
8e35527de2 path: Use UTF8 iteration for HFS chars 2014-12-16 10:24:18 -06:00
Edward Thomson
a64119e396 checkout: disallow bad paths on win32
Disallow:
 1. paths with trailing dot
 2. paths with trailing space
 3. paths with trailing colon
 4. paths that are 8.3 short names of .git folders ("GIT~1")
 5. paths that are reserved path names (COM1, LPT1, etc).
 6. paths with reserved DOS characters (colons, asterisks, etc)

These paths would (without \\?\ syntax) be elided to other paths - for
example, ".git." would be written as ".git".  As a result, writing these
paths literally (using \\?\ syntax) makes them hard to operate with from
the shell, Windows Explorer or other tools.  Disallow these.
2014-12-16 10:08:53 -06:00
Russell Belfer
3822d2cc4f Fix typo in timer normalization constants
The effect of this would be that various update callbacks would
not be made at the correct interval.
2014-08-05 15:06:45 -07:00
Russell Belfer
947a58c175 Clean up the handling of large binary diffs 2014-05-31 10:14:14 -07:00
Philip Kelley
7110000dd5 React to feedback for UTF-8 <-> WCHAR and reparse work 2014-04-23 09:23:50 -04:00
Russell Belfer
3b4c401a38 Decouple index iterator sort from index
This makes the index iterator honor the GIT_ITERATOR_IGNORE_CASE
and GIT_ITERATOR_DONT_IGNORE_CASE flags without modifying the
index data itself.  To take advantage of this, I had to export a
number of the internal index entry comparison functions.  I also
wrote some new tests to exercise the capability.
2014-04-17 14:43:45 -07:00
Jacques Germishuys
8e14b47fd7 Introduce git__date_rfc2822_fmt. Allows for RFC2822 date headers 2014-04-11 21:55:35 +02:00
Carlos Martín Nieto
24f3024f36 Split p_strlen into its own header
We need this from util.h and posix.h, but the latter includes common.h
which includes util.h, which means p_strlen is not defined by the time
we get to git__strndup().

Split the definition on p_strlen() off into its own header so we can use
it in util.h.
2014-02-05 14:34:15 +01:00
Carlos Martín Nieto
1e6f0ac436 utils: don't reimplement strnlen
The standard library provides a very nice strnlen function, which knows
to use SSE, let's not reimplement it ourselves.
2014-02-05 14:31:13 +01:00
Ben Straub
816d28e7bc Mark git__timer as inline on OSX 2013-10-01 12:56:47 -07:00
Jameson Miller
b176ededb7 Initial Implementation of progress reports during push
This adds the basics of progress reporting during push. While progress
for all aspects of a push operation are not reported with this change,
it lays the foundation to add these later. Push progress reporting
can be improved in the future - and consumers of the API should
just get more accurate information at that point.

The main areas where this is lacking are:

1) packbuilding progress: does not report progress during deltafication,
   as this involves coordinating progress from multiple threads.

2) network progress: reports progress as objects and bytes are going
   to be written to the subtransport (instead of as client gets
   confirmation that they have been received by the server) and leaves
   out some of the bytes that are transfered as part of the push protocol.
   Basically, this reports the pack bytes that are written to the
   subtransport. It does not report the bytes sent on the wire that
   are received by the server. This should be a good estimate of
   progress (and an improvement over no progress).
2013-09-30 13:22:28 -04:00
nulltoken
191adce875 vector: Teach git_vector_uniq() to free while deduplicating 2013-08-27 20:14:07 +02:00
Edward Thomson
a1f69452a2 git_strndup fix when OOM 2013-08-08 12:36:11 -05:00
Russell Belfer
d730d3f4f0 Major rename detection changes
After doing further profiling, I found that a lot of time was
being spent attempting to insert hashes into the file hash
signature when using the rolling hash because the rolling hash
approach generates a hash per byte of the file instead of one
per run/line of data.

To optimize this, I decided to convert back to a run-based file
signature algorithm which would be more like core Git.

After changing this, a number of the existing tests started to
fail.  In some cases, this appears to have been because the test
was coded to be too specific to the particular results of the file
similarity metric and in some cases there appear to have been bugs
in the core rename detection code where only by the coincidence
of the file similarity scoring were the expected results being
generated.

This renames all the variables in the core rename detection code
to be more consistent and hopefully easier to follow which made it
a bit easier to reason about the behavior of that code and fix the
problems that I was seeing.  I think it's in better shape now.

There are a couple of tests now that attempt to stress test the
rename detection code and they are quite slow.  Most of the time
is spent setting up the test data on disk and in the index.  When
we roll out performance improvements for index insertion, it
should also speed up these tests I hope.
2013-07-31 16:40:42 -07:00
Russell Belfer
302a04b09c Add accessors for refcount value 2013-07-10 12:14:13 -07:00
Edward Thomson
e3b4a47c1e git__strcasesort_cmp: strcasecmp sorting rules but requires strict equality 2013-06-17 10:03:15 -07:00
yorah
3425fee637 util: git__memzero() tweaks
On Linux: fix a warning message related to the volatile qualifier (cast)
On Windows: use SecureZeroMemory()

On both, inline the call, so that no entry point can lead back to this "secure" memory zeroing.
2013-06-17 15:42:33 +02:00
Vicent Marti
6de9b2ee14 util: It's called memzero 2013-06-12 21:10:33 +02:00
Vicent Marti
eb58e2d0be Merge remote-tracking branch 'arrbee/minor-paranoia' into development 2013-06-12 21:05:48 +02:00
Russell Belfer
114f5a6c41 Reorganize diff and add basic diff driver
This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization.  Hopefully this will make the diff code easier to
understand and to extend.

This adds a new `git_diff_driver` object that looks of diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided.  The full driver
spec is not implemented in the commit - this is focused on the
reorganization of the code and putting the driver hooks in place.

This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.
2013-06-10 10:10:39 -07:00
Russell Belfer
3e9e6cdaff Add safe memset and use it
This adds a `git__memset` routine that will not be optimized away
and updates the places where I memset() right before a free() call
to use it.
2013-06-07 09:54:33 -07:00
Russell Belfer
d958e37a48 Fix issues with git_diff_find_similar
There are a number of bugs in the rename code that only were
obvious when I started testing it against large old repos with
more complex patterns.  (The code to do that testing is not ready
to merge with libgit2, but I do plan to add more thorough tests.)

This contains a significant number of changes and also tweaks the
public API slightly to make emulating core git easier.

Most notably, this separates the GIT_DIFF_FIND_AND_BREAK_REWRITES
flag into FIND_REWRITES (which adds a self-similarity score to
every modified file) and BREAK_REWRITES (which splits the modified
deltas into add/remove pairs in the diff list).  When you do a raw
output of core git, rewrites show up as M090 or such, not at A and
D output, so I wanted to be able to emulate that.

Publicly, this also changes the flags to be uint16_t since we
don't need values out of that range.

Internally, this contains significant changes from a number of
small bug fixes (like using the wrong side of the diff to decide
if the object could be found in the ODB vs the workdir) to larger
issues about which files can and should be compared and how the
various edge cases of similarity scores should be treated.

Honestly, I don't think this is the last update that will have to
be made to this code, but I think this moves us closer to correct
behavior and I tried to document the code so it would be easier
to follow..
2013-05-17 17:21:45 -07:00
Linquize
0cb16fe924 Unify whitespaces to tabs 2013-05-15 20:26:55 +08:00
Carlos Martín Nieto
05b179648a Make refcounting atomic 2013-04-22 17:12:11 +02:00
Russell Belfer
e976b56dda Add git__compare_and_swap and use it
This removes the lock from the repository object and changes the
internals to use the new atomic git__compare_and_swap to update
the _odb, _config, _index, and _refdb variables in a threadsafe
manner.
2013-04-22 16:52:07 +02:00
Russell Belfer
5360786885 Further threading fixes
This builds on the earlier thread safety work to make it so that
setting the odb, index, refdb, or config for a repository is done
in a threadsafe manner with minimized locking time.  This is done
by adding a lock to the repository object and using it to guard
the assignment of the above listed pointers.  The lock is only
held to assign the pointer value.

This also contains some minor fixes to the other work with pack
files to reduce the time that locks are being held to and fix an
apparently memory leak.
2013-04-22 16:52:07 +02:00
Russell Belfer
4dcd878019 Move refdb_backend to include/git2/sys
This moves most of the refdb stuff over to the include/git2/sys
directory, with some minor shifts in function organization.

While I was making the necessary updates, I also removed the
trailing whitespace in a few files that I modified just because I
was there and it was bugging me.
2013-04-21 11:57:21 -07:00
Russell Belfer
62beacd300 Sorting function cleanup and MinGW fix
Clean up some sorting function stuff including fixing qsort_r
on MinGW, common function pointer type for comparison, and basic
insertion sort implementation (which we, regrettably, fall back
on for MinGW).
2013-03-11 16:43:58 -07:00
Russell Belfer
e40f1c2d23 Make tree iterator handle icase equivalence
There is a serious bug in the previous tree iterator implementation.
If case insensitivity resulted in member elements being equivalent
to one another, and those member elements were trees, then the
children of the colliding elements would be processed in sequence
instead of in a single flattened list.  This meant that the tree
iterator was not truly acting like a case-insensitive list.

This completely reworks the tree iterator to manage lists with
case insensitive equivalence classes and advance through the items
in a unified manner in a single sorted frame.

It is possible that at a future date we might want to update this
to separate the case insensitive and case sensitive tree iterators
so that the case sensitive one could be a minimal amount of code
and the insensitive one would always know what it needed to do
without checking flags.

But there would be so much shared code between the two, that I'm
not sure it that's a win.  For now, this gets what we need.

More tests are needed, though.
2013-03-08 16:39:57 -08:00
Vicent Marti
41051e3fe1 signature: Shut up MSVC, you silly goose 2013-02-20 17:09:51 +01:00
Ben Straub
15760c598d Use malloc rather than calloc 2013-02-01 19:21:55 -08:00
Ben Straub
c4beee7681 Introduce git__substrdup 2013-02-01 10:00:55 -08:00
Philip Kelley
11d9f6b304 Vector improvements and their fallout 2013-01-27 14:17:07 -05:00
Russell Belfer
851ad65081 Add payload "_r" versions of bsearch and tsort
git__bsearch and git__tsort did not pass a payload through to the
comparison function.  This makes it impossible to implement sorted
lists where the sort order depends on external data (e.g. building
a secondary sort order for the entries in a tree).  This commit
adds git__bsearch_r and git__tsort_r versions that pass a third
parameter to the cmp function of a user payload.
2013-01-15 09:51:35 -08:00
Edward Thomson
359fc2d241 update copyrights 2013-01-08 17:31:27 -06:00
Edward Thomson
7fcec83428 fetchhead reading/iterating 2012-12-19 16:57:30 -06:00
Russell Belfer
a277345e05 Create internal strcmp variants for function ptrs
Using the builtin strcmp and strcasecmp as function pointers is
problematic on win32.  This adds internal implementations and
divorces us from the platform linkage.
2012-11-14 22:37:13 -08:00
Russell Belfer
55cbd05b18 Some diff refactorings to help code reuse
There are some diff functions that are useful in a rewritten
checkout and this lays some groundwork for that.  This contains
three main things:

1. Share the function diff uses to calculate the OID for a file
   in the working directory (now named `git_diff__oid_for_file`
2. Add a `git_diff__paired_foreach` function to iterator over
   two diff lists concurrently.  Convert status to use it.
3. Move all the string/prefix/index entry comparisons into
   function pointers inside the `git_diff_list` object so they
   can be switched between case sensitive and insensitive
   versions.  This makes them easier to reuse in various
   functions without replicating logic.  As part of this, move
   a couple of index functions out of diff.c and into index.c.
2012-11-09 13:52:07 -08:00
Philip Kelley
2364735c8f Fix implementation of strndup to not overrun 2012-11-09 15:39:10 -05:00
Philip Kelley
ec40b7f99f Support for core.ignorecase 2012-09-17 15:42:41 -04:00