Commit Graph

126 Commits

Author SHA1 Message Date
Patrick Steinhardt
f31cb45ad2 khash: avoid using kh_put directly 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
a8cd560b10 khash: avoid using kh_del directly 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
cb18386f72 khash: avoid using kh_val/kh_value directly 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
a853c52723 khash: avoid using kh_get directly 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
64e46dc3b5 khash: avoid using kh_end directly 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
036daa59e9 khash: use git_map_exists where applicable 2017-02-17 11:41:06 +01:00
Patrick Steinhardt
9694d9ba79 khash: avoid using kh_foreach/kh_foreach_value directly 2017-02-17 11:41:06 +01:00
Edward Thomson
bf339ab0ef indexer: introduce git_packfile_close
Encapsulation!
2017-01-21 15:21:29 -05:00
Edward Thomson
909d549436 giterr_set: consistent error messages
Error messages should be sentence fragments, and therefore:

1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
2016-12-29 12:26:03 +00:00
Carlos Martín Nieto
903955f7e5 Merge pull request #4027 from pks-t/pks/pack-deref-cache-on-error
pack: dereference cached pack entry on error
2016-12-19 17:26:09 +00:00
Patrick Steinhardt
ff5eea06a9 pack: dereference cached pack entry on error
When trying to uncompress deltas in a packfile's delta chain, we try to
add object bases to the packfile cache, subsequently decrementing its
reference count if it has been added successfully. This may lead to a
mismatched reference count in the case where we exit the loop early due
to an encountered error.

Fix the issue by decrementing the reference count in error cleanup.
2016-12-12 09:45:07 +01:00
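The cleanup pattern described in this commit can be sketched as follows. This is a minimal illustration, not the actual change: `cache_entry`, `apply_step` and `unpack_with_cached_base` are hypothetical names, and the plain `int` reference count stands in for whatever atomic counter the library really uses.

```c
#include <stddef.h>

/* Hypothetical cache entry; a real one would use an atomic counter. */
struct cache_entry {
	int refcount;
};

/* Hypothetical delta step that fails when step == fail_at. */
static int apply_step(int step, int fail_at)
{
	return step == fail_at ? -1 : 0;
}

/*
 * Take a reference on the cached base, run the loop, and drop the
 * reference in one cleanup path that is reached on success and on
 * early exit alike, so the count can never end up mismatched.
 */
static int unpack_with_cached_base(struct cache_entry *cached, int steps, int fail_at)
{
	int error = 0, i;

	cached->refcount++;

	for (i = 0; i < steps; i++) {
		if ((error = apply_step(i, fail_at)) < 0)
			goto cleanup;
	}

cleanup:
	cached->refcount--;
	return error;
}
```

Routing every exit through the `cleanup` label keeps the increment and decrement paired without repeating the decrement at each error site.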
Patrick Steinhardt
34b320535b Fix potential use of uninitialized values 2016-12-12 09:16:33 +01:00
Patrick Steinhardt
0cf15e39f3 pack: fix race in pack_entry_find_offset
In `pack_entry_find_offset`, we try to find the offset of a
certain object in the pack file. To do so, we first check whether the
packfile has already been opened and open it if not. Opening the
packfile is guarded with a mutex, so concurrent access to this is
in fact safe.

What is not thread-safe though is our calculation of offsets
inside the packfile. Assume two threads calling
`pack_entry_find_offset` at the same time. We first calculate the
offset and index location and only then determine if the pack has
already been opened. If so, we re-calculate the offset and index
address.

Now the case for two threads: thread 1 first calculates the
addresses and is subsequently suspended. The second thread will
now call `pack_index_open` and initialize the pack file,
calculating its addresses correctly. When the first thread is
resumed, it will see that the pack file has already been
initialized and will happily proceed with the addresses it has
already calculated before the check. As the pack file was not
initialized before, these addresses are bogus.

Fix the issue by only calculating the addresses after having
checked if the pack file is open.
2016-11-02 12:23:12 +01:00
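The ordering this commit describes (open first, derive addresses second) might look like the sketch below. All names here (`pack_index`, `pack_index_ensure_open`, `find_entries`, `fake_index_data`) are assumptions made for illustration and do not reflect the library's real structures.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index handle; `map` stays NULL until the index is opened. */
struct pack_index {
	pthread_mutex_t lock;
	const uint8_t *map;
	size_t header_len;
};

/* Stand-in for mapped file contents: 8 header bytes, then the entries. */
static const uint8_t fake_index_data[16] = { 0, 0, 0, 0, 0, 0, 0, 0, 42 };

/* Open the index at most once; the mutex serialises concurrent callers. */
static int pack_index_ensure_open(struct pack_index *idx)
{
	pthread_mutex_lock(&idx->lock);
	if (idx->map == NULL) {
		idx->map = fake_index_data;
		idx->header_len = 8;
	}
	pthread_mutex_unlock(&idx->lock);
	return 0;
}

/*
 * Compute addresses only after the open check. Deriving them from
 * idx->map before the check would use a NULL base whenever another
 * thread opened the index in between.
 */
static const uint8_t *find_entries(struct pack_index *idx)
{
	if (pack_index_ensure_open(idx) < 0)
		return NULL;

	return idx->map + idx->header_len;
}
```

The sketch relies on the mapped data being immutable once opened, so reading `map` after the lock has been released is safe.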
Edward Thomson
6a2d2f8aa1 delta: move delta application to delta.c
Move the delta application functions into `delta.c`, next to the
similar delta creation functions.  Make the `git__delta_apply`
functions adhere to other naming and parameter style within the
library.
2016-05-26 13:01:03 -05:00
Carlos Martín Nieto
a97b769a0e odb: avoid inflating the full delta to read the header
When we read the header, we want to know the size and type of the
object. We're currently inflating the full delta in order to read the
first few bytes. This can mean hundreds of kB needlessly inflated for
large objects.

Instead use a packfile stream to read just enough so we can read the two
varints in the header and avoid inflating most of the delta.
2016-05-02 17:37:26 +02:00
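For illustration, the two varints at the start of a delta are the base size and the result size, each stored as 7 bits per byte, least significant group first, with the high bit marking continuation. The sketch below parses them from bytes that have already been inflated; it does not show the packfile stream itself, and `read_varint` and `read_delta_header` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian base-128 varint, refusing to read past `end`. */
static int read_varint(const unsigned char **cursor, const unsigned char *end, uint64_t *out)
{
	const unsigned char *p = *cursor;
	uint64_t value = 0;
	unsigned int shift = 0;
	unsigned char byte;

	do {
		if (p == end || shift >= 64)
			return -1; /* truncated or overlong input */
		byte = *p++;
		value |= (uint64_t)(byte & 0x7f) << shift;
		shift += 7;
	} while (byte & 0x80);

	*cursor = p;
	*out = value;
	return 0;
}

/* Read the base size and result size from the first bytes of a delta. */
static int read_delta_header(const unsigned char *buf, size_t len,
	uint64_t *base_size, uint64_t *result_size)
{
	const unsigned char *p = buf, *end = buf + len;

	if (read_varint(&p, end, base_size) < 0)
		return -1;
	return read_varint(&p, end, result_size);
}
```

Because only these few bytes are needed, a caller can stop inflating as soon as both values have been decoded.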
Carlos Martín Nieto
d53cc13e3a Merge pull request #3575 from pmq20/master-13jan16
Remove duplicated calls to git_mwindow_close
2016-03-31 04:12:46 -07:00
Edward Thomson
e10144ae57 odb: improved not found error messages
When looking up an abbreviated oid, show the actual (abbreviated) oid
the caller passed instead of a full (but ambiguously truncated) oid.
2016-03-07 10:20:01 -05:00
Carlos Martín Nieto
6d97beb91f pack: don't allow a negative offset 2016-02-25 15:46:59 +01:00
Carlos Martín Nieto
ea9e00cb5c pack: make sure we don't go out of bounds for extended entries
A corrupt index might have data that tells us to go look past the end of
the file for data. Catch these cases and return an appropriate error
message.
2016-02-25 15:43:17 +01:00
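A bounds check of this kind could be sketched as below. In a version 2 pack index, a 4-byte offset with its high bit set is an index into a separate table of 8-byte offsets. The function name `resolve_offset` is hypothetical, and the table is assumed to be already decoded into host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Resolve a 32-bit offset entry. If the high bit is clear the value is
 * the offset itself; otherwise the low 31 bits index the extended
 * table, and an index past its end signals a corrupt index.
 */
static int resolve_offset(uint32_t raw, const uint64_t *ext_table,
	size_t ext_count, uint64_t *out)
{
	size_t i;

	if (!(raw & 0x80000000u)) {
		*out = raw;
		return 0;
	}

	i = (size_t)(raw & 0x7fffffffu);
	if (i >= ext_count)
		return -1; /* would read past the end of the table */

	*out = ext_table[i];
	return 0;
}
```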
Patrick Steinhardt
a53d2e3985 pack: do not free passed in pointer on error
The function `git_packfile_stream_open` tries to free the passed
in stream when an error occurs. The only call site is
`git_indexer_append`, though, which passes in the address of a
stream struct which has not been allocated on the heap.

Fix the issue by simply removing the call to free. In case of an
error we did not allocate any memory yet and otherwise it should
be the caller's responsibility to manage its object's lifetime.
2016-02-09 09:58:56 +01:00
P.S.V.R
d4e4f27204 Remove duplicated calls to git_mwindow_close 2016-01-13 11:07:14 +08:00
P.S.V.R
b644e223aa Make packfile_unpack_compressed a private API 2016-01-13 11:02:38 +08:00
Stefan Widgren
c369b37919 Remove extra semicolon outside of a function
Without this change, compiling with gcc and -pedantic generates the warning:
ISO C does not allow extra ‘;’ outside of a function.
2015-07-31 16:23:11 +02:00
Carlos Martín Nieto
878293f7e1 pack: use git_buf when building the index name
The way we currently do it depends on the subtlety of strlen vs sizeof
and the fact that .pack is one longer than .idx. Let's use a git_buf so
we can express the manipulation we want much more clearly.
2015-06-10 10:44:14 +02:00
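The intent of this commit, stating the suffix swap explicitly instead of leaning on length arithmetic, can be illustrated without `git_buf`. The sketch below uses plain `snprintf`; `index_name_from_pack` is a hypothetical helper, not the function the commit touched.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<stem>.idx" from "<stem>.pack". The suffix is verified and
 * removed explicitly, so nothing depends on the two extensions having
 * particular lengths relative to each other.
 */
static int index_name_from_pack(char *out, size_t outlen, const char *pack_path)
{
	static const char pack_ext[] = ".pack";
	static const char idx_ext[] = ".idx";
	size_t len = strlen(pack_path);
	size_t ext_len = sizeof(pack_ext) - 1; /* sizeof counts the NUL */
	int written;

	if (len < ext_len || strcmp(pack_path + len - ext_len, pack_ext) != 0)
		return -1; /* not a .pack path */

	written = snprintf(out, outlen, "%.*s%s",
		(int)(len - ext_len), pack_path, idx_ext);
	if (written < 0 || (size_t)written >= outlen)
		return -1; /* output buffer too small */

	return 0;
}
```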
Edward Thomson
38c10ecd99 indexer: don't look for the index we're creating
When creating an index, know that we do not have an index for
our own packfile, preventing some unnecessary file opens and
error reporting.
2015-05-22 15:27:48 -04:00
Carlos Martín Nieto
b63b76e0b0 Reorder some khash declarations
Keep the declarations in the headers, while putting the definitions in
the C files. Putting the function definitions in headers causes
them to be duplicated if you include two headers with them.
2015-03-11 02:36:11 +01:00
Carlos Martín Nieto
5091aff782 Merge pull request #2907 from jasonhaslam/git_packfile_unpack_race
Fix race in git_packfile_unpack.
2015-02-20 08:40:40 +01:00
Jason Haslam
8588cb0cbf Fix race in git_packfile_unpack.
Increment refcount of newly added cache entries just like existing
entries looked up from the cache. Otherwise the new entry can be
evicted from the cache and destroyed while it's still in use.
2015-02-14 23:43:26 -07:00
Edward Thomson
f1453c59b2 Make our overflow check look more like gcc/clang's
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler intrinsics on platforms
that support it.  This means dropping the ability to pass `NULL` as
an out parameter.

As a result, the macros get updated to reflect this as well.
2015-02-13 09:27:33 -05:00
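A check shaped like the gcc/clang builtins returns true on overflow and writes the result through a mandatory out parameter. The sketch below shows that shape under assumed names (`add_sizet_overflow`, `mul_sizet_overflow`); the library's real macros may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if a + b overflows size_t. `out` must not be NULL, and
 * its value is only meaningful when the function returns false.
 */
static bool add_sizet_overflow(size_t *out, size_t a, size_t b)
{
#if defined(__GNUC__) && __GNUC__ >= 5
	/* Same contract, provided by the compiler. */
	return __builtin_add_overflow(a, b, out);
#else
	if (a > SIZE_MAX - b)
		return true;
	*out = a + b;
	return false;
#endif
}

/* Return true if a * b overflows size_t; same contract as above. */
static bool mul_sizet_overflow(size_t *out, size_t a, size_t b)
{
#if defined(__GNUC__) && __GNUC__ >= 5
	return __builtin_mul_overflow(a, b, out);
#else
	if (b != 0 && a > SIZE_MAX / b)
		return true;
	*out = a * b;
	return false;
#endif
}
```

Requiring a non-NULL `out` is what lets the portable fallback and the intrinsic share one signature.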
Edward Thomson
392702ee2c allocations: test for overflow of requested size
Introduce some helper macros to test integer overflow from arithmetic
and set error message appropriately.
2015-02-12 22:54:46 -05:00
Jacques Germishuys
6f73e02605 Plug some leaks 2014-12-29 18:18:49 +02:00
Ravindra Patel
ec7e680c6c Fix for misleading "missing delta bases" error - Fix #2721. 2014-11-21 15:05:34 -05:00
Pierre-Olivier Latour
ea66215d87 Removed some useless variable assignments 2014-10-27 09:19:07 -07:00
Jacques Germishuys
e640a77c9f Silence uninitialized warning 2014-09-26 12:12:08 +02:00
Arkady Shapkin
5cd81bb3d8 Several CppCat warnings fixed 2014-09-03 01:01:25 +04:00
Carlos Martín Nieto
b3d3459f32 pack: return the correct final offset
The callers of git_packfile_unpack() expect the obj_offset argument to
be set to the beginning of the next object. We were mistakenly returning
the offset of the object's data, which causes the CRC function to
try to use the wrong offset.

Set obj_offset to curpos instead of elem->offset to point to the next
element and bring back expected behaviour.
2014-08-26 15:09:47 +02:00
Carlos Martín Nieto
5e0f47c375 pack: free the new pack struct if we fail to insert
If we fail to insert the packfile in the map, make sure to free it.

This makes the free function only attempt to remove its mwindows from
the global list if we have opened the packfile to avoid accessing the
list unlocked.
2014-06-25 21:20:39 +02:00
Carlos Martín Nieto
b3b66c5793 Share packs across repository instances
Opening the same repository multiple times will currently open the same
file multiple times, as well as map the same region of the file multiple
times. This is not necessary, as the packfile data is immutable.

Instead of opening and closing packfiles directly, introduce an
indirection and allocate packfiles globally. This does mean locking on
each packfile open, but we already use this lock for the global mwindow
list so it doesn't introduce a new contention point.
2014-06-23 21:50:36 +02:00
Carlos Martín Nieto
649214be4b pack: init the cache on packfile alloc
When running multithreaded, it is not enough to check for the offmap
allocation. Move the call to cache_init() to packfile allocation so we
can be sure it is always allocated free of races.

This fixes #2355.
2014-05-15 19:59:05 +02:00
Carlos Martín Nieto
c968ce2c2c pack: don't forget to cache the base object
The base object is a good cache candidate, so we shouldn't forget to add
it to the cache.
2014-05-13 02:48:52 +02:00
Carlos Martín Nieto
15bcced223 pack: use stack allocation for smaller delta chains
This avoids allocating the array on the heap for relatively small
chains. The expected performance increase is sadly not really
noticeable.
2014-05-13 02:48:52 +02:00
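One common way to get stack storage for short chains with a heap fallback for long ones is sketched below. The `chain` type and its functions are hypothetical, and the 64-element size is borrowed from the related preallocation commit in this log rather than from this one.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_CHAIN_SIZE 64

/*
 * Growable array with inline storage. `items` points at `small` until
 * the chain outgrows it, so the struct must not be copied after init.
 */
struct chain {
	size_t *items;
	size_t len, cap;
	size_t small[SMALL_CHAIN_SIZE];
};

static void chain_init(struct chain *c)
{
	c->items = c->small;
	c->len = 0;
	c->cap = SMALL_CHAIN_SIZE;
}

static int chain_push(struct chain *c, size_t value)
{
	if (c->len == c->cap) {
		size_t new_cap = c->cap * 2;
		size_t *grown;

		if (new_cap > SIZE_MAX / sizeof(*grown))
			return -1;

		if (c->items == c->small) {
			/* First spill: move the inline elements to the heap. */
			grown = malloc(new_cap * sizeof(*grown));
			if (grown)
				memcpy(grown, c->small, c->len * sizeof(*grown));
		} else {
			grown = realloc(c->items, new_cap * sizeof(*grown));
		}
		if (!grown)
			return -1;

		c->items = grown;
		c->cap = new_cap;
	}

	c->items[c->len++] = value;
	return 0;
}

static void chain_free(struct chain *c)
{
	if (c->items != c->small)
		free(c->items);
}
```

Chains of up to 64 elements never touch the allocator, which is the saving the commit is after.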
Carlos Martín Nieto
a3ffbf230e pack: expose a cached delta base directly
Instead of going through a special entry in the chain, let's pass it as
an output parameter.
2014-05-13 02:48:48 +02:00
Carlos Martín Nieto
9dbd150f5f pack: simplify delta chain code
The switch makes the loop somewhat unwieldy. Let's assume it's fine and
perform the check when we're accessing the data.

This makes our code look a lot more like git's.
2014-05-09 09:59:24 +02:00
Carlos Martín Nieto
b2559f477a pack: preallocate a 64-element chain
Dependency chains are often large and require a few
reallocations. Allocate a 64-element chain before doing anything else to
avoid allocations during the loop.

This value comes from the stack-allocated one git uses. We still
allocate this on the heap, but it does help performance a little bit.
2014-05-09 09:40:29 +02:00
Carlos Martín Nieto
e6d10c58b5 pack: make sure not to leak the dep chain 2014-05-09 09:40:29 +02:00
Carlos Martín Nieto
a332e91c92 pack: use a cache for delta bases when unpacking
Bring back the use of the delta base cache for unpacking objects. When
generating the delta chain, we stop when we find a delta base in the
pack's cache and use that as the starting point.
2014-05-09 09:40:29 +02:00
Carlos Martín Nieto
2acdf4b854 pack: unpack using a loop
We currently make use of recursive function calls to unpack an object,
resolving the deltas as we come back down the chain. This means that we
have unbounded stack growth as we look up objects in a pack.

This is now done in two steps: first we figure out what the dependency
chain is by looking up the delta bases until we reach a non-delta
object, pushing the information we need onto a stack and then we pop
from that stack and apply the deltas until there are no more left.

This version of the code does not make use of the delta base cache so it
is slower than what's in the mainline. A later commit will reintroduce
it.
2014-05-09 09:40:29 +02:00
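The two steps described in this commit can be shown on a toy model in which a "delta" merely adds a number to its base. Everything here (`toy_object`, `toy_unpack`, `MAX_CHAIN`) is invented for illustration; real delta application and object lookup are far more involved.

```c
#include <stddef.h>

#define MAX_CHAIN 64

/*
 * Toy object: base < 0 marks a non-delta object whose `value` is its
 * content; otherwise `base` is the index of the delta base and `value`
 * is the amount the delta adds to it.
 */
struct toy_object {
	int base;
	int value;
};

static int toy_unpack(const struct toy_object *objs, size_t nobjs,
	size_t start, int *out)
{
	size_t stack[MAX_CHAIN];
	size_t depth = 0, cur = start;
	int result;

	/* Step 1: follow delta bases to a non-delta object, pushing each delta. */
	for (;;) {
		if (cur >= nobjs)
			return -1; /* dangling base reference */
		if (objs[cur].base < 0)
			break;
		if (depth == MAX_CHAIN)
			return -1; /* chain too long, or a cycle */
		stack[depth++] = cur;
		cur = (size_t)objs[cur].base;
	}

	/* Step 2: pop the deltas and apply them on top of the base. */
	result = objs[cur].value;
	while (depth > 0)
		result += objs[stack[--depth]].value;

	*out = result;
	return 0;
}
```

The explicit stack keeps memory use bounded and visible, where the recursive form grew the call stack with every link in the chain.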
Carlos Martín Nieto
ae0817393c pack: do not repeat the same error message four times
Repeating this error message makes it harder to find out where we
actually are finding the error, and they don't really describe what
we're trying to do.
2014-05-09 09:40:29 +02:00
Carlos Martín Nieto
86d5810b82 pack: remove misleading comment 2014-05-09 09:40:29 +02:00
Linquize
8610487cd3 Drop parsing pack filename SHA1 part, no one cares about the filename 2014-01-23 23:28:28 +08:00