Commit Graph

77 Commits

Author SHA1 Message Date
Linquize
8610487cd3 Drop parsing pack filename SHA1 part, no one cares the filename 2014-01-23 23:28:28 +08:00
Russell Belfer
26c1cb91be One more rename/cleanup for callback err functions 2013-12-11 10:57:50 -08:00
Russell Belfer
c7b3e1b320 Some callback error check style cleanups
I find this easier to read...
2013-12-11 10:57:50 -08:00
Russell Belfer
25e0b1576d Remove converting user error to GIT_EUSER
This changes the behavior of callbacks so that the callback error
code is not converted into GIT_EUSER and instead we propagate the
return value through to the caller.  Instead of using the
giterr_capture and giterr_restore functions, we now rely on all
functions to pass back the return value from a callback.

To avoid having a return value with no error message, the user
can call the public giterr_set_str or some such function to set
an error message.  There is a new helper 'giterr_set_callback'
that functions can invoke after making a callback which ensures
that some error message was set in case the callback did not set
one.

In places where the sign of the callback return value is
meaningful (e.g. positive to skip, negative to abort), only the
negative values are returned back to the caller, obviously, since
the other values allow for continuing the loop.

The hardest parts of this were in the checkout code where positive
return values were overloaded as meaningful values for checkout.
I fixed this by adding an output parameter to many of the internal
checkout functions and removing the overload.  This added some
code, but it is probably a better implementation.

There is some funkiness in the network code where user provided
callbacks could be returning a positive or a negative value and
we want to rely on that to cancel the loop.  There are still a
couple places where an user error might get turned into GIT_EUSER
there, I think, though none exercised by the tests.
2013-12-11 10:57:49 -08:00
Russell Belfer
dab89f9b68 Further EUSER and error propagation fixes
This continues auditing all the places where GIT_EUSER is being
returned and making sure to clear any existing error using the
new giterr_user_cancel helper.  As a result, places that relied
on intercepting GIT_EUSER but having the old error preserved also
needed to be cleaned up to correctly stash and then retrieve the
actual error.

Additionally, as I encountered places where error codes were not
being propagated correctly, I tried to fix them up.  A number of
those fixes are included in the this commit as well.
2013-12-11 10:57:49 -08:00
Vicent Marti
51a3dfb595 pack: __object_header always returns unsigned values 2013-11-01 17:36:09 +01:00
Linquize
3343b5ffd3 Fix warning on win64 2013-11-01 17:36:04 +01:00
Carlos Martín Nieto
51e82492ef pack: move the object header function here 2013-10-04 10:18:20 +02:00
Vicent Marti
67591c8cd8 sha1_lookup: do not use the "experimental" lookup mode 2013-08-14 10:28:01 +02:00
Sven Strickroth
3a2d48d5ee Close p->mwf.fd only if necessary
This fixes a regression introduced in revision 9d2f841a5d.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
2013-07-25 15:21:55 +02:00
Rémi Duraffort
050af8bbe0 pack: fix memory leak in error path 2013-07-15 16:29:13 +02:00
Russell Belfer
1a42dd17eb Mutex init can fail
It is obviously quite a serious problem if this happens, but mutex
initialization can fail and we should detect it.  It's a bit like
a memory allocation failure, in that you're probably pretty screwed
if this occurs, but at least we'll catch it.
2013-05-31 14:13:11 -07:00
Russell Belfer
f658dc433c Zero memory for major objects before freeing
By zeroing out the memory when we free larger objects (i.e. those
that serve as collections of other data, such as repos, odb, refdb),
I'm hoping that it will be easier for libgit2 bindings to find
errors in their object management code.
2013-05-31 14:09:58 -07:00
Carlos Martín Nieto
0ddfcb40d5 Switch to index_version as "git_pack_file is ready" flag
We use p->index_map.data to check whether the struct has been set up
and all the information about the index is stored there. This variable
gets set up halfway through the setup process, however, and a thread
can come along and use fields that haven't been written to yet.

Crucially, pack_entry_find_offset() needs to read the index version
(which is written after index_map) to know the offset and stride
length to pass to sha1_entry_pos(). If these values are wrong,
assertions in it will fail, as it will be reading bogus data.

Make index_version the last field to be written and switch from using
p->index_map.data to p->index_version as "git_pack_file is ready" flag
as we can use it to know if every field has been written.
2013-05-02 18:27:02 +02:00
Carlos Martín Nieto
34bd59992e Revert "Protect sha1_entry_pos call with mutex"
This reverts commit 8c535f3f68.
2013-05-02 17:14:05 +02:00
Russell Belfer
8c535f3f68 Protect sha1_entry_pos call with mutex
There is an occasional assertion failure in sha1_entry_pos from
pack_entry_find_index when running threaded.  Holding the mutex
around the code that grabs the index_map data and processes it
makes this assertion failure go away.
2013-05-02 03:34:56 -07:00
Russell Belfer
9d2f841a5d Add extra locking around packfile open
We were still seeing a few issues in threaded access to packs.
This adds extra locks around the opening of the mwindow to
avoid a different race.
2013-05-02 03:03:54 -07:00
Russell Belfer
b7f167da29 Make git_oid_cmp public and add git_oid__cmp 2013-04-29 13:52:12 -07:00
Russell Belfer
5d2d21e536 Consolidate packfile allocation further
Rename git_packfile_check to git_packfile_alloc since it is now
being used more in that capacity.  Fix the various places that use
it.  Consolidate some repeated code in odb_pack.c related to the
allocation of a new pack_backend.
2013-04-22 16:52:07 +02:00
Russell Belfer
38eef6113d Make indexer use shared packfile open code
The indexer was creating a packfile object separately from the
code in pack.c which was a problem since I put a call to
git_mutex_init into just pack.c.  This commit updates the pack
function for creating a new pack object (i.e. git_packfile_check())
so that it can be used in both places and then makes indexer.c
use the shared initialization routine.

There are also a few minor formatting and warning message fixes.
2013-04-22 16:52:07 +02:00
Russell Belfer
5360786885 Further threading fixes
This builds on the earlier thread safety work to make it so that
setting the odb, index, refdb, or config for a repository is done
in a threadsafe manner with minimized locking time.  This is done
by adding a lock to the repository object and using it to guard
the assignment of the above listed pointers.  The lock is only
held to assign the pointer value.

This also contains some minor fixes to the other work with pack
files to reduce the time that locks are being held to and fix an
apparently memory leak.
2013-04-22 16:52:07 +02:00
Russell Belfer
24c70804e8 Add mutex around mapping and unmapping pack files
When I was writing threading tests for the new cache, the main
error I kept running into was a pack file having it's content
unmapped underneath the running thread.  This adds a lock around
the routines that map and unmap the pack data so that threads can
effectively reload the data when they need it.

This also required reworking the error handling paths in a couple
places in the code which I tried to make consistent.
2013-04-22 16:50:51 +02:00
Carlos Martín Nieto
0e040c031e indexer: use a hashtable for keeping track of offsets
These offsets are needed for REF_DELTA objects, which encode which
object they use as a base, but not where it lies in the packfile, so
we need a list.

These objects are mostly from older packfiles, before OFS_DELTA was
widely spread. The time spent in indexing these packfiles is greatly
reduced, though remains above what git is able to do.
2013-03-03 23:18:29 +01:00
Philip Kelley
11d9f6b304 Vector improvements and their fallout 2013-01-27 14:17:07 -05:00
Philip Kelley
aa3bf89df2 Fix a mutex leak in pack.c 2013-01-26 15:12:53 -05:00
Carlos Martín Nieto
9c62aaab40 pack: evict all of the pages at once
Somewhat surprisingly, this can increase the speed considerably, as we
don't bother trying to decide what to evict, and the most used entries
are quickly back into the cache.
2013-01-14 18:10:56 +01:00
Carlos Martín Nieto
ed6648ba46 pack: evict objects from the cache in groups of eight
This drops the cache eviction below libcrypto and zlib in the perf
output. The number has been chosen empirically.
2013-01-14 17:16:45 +01:00
Carlos Martín Nieto
09e29e47b3 pack: fixes to the cache
The offset should be git_off_t, and we should check the return value
of the mutex lock function.
2013-01-12 19:41:30 +01:00
Carlos Martín Nieto
96c9b9f0e5 indexer: properly free the packfile resources
The indexer needs to call the packfile's free function so it takes care of
freeing the caches.

We still need to close the mwf descriptor manually so we can rename the
packfile into its final name on Windows.
2013-01-12 18:44:58 +01:00
Carlos Martín Nieto
80d647adc3 Revert "pack: packfile_free -> git_packfile_free and use it in the indexers"
This reverts commit f289f886cb, which
makes the tests fail on Windows. Revert until we can figure out a
solution.
2013-01-11 20:17:21 +01:00
nulltoken
090d5e1fda Fix MSVC compilation warnings 2013-01-11 19:30:59 +01:00
Carlos Martín Nieto
f289f886cb pack: packfile_free -> git_packfile_free and use it in the indexers
It turns out the indexers have been ignoring the pack's free function
and leaking data. Plug that.
2013-01-11 17:33:00 +01:00
Carlos Martín Nieto
0ed7562006 pack: limit the amount of memory the base delta cache can use
Currently limited to 16MB (like git) and to objects up to 1MB in
size.
2013-01-11 17:32:59 +01:00
Carlos Martín Nieto
c8f79c2bdf pack: abstract out the cache into its own functions 2013-01-11 17:32:59 +01:00
Carlos Martín Nieto
525d961c24 pack: refcount entries and add a mutex around cache access 2013-01-11 16:55:37 +01:00
Carlos Martín Nieto
c0f4a0118d pack: introduce a delta base cache
Many delta bases are re-used. Cache them to avoid inflating the same
data repeatedly.

This version doesn't limit the amount of entries to store, so it can
end up using a considerable amound of memory.
2013-01-11 16:55:37 +01:00
Edward Thomson
359fc2d241 update copyrights 2013-01-08 17:31:27 -06:00
Vicent Martí
0249a5032e Merge pull request #1091 from carlosmn/stream-object
Indexer speedup with large objects
2012-12-07 09:40:21 -08:00
David Michael Barr
44f9f54797 pack: add git_packfile_resolve_header
To paraphrase @peff:

You can get both size and type from a packed object reasonably cheaply.
If you have:

* An object that is not a delta; both type and size are available in the
  packfile header.
* An object that is a delta. The packfile type will be OBJ_*_DELTA, and
  you have to resolve back to the base to find the real type. That means
  potentially a lot of packfile index lookups, but each one is
  relatively cheap. For the size, you inflate the first few bytes of the
  delta, whose header will tell you the resulting size of applying the
  delta to the base.

For simplicity, we just decompress the whole delta for now.
2012-12-03 10:39:17 +11:00
Carlos Martín Nieto
46635339e9 pack: introduce a streaming API for raw objects
This allows us to take objects from the packfile as a stream instead
of having to keep it all in memory.
2012-11-30 15:55:23 +01:00
Russell Belfer
c3fb7d04ed Make git_odb_foreach_cb take const param
This makes the first OID param of the ODB callback a const pointer
and also propogates that change all the way to the backends.
2012-11-27 15:00:49 -08:00
Sven Strickroth
fcb48e0680 Set p->mwf.fd to -1 on error
If p->mwf.fd is e.g. -2 then it is closed in packfile_free and an exception might be thrown.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
2012-11-24 15:50:51 +01:00
Martin Woodward
826bc4a81b Remove use of English expletives
Remove words such as fuck, crap, shit etc.
Remove other potentially offensive words from comments.
Tidy up other geopolicital terms in comments.
2012-11-23 13:31:22 +00:00
David Michael Barr
60ecdf59d3 pack: iterate objects in offset order
Compute the ordering on demand and persist until the index is freed.
2012-09-14 15:52:41 +10:00
Vicent Marti
51e1d80846 Merge remote-tracking branch 'arrbee/tree-walk-fixes' into development
Conflicts:
	src/notes.c
	src/transports/git.c
	src/transports/http.c
	src/transports/local.c
	tests-clar/odb/foreach.c
2012-08-06 12:41:08 +02:00
Russell Belfer
5dca201072 Update iterators for consistency across library
This updates all the `foreach()` type functions across the library
that take callbacks from the user to have a consistent behavior.
The rules are:

* A callback terminates the loop by returning any non-zero value
* Once the callback returns non-zero, it will not be called again
  (i.e. the loop stops all iteration regardless of state)
* If the callback returns non-zero, the parent fn returns GIT_EUSER
* Although the parent returns GIT_EUSER, no error will be set in
  the library and `giterr_last()` will return NULL if called.

This commit makes those changes across the library and adds tests
for most of the iteration APIs to make sure that they follow the
above rules.
2012-08-03 17:08:01 -07:00
nulltoken
b8457baae2 portability: Improve x86/amd64 compatibility 2012-07-24 16:10:12 +02:00
Carlos Martín Nieto
521aedad30 odb: add git_odb_foreach()
Go through each backend and list every objects that exists in
them. This allows fsck-like uses.
2012-07-03 12:50:51 +02:00
Carlos Martin Nieto
1d8943c640 mwindow: allow memory-window files to deregister
Once a file is registered, there is no way to deregister it, even
after the structure that contains it is no longer needed and has been
freed. This may be the source of #624.

Allow and use the deregister function to remove our file from the
global list.
2012-06-28 12:10:33 +02:00
Chris Young
2aeadb9c78 Actually do the mmap... unsurprisingly, this makes the indexer work on SFS
On RAM: the .idx and .pack files become links to a .lock and the original download respectively.
Assume some feature (such as record locking) supported by SFS but not JXFS or RAM: is required.
2012-06-12 19:25:09 +01:00