99 Commits

Author SHA1 Message Date
Jakob Pfender
c0cd9d506b revwalk.c: Move to new error handling mechanism 2011-05-23 21:38:35 +03:00
Vicent Marti
5ca2f58057 Do not set error message on GIT_EREVWALKOVER
This is not really an error, just a special return code to mark the end
of an iteration.
2011-05-15 23:48:05 +03:00
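The idea behind commit 5ca2f58057 (a return code that marks the end of an iteration without counting as an error) can be sketched as follows. This is a minimal illustration, not libgit2 code: `struct walker`, `walker_next`, `walker_sum` and the `WALK_*` codes are hypothetical names invented for the example; the library's real constant is `GIT_EREVWALKOVER`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codes for this sketch only. */
enum { WALK_OK = 0, WALK_ERROR = -1, WALK_OVER = -2 };

struct walker {
	const int *items;
	size_t count, pos;
	const char *last_error; /* set only for genuine failures */
};

/* Yields the next item, or WALK_OVER once the items are exhausted. */
int walker_next(int *out, struct walker *w)
{
	assert(w != NULL);
	if (out == NULL) {
		w->last_error = "output pointer is NULL";
		return WALK_ERROR;
	}
	if (w->pos >= w->count)
		return WALK_OVER; /* normal termination: no message recorded */
	*out = w->items[w->pos++];
	return WALK_OK;
}

/* Sums all items, translating the end marker into success. */
int walker_sum(int *sum, struct walker *w)
{
	int value, rc;

	*sum = 0;
	while ((rc = walker_next(&value, w)) == WALK_OK)
		*sum += value;
	return rc == WALK_OVER ? WALK_OK : rc;
}
```

A caller loops while the return value is `WALK_OK` and inspects `last_error` only when the loop ends with something other than `WALK_OVER`.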
schu
b51c92693d Move revwalk.c to the new error handling
Signed-off-by: schu <schu-github@schulog.org>
2011-05-11 14:44:44 +02:00
schu
402a47a7fa Fix -Wunused-but-set-variable warnings
As of gcc 4.6 -Wall includes -Wunused-but-set-variable. Use GIT_UNUSED
or remove actually unused variables to prevent those warnings.
2011-04-26 11:29:05 +02:00
Vicent Marti
c6e65acae6 Properly check strtol for errors
We are now using a custom `strtol` implementation to make sure we're not
missing any overflow errors.
2011-04-09 15:22:11 -07:00
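As an illustration of the overflow checking mentioned in commit c6e65acae6, here is a minimal sketch of a base-10 parser that rejects values outside the range of `long`. It is not the library's implementation; `parse_long_checked` is a hypothetical name, and the sketch assumes decimal input only.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: parse a base-10 long.
 * Returns 0 on success, -1 if no digits were found or the value overflows. */
int parse_long_checked(long *out, const char *s, const char **end)
{
	unsigned long acc = 0, limit;
	int neg = 0, ndigits = 0;

	assert(out != NULL && s != NULL);

	while (isspace((unsigned char)*s))
		s++;
	if (*s == '-' || *s == '+')
		neg = (*s++ == '-');

	/* Largest magnitude that fits: LONG_MAX, or LONG_MAX + 1 when negative. */
	limit = neg ? (unsigned long)LONG_MAX + 1UL : (unsigned long)LONG_MAX;

	for (; isdigit((unsigned char)*s); s++, ndigits++) {
		unsigned long d = (unsigned long)(*s - '0');

		/* Check before multiplying so that acc itself never wraps. */
		if (acc > (limit - d) / 10)
			return -1;
		acc = acc * 10 + d;
	}
	if (ndigits == 0)
		return -1;

	/* Negate without ever forming a value larger than LONG_MAX. */
	if (neg && acc != 0)
		*out = -(long)(acc - 1) - 1;
	else
		*out = (long)acc;
	if (end != NULL)
		*end = s;
	return 0;
}
```

Unlike plain `strtol`, which signals overflow through `errno` and a clamped return value, this variant reports failure directly in its return code, so a caller cannot silently miss it.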
Vicent Marti
21d73e7195 Always free the parents of a revwalk commit
Thanks to Carlos Martín Nieto for spotting this.
2011-03-22 20:38:36 +02:00
Vicent Marti
72a3fe42fb I broke your bindings
Hey. Apologies in advance -- I broke your bindings.

This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.

Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!

Major features include:

	- Real caching and refcounting on parsed objects
	- Real caching and refcounting on objects read from the ODB
	- Streaming writes & reads from the ODB
	- Single-method writes for all object types
	- The external API is now partially thread-safe

The speed increases are significant in all aspects, especially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.

Here's a full changelog for the external API:

blob.h
------

	- Remove `git_blob_new`
	- Remove `git_blob_set_rawcontent`
	- Remove `git_blob_set_rawcontent_fromfile`
	- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
	- Change `git_blob_create_fromfile`:
		The `path` argument is now relative to the repository's working dir
	- Add `git_blob_create_frombuffer`

commit.h
--------

	- Remove `git_commit_new`
	- Remove `git_commit_add_parent`
	- Remove `git_commit_set_message`
	- Remove `git_commit_set_committer`
	- Remove `git_commit_set_author`
	- Remove `git_commit_set_tree`

	- Add `git_commit_create`
	- Add `git_commit_create_v`
	- Add `git_commit_create_o`
	- Add `git_commit_create_ov`

tag.h
-----

	- Remove `git_tag_new`
	- Remove `git_tag_set_target`
	- Remove `git_tag_set_name`
	- Remove `git_tag_set_tagger`
	- Remove `git_tag_set_message`

	- Add `git_tag_create`
	- Add `git_tag_create_o`

tree.h
------

	- Change `git_tree_entry_2object`:
		New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`

	- Remove `git_tree_new`
	- Remove `git_tree_add_entry`
	- Remove `git_tree_remove_entry_byindex`
	- Remove `git_tree_remove_entry_byname`
	- Remove `git_tree_clearentries`
	- Remove `git_tree_entry_set_id`
	- Remove `git_tree_entry_set_name`
	- Remove `git_tree_entry_set_attributes`

object.h
--------

	- Remove `git_object_new`
	- Remove `git_object_write`

	- Change `git_object_close`:
		This method is now *mandatory*. Not closing an object causes a
		memory leak.

odb.h
-----

	- Remove type `git_rawobj`
	- Remove `git_rawobj_close`
	- Rename `git_rawobj_hash` -> `git_odb_hash`
	- Change `git_odb_hash`:
		New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`

	- Add type `git_odb_object`
	- Add `git_odb_object_close`

	- Change `git_odb_read`:
		New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
	- Change `git_odb_read_header`:
		New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
	- Remove `git_odb_write`
	- Add `git_odb_open_wstream`
	- Add `git_odb_open_rstream`

odb_backend.h
-------------

	- Change type `git_odb_backend`:
		New internal signatures are as follows

			int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
			int (* readstream)(struct git_odb_stream **, struct git_odb_backend *, const git_oid *)

	- Add type `git_odb_stream`
	- Add enum `git_odb_streammode`

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-20 21:45:11 +02:00
Vicent Marti
b5c5f0f808 Fix headers for the new Revision Walker
The "oid.h" header is now included instead of "object.h".

The old "revwalk.h" header has been removed; it was empty.
2011-03-16 23:59:09 +02:00
Vicent Marti
36aaf1ff1a Change the Revwalk reset behavior to the old version
The `reset` call now removes the pushed commits so we can reuse
the revwalker. The API documentation has been updated with the details.
2011-03-16 01:53:25 +02:00
Vicent Marti
36b3132966 Properly free a commit list in revwalk
The commit list was not being properly free'd when a walk was stopped
halfway through.
2011-03-16 01:04:17 +02:00
Vicent Marti
71db842fac Rewrite the Revision Walker
The new revision walker uses an internal Commit object storage system,
custom memory allocator and much improved topological and time sorting
algorithms. It's about 20 times faster than the previous implementation
when browsing big repositories.

The following external API calls have changed:

	`git_revwalk_next` returns an OID instead of a full commit object.
	The initial call to `git_revwalk_next` is no longer blocking when
	iterating through a repo with a time-sorting mode.

	Iterating with Topological or inverted modes still makes the initial
	call blocking to preprocess the commit list, but this block should be
	mostly unnoticeable on most repositories (topological preprocessing
	times at 0.3s on the git.git repo).

	`git_revwalk_push` and `git_revwalk_hide` now take an OID instead
	of a full commit object.
2011-03-14 23:52:15 +02:00
Vicent Marti
f335b42c72 Fix segmentation fault when freeing a repository
Disable garbage collection of cross-references to prevent
double-freeing. Internal reference management is now done
with a separate method.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-05 02:05:26 +02:00
Vicent Marti
48c27f86bb Implement reference counting for git_objects
All `git_object` instances looked up from the repository are reference
counted. Users are expected to call the new `git_object_close` when an
object is no longer needed to force freeing it.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-03 20:23:52 +02:00
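The reference-counting scheme described in commit 48c27f86bb can be sketched in a few lines. This is only an illustration under stated assumptions: `struct rc_object` and the `rc_object_*` functions are hypothetical stand-ins, not the library's `git_object` or `git_object_close`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type for this sketch; not libgit2's git_object. */
struct rc_object {
	unsigned int refcount;
	int payload;
};

/* Creates an object owned by one caller (refcount 1), or NULL on failure. */
struct rc_object *rc_object_new(int payload)
{
	struct rc_object *obj = malloc(sizeof(*obj));

	if (obj == NULL)
		return NULL;
	obj->refcount = 1;
	obj->payload = payload;
	return obj;
}

/* Hands out another reference, as a cached lookup would. */
struct rc_object *rc_object_acquire(struct rc_object *obj)
{
	assert(obj != NULL && obj->refcount > 0);
	obj->refcount++;
	return obj;
}

/* Drops one reference; returns 1 if the object was freed, 0 otherwise. */
int rc_object_close(struct rc_object *obj)
{
	if (obj == NULL)
		return 0;
	assert(obj->refcount > 0);
	if (--obj->refcount > 0)
		return 0;
	free(obj);
	return 1;
}
```

Every lookup that returns the object pairs with exactly one close call; the memory is released only when the last holder closes it.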
Vicent Marti
fc658755bf Rewrite git_hashtable internals
The old hash table with chained buckets has been replaced by a new one
using Cuckoo hashing, which offers guaranteed constant lookup times.
This should improve speeds on most use cases, since hash tables in
libgit2 are usually used as caches where the objects are stored once and
queried several times.

The Cuckoo hash implementation is based off the one in the Basekit
library [1] for the IO language, but rewritten to support an arbitrary
number of hashes. We currently use 3 to maximize the usage of the nodes pool.

[1]: https://github.com/stevedekorte/basekit/blob/master/source/CHash.c

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-02-22 21:59:36 +02:00
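To make the Cuckoo hashing idea from commit fc658755bf concrete, the following is a minimal sketch with three hash functions over a fixed 16-slot table. It is not the `git_hashtable` code: all `ck_*` names, the table size, the multipliers and the eviction limit are assumptions chosen for the example, and the sketch does not grow or rehash the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CK_BITS 4
#define CK_SLOTS (1u << CK_BITS) /* 16 slots, fixed for the sketch */
#define CK_HASHES 3
#define CK_MAX_KICKS 32

struct ck_slot { uint32_t key; int value; int used; };
struct ck_table { struct ck_slot slots[CK_SLOTS]; };

/* i-th candidate slot: multiplicative hash keeping the top CK_BITS bits. */
static size_t ck_index(uint32_t key, int i)
{
	static const uint32_t mult[CK_HASHES] = {
		2654435761u, 2246822519u, 3266489917u /* arbitrary odd constants */
	};
	return (size_t)((uint32_t)(key * mult[i]) >> (32 - CK_BITS));
}

void ck_init(struct ck_table *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

/* Lookup probes at most CK_HASHES slots, hence constant time. */
static struct ck_slot *ck_find(struct ck_table *t, uint32_t key)
{
	int i;

	for (i = 0; i < CK_HASHES; i++) {
		struct ck_slot *s = &t->slots[ck_index(key, i)];
		if (s->used && s->key == key)
			return s;
	}
	return NULL;
}

int ck_get(struct ck_table *t, uint32_t key, int *value)
{
	struct ck_slot *s = ck_find(t, key);

	if (s == NULL)
		return -1;
	*value = s->value;
	return 0;
}

int ck_put(struct ck_table *t, uint32_t key, int value)
{
	struct ck_slot cur = { key, value, 1 };
	struct ck_slot *s = ck_find(t, key);
	int i, kick;

	if (s != NULL) { /* key already present: update in place */
		s->value = value;
		return 0;
	}
	for (kick = 0; kick < CK_MAX_KICKS; kick++) {
		struct ck_slot tmp;

		for (i = 0; i < CK_HASHES; i++) {
			s = &t->slots[ck_index(cur.key, i)];
			if (!s->used) {
				*s = cur;
				return 0;
			}
		}
		/* All candidates taken: evict one occupant and re-place it. */
		s = &t->slots[ck_index(cur.key, kick % CK_HASHES)];
		tmp = *s;
		*s = cur;
		cur = tmp;
	}
	/* `cur` is a displaced entry that is dropped here; a real table
	 * would grow and rehash instead of giving up. */
	return -1;
}
```

The point of the design is that a read never scans a chain: whatever the insertion history, a key can live in only one of its three candidate slots.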
Vicent Marti
cb77ad0d4e Fix segfault when iterating a revlist backwards
The `prev` and `next` pointers were not being updated after popping one
of the list elements.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-02-18 12:23:53 +02:00
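The class of bug fixed in commit cb77ad0d4e (stale `prev`/`next` pointers after a pop) is easy to illustrate with a small doubly-linked list. The sketch below is not the revlist code; `struct node`, `struct list` and the `list_*` functions are hypothetical names for the example.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *prev, *next; int value; };
struct list { struct node *head, *tail; size_t size; };

void list_push_back(struct list *l, struct node *n)
{
	assert(l != NULL && n != NULL);
	n->next = NULL;
	n->prev = l->tail;
	if (l->tail != NULL)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
	l->size++;
}

struct node *list_pop_front(struct list *l)
{
	struct node *n = l->head;

	if (n == NULL)
		return NULL;
	l->head = n->next;
	if (l->head != NULL)
		l->head->prev = NULL; /* new head must not point at the popped node */
	else
		l->tail = NULL;       /* list is now empty */
	n->prev = n->next = NULL;
	l->size--;
	return n;
}

struct node *list_pop_back(struct list *l)
{
	struct node *n = l->tail;

	if (n == NULL)
		return NULL;
	l->tail = n->prev;
	if (l->tail != NULL)
		l->tail->next = NULL; /* new tail must not point at the popped node */
	else
		l->head = NULL;       /* list is now empty */
	n->prev = n->next = NULL;
	l->size--;
	return n;
}
```

Omitting either of the two pointer resets leaves a dangling link that is harmless when walking in one direction and crashes when walking in the other, which matches the symptom in the commit title.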
Vicent Marti
c836c332f1 Make more methods return error codes
git_revwalk_next now returns an error code when the iteration is over.
git_repository_index now returns an error code when the index file could
not be opened.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-02-05 09:29:37 +02:00
Vicent Marti
b5ced41e85 Merge branch 'timezone' 2010-12-18 02:35:45 +02:00
Vicent Marti
638c2ca428 Rename 'git_person' to 'git_signature'
The new signature struct is public, and contains information about the
timezone offset. Must be free'd manually by the user.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-18 02:35:33 +02:00
Vicent Marti
1f080e2da4 Fix initialization & freeing of inexistent repos
Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-13 03:43:56 +02:00
Vicent Marti
eec9523513 Commit parents now use the common 'vector' code
No more linked lists, no more O(n) access.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-02 04:58:22 +02:00
Vicent Marti
1795f87952 Improve error handling
All initialization functions now return error codes instead of pointers.
Error codes are now properly propagated on most functions. Several new
and more specific error codes have been added in common.h

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-11-05 03:20:17 +02:00
Vicent Marti
a13bc8e74f Add getter methods for object owners
You can now access the owning repository of any existing object, or the
repository on which a revision walker is working.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-10-29 02:22:38 +03:00
Vicent Marti
0c3596f18a Add setter methods & write support for git_commit
All the required git_commit_set_XXX methods have been implemented; all
the attributes of a commit object can now be modified in-memory.

The new method git_object_write() automatically writes back the
in-memory changes of any object to the repository. So far it only
supports git_commit objects.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-09-20 02:04:06 +03:00
Vicent Marti
3315782cb4 Redesigned the walking/object lookup interface
The old 'git_revpool' object has been removed and
split into two distinct objects with separate
functionality, in order to have separate methods for
object management and object walking.

*	A new object 'git_repository' does the high-level
	management of a repository's objects (commits, trees,
	tags, etc) on top of a 'git_odb'.

	Eventually, it will also manage other repository
	attributes (e.g. tag resolution, references, etc).

	See: src/git/repository.h

*	A new external method
		'git_repository_lookup(repo, oid, type)'
	has been added to the 'git_repository' API.

	All object lookups (git_XXX_lookup()) are now
	wrappers to this method, and duplicated code
	has been removed. The method does automatic type
	checking and returns a generic 'git_revpool_object'
	that can be cast to any specific object.

	See: src/git/repository.h

*	The external methods for object parsing of repository
	objects (git_XXX_parse()) have been removed.

	Loading objects from the repository is now managed
	through the 'lookup' functions. These objects are
	loaded with minimal information, and the relevant
	parsing is done automatically when the user requests
	any of the parsed attributes through accessor methods.

	An attribute has been added to 'git_repository' in
	order to force the parsing of all the repository objects
	immediately after lookup.

	See: src/git/commit.h
	See: src/git/tag.h
	See: src/git/tree.h

*	The previous walking functionality of the revpool
	is now found in 'git_revwalk', which does the actual
	revision walking on a repository; the attributes
	when walking through commits in a database have been
	decoupled from the actual commit objects.
	This increases performance when accessing commits
	during the walk and allows several
	'git_revwalk' instances to work at the same time on
	top of the same repository, without having to load
	commits in memory several times.

	See: src/git/revwalk.h

*	The old 'git_revpool_table' has been renamed to
	'git_hashtable' and now works as a generic hashtable
	with support for any kind of object and custom hash
	functions.

	See: src/hashtable.h

*	All the relevant unit tests have been updated, renamed
	and grouped accordingly.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-08-12 18:48:55 +02:00
Vicent Marti
52f2390b43 Add external API to access detailed commit attributes
The following new external methods have been added:

GIT_EXTERN(const char *) git_commit_message_short(git_commit *commit);
GIT_EXTERN(const char *) git_commit_message(git_commit *commit);
GIT_EXTERN(time_t) git_commit_time(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_committer(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_author(git_commit *commit);
GIT_EXTERN(const git_tree *) git_commit_tree(git_commit *commit);

A new structure, git_commit_person has been added to represent a
commit's author or committer.

The parsing of a commit has been split in two phases.
When adding a commit to the revision pool:
	- the commit's ODB object is opened
	- its raw contents are parsed for commit TIME, PARENTS and TREE
		(the minimal amount of data required to traverse the pool)
	- the commit's ODB object is closed

When querying for extended information on a commit:
	- the commit's ODB object is reopened
	- its raw contents are parsed for the requested information
	- the commit's ODB object remains open to handle additional queries

New unit tests have been added for the new functionality:

	In t0401-parse: parse_person_test
	In t0402-details: query_details_test

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-15 23:40:52 +02:00
Vicent Marti
225fe21522 Add support for tree objects in revision pools
Commits now store pointers to their tree objects.
Tree objects now work as separate git_revpool_object
entities.
Tree objects can be loaded and parsed independently
from commits.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-15 23:39:30 +02:00
Vicent Marti
40721f6b12 Changed revpool's object table to support arbitrary objects
git_revpool_object now has a type identifier for each object
type in a revpool (commits, trees, blobs, etc).

Trees can now be stored in the revision pool.

git_revpool_tableit now supports filtering objects by their
type when iterating through the object table.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-15 23:39:22 +02:00
Vicent Marti
088a731f00 Fixed memory leaks in test suite
Created commit objects in t0401-parse weren't being freed properly.
Updated the API documentation to note that commit objects are owned
by the revision pool and should not be freed manually.

The parents list of each commit was being freed twice after each test.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-10 12:15:12 -07:00
Vicent Marti
58b0cbea74 Actually free all commits when freeing a commit pool
Previously the objects table was being freed, but not
the actual commits. All git_commit objects are freed
and hence invalidated when freeing the git_rp object
they belong to.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-10 12:14:30 -07:00
Ramsay Jones
f29249340c Style: Do not use (C99) // comments
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 11:18:56 +02:00
Vicent Marti
de141d4bb9 Improved error handling on auxiliary functions.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
6bb7aa1318 Added new error codes. Improved error handling.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
9b3577eda0 Fixed brace placement and converted spaces to tabs.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
69dca95950 Fixed parsing commit times (they weren't being stored at all!)
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
e5d1faefab Add external API for revision sorting.
The GIT_RPSORT_XXX flags have been moved to the external API,
and a new method 'gitrp_sorting(...)' has been added to safely
change the sorting method of a revision pool.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
9bdb759471 Properly reset all commit properties when doing a gitrp_reset().
Add git_revpool_table_free() method.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
655d381a19 Add topological sorting and new insertion methods for commit lists.
'git_commit_list_toposort()' and 'git_commit_list_timesort()' now
sort a commit list by topological and time order respectively.
Both sorts are stable and in place.

'git_commit_list_append' has been replaced by 'git_commit_list_push_back'
and 'git_commit_list_push_front'.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
a7c182c594 Add object cache to the revision pool.
Fixed issue when generating pending commits list during iteration.

The 'git_commit_lookup' function will now check the pool's cache
for commits which have been previously loaded/parsed; there can only
be a single 'git_commit' structure for each commit on the same pool.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:07 +02:00
Vicent Marti
5e15176dac Add commit caching on the commit table.
Properly initialize the pending commits list.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Vicent Marti
c5696427b6 Add 'git_revpool_object' and 'git_revpool_table' structures.
All the objects which will eventually be traversable from
a revision pool (commits, trees, etc) now inherit from the
'git_revpool_object' structure which identifies them with their
own OID.

Furthermore, the 'git_revpool_table' and related functions have
been added, which allow for constant time lookup (hash table)
of the loaded revpool objects based on their OID.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Vicent Marti
36b7cdb6a1 Changed 'git_commit_list' from a linked list to a doubly-linked list.
Changed 'git_commit' to use bit fields instead of flags.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Vicent Marti
1a895dd787 Add arbitrary ordering revision walking.
The 'gitrp_next()' method now correctly does a revision walking
of all the pushed revisions in arbitrary ordering.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Vicent Marti
8add015392 Split git_commit_lookup into separate functions.
git_commit_lookup() now creates commit references
without loading them from the ODB.

git_commit_parse() creates a commit reference, loads
it and parses it from the ODB.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Vicent Marti
08d5d00056 Add commit parents to parsed commits and commit lists to the revpool.
Basic support for iterating the revpool.

The following functions of the revwalk API have been partially
implemented:

    void gitrp_reset(git_revpool *pool);
    void gitrp_push(git_revpool *pool, git_commit *commit);
    void gitrp_prepare_walk(git_revpool *pool);
    git_commit *gitrp_next(git_revpool *pool);

Parsed commits' parents are now also parsed and stored in a
"git_commit_list" structure (linked list).

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
2010-06-02 10:32:06 +02:00
Shawn O. Pearce
64a47c0142 Wrap malloc and friends and report out of memory as GIT_ENOMEM
We now forbid direct use of malloc, strdup or calloc within the
library and instead use wrapper functions git__malloc, etc. to
invoke the underlying library malloc and set git_errno to a no
memory error code if the allocation fails.

In the future once we have pack objects in memory we are likely
to enhance these routines with garbage collection logic to purge
cached pack data when allocations fail.  Because the size of the
function will grow somewhat large, we don't want to mark them for
inline as gcc tends to aggressively inline, creating larger than
expected executables.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-12-30 23:28:30 -08:00
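The allocation wrappers described in commit 64a47c0142 can be sketched as below. This is an assumption-laden illustration rather than the library's source: `sk_malloc`, `sk_strdup`, `sk_errno` and the `SK_*` codes are hypothetical stand-ins for `git__malloc`, `git_errno` and `GIT_ENOMEM`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes and error variable for this sketch. */
enum { SK_SUCCESS = 0, SK_ENOMEM = -1 };
int sk_errno = SK_SUCCESS;

/* Allocates len bytes; records SK_ENOMEM in sk_errno on failure. */
void *sk_malloc(size_t len)
{
	/* Request at least one byte so that a NULL result always means failure. */
	void *ptr = malloc(len ? len : 1);

	if (ptr == NULL)
		sk_errno = SK_ENOMEM;
	return ptr;
}

/* Duplicates a string through the wrapper so failures are reported alike. */
char *sk_strdup(const char *str)
{
	size_t len;
	char *copy;

	assert(str != NULL);
	len = strlen(str) + 1;
	copy = sk_malloc(len);
	if (copy != NULL)
		memcpy(copy, str, len);
	return copy;
}
```

Routing every allocation through one pair of functions gives a single place to add behavior later, such as the cache purging the commit message anticipates.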
Andreas Ericsson
c215be4120 Rename git_revpool_* functions gitrp_*
Otherwise their prototypes don't match their declarations.

Detected by 'sparse', which is obviously good to run
before each commit.

Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-11-22 12:08:00 -08:00
Andreas Ericsson
1b9e92c73b s/git_revp/git_revpool/
git_revp is something I personally can't stop pronouncing
"rev pointer". I'm sure others would suffer the same
problem.

Also, rename the git_revp_ sub-api "gitrp_". This is the
first of many such renames, primarily done to prevent
extreme inflation in the "git_" namespace, which we'd like
to reserve for a higher-level API.

While we're at it, we remove the noise-char "c" from a lot
of functions. Since revision walking is all about commits,
the common case should be that we're dealing with commits.
Exceptions can get a more mnemonic description as needed.

Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-11-18 10:32:53 -08:00
Shawn O. Pearce
50298f44a4 Switch the license from BSD to GPL+libgcc exception
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-11-01 15:55:47 -07:00
Shawn O. Pearce
d1ea30c399 Move include files to include/git/, drop git_ prefix from file names
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-11-01 15:42:23 -07:00