Commit Graph

43 Commits

Author SHA1 Message Date
Edward Thomson
e2e4bae9a0 tree: drop the now-unnecessary entries vector
Remove the now-unnecessary entries vector.  Add `git_array_search`
to binary search through an array to accomplish this.
2016-03-22 06:21:13 -07:00
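The binary search that e2e4bae9a0 relies on can be sketched in plain C as follows. This is an illustration only: `entry_t` and `array_search` are hypothetical names, and the sketch does not reproduce the actual signature of `git_array_search`, which this log does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree entry; not libgit2's real struct. */
typedef struct {
    const char *filename;
} entry_t;

/*
 * Binary search over `entries`, which must already be sorted by filename
 * in strcmp order. Returns 0 and stores the index in *out_pos when the key
 * is found; otherwise returns -1 and stores the index at which the key
 * would have to be inserted to keep the array sorted.
 */
int array_search(size_t *out_pos, const entry_t *entries,
                 size_t count, const char *key)
{
    size_t lo = 0, hi = count;

    assert(out_pos != NULL && key != NULL);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(entries[mid].filename, key);

        if (cmp == 0) {
            *out_pos = mid;
            return 0;
        }
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid;
    }

    *out_pos = lo;
    return -1;
}
```

Because the array is kept sorted, no separate lookup structure is needed: a lookup costs O(log n) comparisons directly on the array.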
Carlos Martín Nieto
4ed9e939e2 tree: store the entries in a growable array
Take advantage of the constant size of tree-owned arrays and store them
in an array instead of a pool. This still lets us free them all at once
but lets the system allocator do the work of fitting them in.
2016-03-20 12:01:45 +01:00
Carlos Martín Nieto
60a194aa86 tree: re-use the id and filename in the odb object
Instead of copying over the data into the individual entries, point to
the originals, which are already in a format we can use.
2016-03-20 11:00:12 +01:00
Carlos Martín Nieto
ee42bb0e3d tree: make path len uint16_t and avoid holes
This reduces the size of the struct from 32 to 26 bytes, and leaves a
single padding byte at the end of the struct (which comes from the
zero-length array).
2015-11-28 19:21:52 +01:00
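The effect described in ee42bb0e3d (narrower length field, no holes between members) can be illustrated with two hypothetical layouts. Neither struct is libgit2's real `git_tree_entry`; the member names and `OID_RAWSZ` are assumptions for the sketch, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OID_RAWSZ 20 /* assumed raw SHA-1 size, for illustration only */

/* Hypothetical "before" layout: a size_t length has stricter alignment
 * than its neighbours, so the compiler inserts padding in front of it. */
struct entry_wide {
    uint16_t attr;
    unsigned char oid[OID_RAWSZ];
    size_t filename_len;
    char filename[1]; /* placeholder for the trailing name */
};

/* Hypothetical "after" layout: both 16-bit members sit next to each other
 * and the byte arrays follow, so no padding is needed between members. */
struct entry_compact {
    uint16_t attr;
    uint16_t filename_len;
    unsigned char oid[OID_RAWSZ];
    char filename[1]; /* placeholder for the trailing name */
};

/* Returns how many bytes per entry the compact layout saves on this ABI. */
size_t entry_bytes_saved(void)
{
    assert(sizeof(struct entry_compact) <= sizeof(struct entry_wide));
    return sizeof(struct entry_wide) - sizeof(struct entry_compact);
}
```

Printing `sizeof` for both structs on the target platform is the simplest way to check whether a reordering actually removed the holes.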
Carlos Martín Nieto
ed970748b6 tree: pool the entry memory allocations
These are rather small allocations, so we end up spending a non-trivial
amount of time asking the OS for memory. Since these entries are tied to
the lifetime of their tree, we can give the tree a pool so we speed up
the allocations.
2015-11-28 19:21:51 +01:00
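The pooling idea in ed970748b6 can be sketched as a bump-pointer allocator whose memory is released in one step together with its owner. `pool_t` and the `pool_*` functions below are hypothetical and much simpler than libgit2's own pool; in particular, this sketch uses one fixed-size block and never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical single-block pool; not libgit2's git_pool. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} pool_t;

/* Reserve one block up front. Returns 0 on success, -1 on failure. */
int pool_init(pool_t *pool, size_t capacity)
{
    assert(pool != NULL && capacity > 0);
    pool->base = malloc(capacity);
    pool->capacity = capacity;
    pool->used = 0;
    return pool->base ? 0 : -1;
}

/* Hand out `size` bytes, rounded up to a multiple of the pointer size.
 * No call into the system allocator is made here. */
void *pool_alloc(pool_t *pool, size_t size)
{
    const size_t align = sizeof(void *);
    size_t rounded;
    void *ptr;

    if (size > SIZE_MAX - (align - 1))
        return NULL; /* rounding up would overflow */
    rounded = (size + align - 1) / align * align;
    if (rounded > pool->capacity - pool->used)
        return NULL; /* a real pool would add another block here */

    ptr = pool->base + pool->used;
    pool->used += rounded;
    return ptr;
}

/* Release every allocation at once, e.g. when the owning tree is freed. */
void pool_free(pool_t *pool)
{
    free(pool->base);
    pool->base = NULL;
    pool->capacity = pool->used = 0;
}
```

Individual entries are never freed on their own, which matches the commit's observation that the entries share the lifetime of their tree.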
Edward Thomson
dce7b1a4e7 treebuilder: take a repository for path validation
Path validation may be influenced by `core.protectHFS` and
`core.protectNTFS` configuration settings, thus treebuilders
can take a repository to influence their configuration.
2014-12-17 13:05:27 -05:00
Carlos Martín Nieto
fcc6006607 treentry: no need for manual size book-keeping
We can simply ask the hashmap.
2014-06-10 15:14:13 +02:00
Carlos Martín Nieto
978fbb4c34 treebuilder: don't keep removed entries around
If the user wants to keep a copy for themselves, they should make a
copy. It adds unnecessary complexity to make sure the returned entries
are valid until the builder is cleared.
2014-06-10 15:14:13 +02:00
Carlos Martín Nieto
4d3f1f9740 treebuilder: use a map instead of vector to store the entries
Finding a filename in a vector means we need to re-sort it every time we
want to read from it, which includes every time we want to write to it
as well, as we want to find duplicate keys.

A hash-map fits what we want to do much more accurately, as we do not
care about sorting, but just the particular filename.

We still keep removed entries around, as the interface lets you assume
they were going to be around until the treebuilder is cleared or freed,
but in this case that involves an append to a vector in the filter case,
which can now fail.

The only time we care about sorting is when we write out the tree, so
let's make that the only time we do any sorting.
2014-06-10 15:14:13 +02:00
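The approach in 4d3f1f9740 (look entries up by filename in a map, sort only when writing) might look roughly like the sketch below. `builder_t`, the fixed-size open-addressing table, and the plain `strcmp` ordering are all simplifying assumptions; they are not the treebuilder's real data structures or Git's exact tree ordering.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAP_SLOTS 64 /* fixed capacity keeps the sketch short */

/* Hypothetical builder: filename -> mode, stored unsorted in a hash map. */
typedef struct {
    const char *names[MAP_SLOTS]; /* NULL marks an empty slot */
    unsigned modes[MAP_SLOTS];
    size_t count;
} builder_t;

static size_t hash_name(const char *s)
{
    size_t h = 5381; /* djb2 string hash */
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % MAP_SLOTS;
}

/* Insert or replace an entry. Duplicates are detected by hashing, so no
 * sorting is needed. Returns 0 on success, -1 if the table is full. */
int builder_insert(builder_t *b, const char *name, unsigned mode)
{
    size_t i = hash_name(name), probes;

    for (probes = 0; probes < MAP_SLOTS; probes++, i = (i + 1) % MAP_SLOTS) {
        if (b->names[i] == NULL) {
            b->names[i] = name;
            b->modes[i] = mode;
            b->count++;
            return 0;
        }
        if (strcmp(b->names[i], name) == 0) {
            b->modes[i] = mode; /* duplicate key: overwrite in place */
            return 0;
        }
    }
    return -1;
}

static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* The only place that sorts: copy the names out and order them for
 * writing. `out` must have room for b->count pointers. */
size_t builder_sorted_names(const builder_t *b, const char **out)
{
    size_t i, n = 0;

    assert(b != NULL && out != NULL);
    for (i = 0; i < MAP_SLOTS; i++)
        if (b->names[i] != NULL)
            out[n++] = b->names[i];

    qsort(out, n, sizeof(*out), cmp_names);
    return n;
}
```

Inserts and duplicate checks stay cheap regardless of insertion order, and the O(n log n) sort is paid once per write instead of once per read.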
Russell Belfer
58206c9ae7 Add cat-file example and increase const use in API
This adds an example implementation that emulates git cat-file.
It is a convenient and relatively simple example of getting data
out of a repository.

Implementing this also revealed that there are a number of APIs
that are still not using const pointers to objects that really
ought to be.  The main cause of this is that `git_vector_bsearch`
may need to call `git_vector_sort` before doing the search, so a
const pointer to the vector is not allowed.  However, for tree
objects, with a little care, we can ensure that the vector of
tree entries is always sorted and allow lookups to take a const
pointer.  Also, the missing const in commit objects just looks
like an oversight.
2013-05-16 10:38:27 -07:00
Russell Belfer
3f27127d15 Simplify object table parse functions
This unifies the object parse functions into one signature that
takes an odb_object.
2013-04-22 16:52:06 +02:00
Russell Belfer
786062639f Add callback to git_objects_table
This adds create and free callback to the git_objects_table so
that more of the creation and destruction of objects can be table
driven instead of using switch statements.  This also makes the
semantics of certain object creation functions consistent so that
we can make better use of function pointers.  This also fixes a
theoretical error case where an object allocation fails and we
end up storing NULL into the cache.
2013-04-22 16:51:40 +02:00
Vicent Marti
575a54db85 object: Export git_object_dup 2013-04-10 16:56:32 +02:00
Russell Belfer
93ab370b53 Store treebuilder length separately from entries vec
The treebuilder entries vector flags removed items which means
we can't rely on the entries vector length to accurately get the
number of entries.  This adds an entrycount value and maintains it
while updating the treebuilder entries.
2013-02-20 10:50:01 -08:00
Russell Belfer
98527b5b24 Add git_tree_entry_cmp and git_tree_entry_icmp
This adds a new external API git_tree_entry_cmp and a new internal
API git_tree_entry_icmp for sorting tree entries.  The case
insensitive one is internal only because general users should
never be seeing case-insensitively sorted trees.
2013-01-15 09:51:35 -08:00
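A comparison function for tree entries, as added in 98527b5b24, has to account for the way Git orders trees: a subtree's name is compared as if it ended in `/`. The sketch below assumes that usual ordering rule; `entry_t` and `entry_cmp` are hypothetical and are not the implementation of `git_tree_entry_cmp`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree entry. */
typedef struct {
    const char *filename;
    int is_tree; /* non-zero when the entry names a subtree */
} entry_t;

/*
 * Case-sensitive ordering of two entries. Returns <0, 0 or >0.
 * Names are compared byte-wise; when one name is a prefix of the other,
 * the byte "after the end" is '/' for a subtree and NUL for anything else.
 */
int entry_cmp(const entry_t *a, const entry_t *b)
{
    size_t alen, blen, n;
    unsigned char ca, cb;
    int cmp;

    assert(a != NULL && b != NULL);

    alen = strlen(a->filename);
    blen = strlen(b->filename);
    n = alen < blen ? alen : blen;

    cmp = memcmp(a->filename, b->filename, n);
    if (cmp != 0)
        return cmp;

    ca = n < alen ? (unsigned char)a->filename[n] : (a->is_tree ? '/' : '\0');
    cb = n < blen ? (unsigned char)b->filename[n] : (b->is_tree ? '/' : '\0');
    return (int)ca - (int)cb;
}
```

Under this rule a subtree named `foo` sorts after a blob named `foo.c`, because `/` compares greater than `.`, whereas a blob named `foo` sorts before it.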
Edward Thomson
359fc2d241 update copyrights 2013-01-08 17:31:27 -06:00
Russell Belfer
91e7d26303 Fix iterator reset and add reset ranges
The `git_iterator_reset` command has not been working in all cases,
particularly when there is a start and end range.  This fixes it
and adds tests for it, and also extends it with the ability to
update the start/end range strings when an iterator is reset.
2012-12-10 15:38:41 -08:00
Russell Belfer
16248ee2d1 Fix up some missing consts in tree & index
This fixes some missed places where we can apply const-ness to
various public APIs.

There are still some index and tree APIs that cannot take const
pointers because we sort our `git_vectors` lazily and so we can't
reliably bsearch the index and tree content without applying a
`git_vector_sort()` first.

This also fixes some missed places where size_t can be used and
where const can be applied to a couple internal functions.
2012-11-27 13:18:29 -08:00
Vicent Marti
276ea401b3 index: Add git_index_write_tree 2012-11-01 20:17:10 +01:00
nulltoken
a7dbac0b23 filemode: deploy enum usage 2012-08-21 23:15:10 +02:00
Vicent Marti
0e2fcca850 tree: Bring back entry_bypath
Smaller, simpler, faster.
2012-06-29 02:21:12 +02:00
Vicent Martí
9d0011fd83 tree: Naming conventions 2012-05-16 19:24:35 +02:00
Russell Belfer
41a82592ef Ranged iterators and rewritten git_status_file
The goal of this work is to rewrite git_status_file to use the
same underlying code as git_status_foreach.

This is done in 3 phases:

1. Extend iterators to allow ranged iteration with start and
   end prefixes for the range of file names to be covered.
2. Improve diff so that when there is a pathspec and there is
   a common non-wildcard prefix of the pathspec, it will use
   ranged iterators to minimize excess iteration.
3. Rewrite git_status_file to call git_status_foreach_ext
   with a pathspec that covers just the one file being checked.

Since ranged iterators underlie the status & diff implementation,
this is actually fairly efficient.  The workdir iterator does
end up loading the contents of all the directories down to the
single file, which should ideally be avoided, but it is pretty
good.
2012-05-15 14:34:15 -07:00
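Phase 1 of 41a82592ef, ranged iteration with start and end prefixes, can be sketched over a sorted list of paths. The function name `iterate_range` and the exact boundary semantics (both prefixes inclusive, compared with `strncmp`) are assumptions for this illustration, not the iterator API itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into `out` every path from the sorted array `paths` that falls
 * within the range [start, end], where both bounds are prefixes.
 * A NULL bound means "unbounded" on that side.
 * `out` must have room for `count` pointers. Returns the number copied.
 */
size_t iterate_range(const char *const *paths, size_t count,
                     const char *start, const char *end,
                     const char **out)
{
    size_t i, n = 0;
    size_t start_len = start ? strlen(start) : 0;
    size_t end_len = end ? strlen(end) : 0;

    assert(paths != NULL && out != NULL);

    for (i = 0; i < count; i++) {
        if (start && strncmp(paths[i], start, start_len) < 0)
            continue; /* still before the range: skip */
        if (end && strncmp(paths[i], end, end_len) > 0)
            break;    /* past the range: input is sorted, so stop early */
        out[n++] = paths[i];
    }
    return n;
}
```

Checking a single file then amounts to using the same string as both the start and the end of the range. A real implementation would also seek directly to the start instead of skipping linearly.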
Russell Belfer
277e304149 Fix handling of submodules in trees 2012-03-26 11:22:27 -07:00
schu
5e0de32818 Update Copyright header
Signed-off-by: schu <schu-github@schulog.org>
2012-02-13 17:11:09 +01:00
schu
b3408e3e66 treebuilder: remove needless variable entry_count
Signed-off-by: schu <schu-github@schulog.org>
2012-02-05 14:59:45 +01:00
Clemens Buchacher
a26a156349 move entry_is_tree to tree.h 2011-12-30 20:14:01 +01:00
Vicent Marti
bb742ede3d Cleanup legal data
1. The license header is technically not valid if it doesn't have a
copyright signature.

2. The COPYING file has been updated with the different licenses used in
the project.

3. The full GPLv2 header in each file annoys me.
2011-09-19 01:54:32 +03:00
Vicent Marti
0ad6efa110 Build & write custom trees in memory 2011-04-04 19:25:33 +03:00
Vicent Marti
72a3fe42fb I broke your bindings
Hey. Apologies in advance -- I broke your bindings.

This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.

Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!

Major features include:

	- Real caching and refcounting on parsed objects
	- Real caching and refcounting on objects read from the ODB
	- Streaming writes & reads from the ODB
	- Single-method writes for all object types
	- The external API is now partially thread-safe

The speed increases are significant in all aspects, especially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.

Here's a full changelog for the external API:

blob.h
------

	- Remove `git_blob_new`
	- Remove `git_blob_set_rawcontent`
	- Remove `git_blob_set_rawcontent_fromfile`
	- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
	- Change `git_blob_create_fromfile`:
		The `path` argument is now relative to the repository's working dir
	- Add `git_blob_create_frombuffer`

commit.h
--------

	- Remove `git_commit_new`
	- Remove `git_commit_add_parent`
	- Remove `git_commit_set_message`
	- Remove `git_commit_set_committer`
	- Remove `git_commit_set_author`
	- Remove `git_commit_set_tree`

	- Add `git_commit_create`
	- Add `git_commit_create_v`
	- Add `git_commit_create_o`
	- Add `git_commit_create_ov`

tag.h
-----

	- Remove `git_tag_new`
	- Remove `git_tag_set_target`
	- Remove `git_tag_set_name`
	- Remove `git_tag_set_tagger`
	- Remove `git_tag_set_message`

	- Add `git_tag_create`
	- Add `git_tag_create_o`

tree.h
------

	- Change `git_tree_entry_2object`:
		New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`

	- Remove `git_tree_new`
	- Remove `git_tree_add_entry`
	- Remove `git_tree_remove_entry_byindex`
	- Remove `git_tree_remove_entry_byname`
	- Remove `git_tree_clearentries`
	- Remove `git_tree_entry_set_id`
	- Remove `git_tree_entry_set_name`
	- Remove `git_tree_entry_set_attributes`

object.h
--------

	- Remove `git_object_new`
	- Remove `git_object_write`

	- Change `git_object_close`:
		This method is now *mandatory*. Not closing an object causes a
		memory leak.

odb.h
-----

	- Remove type `git_rawobj`
	- Remove `git_rawobj_close`
	- Rename `git_rawobj_hash` -> `git_odb_hash`
	- Change `git_odb_hash`:
		New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`

	- Add type `git_odb_object`
	- Add `git_odb_object_close`

	- Change `git_odb_read`:
		New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
	- Change `git_odb_read_header`:
		New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
	- Remove `git_odb_write`
	- Add `git_odb_open_wstream`
	- Add `git_odb_open_rstream`

odb_backend.h
-------------

	- Change type `git_odb_backend`:
		New internal signatures are as follows

			int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
			int (* readstream)(struct git_odb_stream **, struct git_odb_backend *, const git_oid *)

	- Add type `git_odb_stream`
	- Add enum `git_odb_streammode`

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-20 21:45:11 +02:00
Vicent Marti
48c27f86bb Implement reference counting for git_objects
All `git_object` instances looked up from the repository are reference
counted. User is expected to use the new `git_object_close` when an
object is no longer needed to force freeing it.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-03 20:23:52 +02:00
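The ownership model introduced in 48c27f86bb can be sketched with a minimal reference count. The type and functions below (`object_t`, `object_new`, `object_ref`, `object_close`) are hypothetical and single-threaded; they only illustrate why every lookup must be paired with a close.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct {
    int refcount;
    int payload;
} object_t;

/* Create an object owned by the caller (one reference). */
object_t *object_new(int payload)
{
    object_t *obj = malloc(sizeof(*obj));
    if (obj == NULL)
        return NULL;
    obj->refcount = 1;
    obj->payload = payload;
    return obj;
}

/* Take an additional reference, e.g. when a cache hands the object out. */
void object_ref(object_t *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    obj->refcount++;
}

/* Drop one reference. Frees the object and returns 1 when it was the last
 * one; otherwise returns 0 and the object stays valid. */
int object_close(object_t *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    if (--obj->refcount > 0)
        return 0;
    free(obj);
    return 1;
}
```

A caller that forgets the final close leaves the count above zero, so the memory is never released.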
Vicent Marti
4569bfa55a Keep the tree entries always internally sorted
Don't allow access to any tree entries whilst the entries array is
unsorted. We keep track of when the array is unsorted, and any methods
that access the array while it is unsorted now sort the array before
accessing it.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-02-05 09:11:17 +02:00
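The "sort before access" rule from 4569bfa55a can be sketched with a dirty flag. `entries_t`, `entries_add`, and `entries_get` are hypothetical names, and the fixed capacity is only there to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical entries array with a flag tracking whether it is sorted. */
typedef struct {
    const char *names[MAX_ENTRIES];
    size_t count;
    int sorted; /* non-zero while `names` is known to be in order */
} entries_t;

static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Append without sorting; just remember that the order is now unknown.
 * Returns 0 on success, -1 if the array is full. */
int entries_add(entries_t *e, const char *name)
{
    assert(e != NULL && name != NULL);
    if (e->count == MAX_ENTRIES)
        return -1;
    e->names[e->count++] = name;
    e->sorted = 0;
    return 0;
}

/* Accessor: sorts first if needed, so callers never observe an unsorted
 * array. Returns NULL when idx is out of range. */
const char *entries_get(entries_t *e, size_t idx)
{
    assert(e != NULL);
    if (!e->sorted) {
        qsort(e->names, e->count, sizeof(e->names[0]), cmp_names);
        e->sorted = 1;
    }
    return idx < e->count ? e->names[idx] : NULL;
}
```

Note that `entries_get` cannot take a `const entries_t *`, because a read may trigger a sort. This is the same constraint that later commits in this log mention when discussing const pointers and lazily sorted vectors.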
Vicent Marti
c8f5ff8f65 Fix initialization of in-memory trees
In-memory tree objects were not being properly initialized, because the
internal entries vector was created on the 'parse' method.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-01-20 14:43:27 -08:00
Vicent Marti
44908fe763 Change the library include file
Libgit2 is now officially included as

	#include <git2.h>

or individual files may be included as

	#include <git2/index.h>

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-06 23:03:16 +02:00
Vicent Marti
c4034e63f3 Refactor all 'vector' functions into common code
All the operations on the 'git_index_entry' array and the
'git_tree_entry' array have been refactored into common code in the
src/vector.c file.

The new vector methods support:
	- insertion:	O(1) (avg)
	- deletion:		O(n)
	- searching:	O(logn)
	- sorting:		O(n log n)
	- r. access:	O(1)

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-02 04:31:54 +02:00
Vicent Marti
585190183b Fix internal memory management on the library
String memory is now managed in a much more sane manner.

Fixes include:

	- git_person email and name is no longer limited to 64 characters
	- git_tree_entry filename is no longer limited to 255 characters
	- raw objects are properly opened & closed the minimum amount of
	times required for parsing
	- unit tests no longer leak
	- removed 5 other misc memory leaks as reported by Valgrind
	- tree writeback no longer segfaults on rare occasions

The git_person struct is no longer public. It is now managed by the
library, and getter methods are in place to access its internal
attributes.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-10-28 02:07:18 +03:00
Vicent Marti
2a884588b4 Add write-back support for git_tree
All the setter methods for git_tree have been added, including the
setters for attributes on each git_tree_entry and methods to add/remove
entries of the tree.

Modified trees and trees created in-memory from scratch can be written
back to the repository using git_object_write().

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-09-21 17:17:10 +03:00
Vicent Marti
f49a2e4981 Give object structures more descriptive names
The 'git_obj' structure is now called 'git_rawobj', since
it represents a raw object read from the ODB.

The 'git_repository_object' structure is now called 'git_object',
since it's the base object class for all objects.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-09-19 03:21:06 +03:00
Vicent Marti
370ce56910 Fix: do not export custom types in the extern API
Some compilers give linking problems when exporting 'uint32_t' as a
return type in the external API. Use generic types instead.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-09-09 00:48:09 +03:00
Vicent Marti
003c269094 Finish the tree object API
The interface for loading and parsing tree objects from a repository has
been completed with all the required accessor methods for attributes,
support for manipulating individual tree entries and a new unit test
t0901-readtree which tries to load and parse a tree object from a
repository.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-08-12 18:49:04 +02:00
Vicent Marti
3315782cb4 Redesigned the walking/object lookup interface
The old 'git_revpool' object has been removed and
split into two distinct objects with separate
functionality, in order to have separate methods for
object management and object walking.

*	A new object 'git_repository' does the high-level
	management of a repository's objects (commits, trees,
	tags, etc) on top of a 'git_odb'.

	Eventually, it will also manage other repository
	attributes (e.g. tag resolution, references, etc).

	See: src/git/repository.h

*	A new external method
		'git_repository_lookup(repo, oid, type)'
	has been added to the 'git_repository' API.

	All object lookups (git_XXX_lookup()) are now
	wrappers to this method, and duplicated code
	has been removed. The method does automatic type
	checking and returns a generic 'git_revpool_object'
	that can be cast to any specific object.

	See: src/git/repository.h

*	The external methods for object parsing of repository
	objects (git_XXX_parse()) have been removed.

	Loading objects from the repository is now managed
	through the 'lookup' functions. These objects are
	loaded with minimal information, and the relevant
	parsing is done automatically when the user requests
	any of the parsed attributes through accessor methods.

	An attribute has been added to 'git_repository' in
	order to force the parsing of all the repository objects
	immediately after lookup.

	See: src/git/commit.h
	See: src/git/tag.h
	See: src/git/tree.h

*	The previous walking functionality of the revpool
	is now found in 'git_revwalk', which does the actual
	revision walking on a repository; the attributes
	when walking through commits in a database have been
	decoupled from the actual commit objects.
	This increases performance when accessing commits
	during the walk and allows several 'git_revwalk'
	instances to work at the same time on top of the
	same repository, without having to load commits in
	memory several times.

	See: src/git/revwalk.h

*	The old 'git_revpool_table' has been renamed to
	'git_hashtable' and now works as a generic hashtable
	with support for any kind of object and custom hash
	functions.

	See: src/hashtable.h

*	All the relevant unit tests have been updated, renamed
	and grouped accordingly.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-08-12 18:48:55 +02:00
Vicent Marti
d8603ed901 Add parsing of tree file contents.
The basic information (pointed trees and blobs) of each tree object in a
revision pool can now be parsed and queried.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-15 23:41:49 +02:00
Vicent Marti
225fe21522 Add support for tree objects in revision pools
Commits now store pointers to their tree objects.
Tree objects now work as separate git_revpool_object
entities.
Tree objects can be loaded and parsed independently
from commits.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-07-15 23:39:30 +02:00