Commit Graph

79 Commits

Author SHA1 Message Date
Carlos Martín Nieto
a1fdea2855 tree: implement tree diffing
For each difference in the trees, the callback gets called with the
relevant information so the user can fill in their own data
structures.

Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
2011-12-03 17:47:06 +01:00
Vicent Marti
45e79e3701 Rename all _close methods
There's no difference between `_free` and `_close` semantics: keep
everything with the same name to avoid confusions.
2011-11-26 08:48:00 +01:00
Vicent Marti
9462c47143 repository: Change ownership semantics
The ownership semantics have been changed all over the library to be
consistent. There are no more "borrowed" or duplicated references.

Main changes:

	- `git_repository_open2` and `3` have been dropped.

	- Added setters and getters to hotswap all the repository owned
	objects:

		`git_repository_index`
		`git_repository_set_index`
		`git_repository_odb`
		`git_repository_set_odb`
		`git_repository_config`
		`git_repository_set_config`
		`git_repository_workdir`
		`git_repository_set_workdir`

	Now working directories/index files/ODBs and so on can be
	hot-swapped after creating a repository and between operations.

	- All these objects now have proper ownership semantics with
	refcounting: they all require freeing after they are no longer
	needed (the repository always keeps its internal reference).

	- Repository open and initialization has been updated to keep in
	mind the configuration files. Bare repositories are now always
	detected, and a default config file is created on init.

	- All the tests affected by these changes have been dropped from the
	old test suite and ported to the new one.
2011-11-26 08:37:08 +01:00
Vicent Marti
2ba14f2367 tree: Add payload to git_tree_walk 2011-11-18 01:40:35 +01:00
Vicent Marti
9432af36fc Rename git_tree_frompath to git_tree_get_subtree
That makes more sense to me.
2011-11-18 01:40:35 +01:00
Vicent Marti
3286c408ec global: Properly use git__ memory wrappers
Ensure that all memory related functions (malloc, calloc, strdup, free,
etc) are using their respective `git__` wrappers.
2011-10-28 19:02:36 -07:00
Vicent Marti
da37654d04 tree: Add traversal in post-order 2011-10-27 22:33:31 -07:00
Vicent Marti
28c1451a7c tree: Fix name lookups once and for all
Double-pass binary search. Jeez.
2011-10-20 02:40:14 +02:00
Vicent Marti
8cf2de078d tree: Fix lookups by entry name 2011-10-19 01:34:42 +02:00
nulltoken
3fa735ca3b tree: Add git_tree_frompath() which, given a relative path to a tree entry, retrieves the tree object containing this tree entry 2011-10-13 23:30:07 +02:00
Vicent Marti
8e9bfa4cf0 tree: Fix check for valid attributes 2011-09-27 14:33:19 +02:00
Vicent Marti
9ef9e8c3ad tree: Use an internal append functiont to add new entries 2011-09-27 14:33:18 +02:00
Carlos Martín Nieto
8255c69b10 Make use of the tree cache
Taking advantage of the tree cache, git_tree_create_fromindex becomes
comparable in speed to git write-tree when the cache is available.

Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
2011-09-27 14:33:18 +02:00
nulltoken
ad196c6ae6 config: make git_config_[get|set]_long() able to properly deal with 8 bytes wide values
Should fix issue #419.

Signed-off-by: nulltoken <emeric.fermas@gmail.com>
2011-09-22 18:58:47 +02:00
Vicent Martí
71a4c1f16f Merge pull request #384 from kiryl/warnings
Add more -W flags to CFLAGS
2011-09-18 20:07:59 -07:00
Vicent Martí
ae996e029f Merge pull request #394 from carlosmn/tree-fromindex
Use git_treebuilder to write the index as a tree
2011-09-18 19:59:34 -07:00
Vicent Marti
bb742ede3d Cleanup legal data
1. The license header is technically not valid if it doesn't have a
copyright signature.

2. The COPYING file has been updated with the different licenses used in
the project.

3. The full GPLv2 header in each file annoys me.
2011-09-19 01:54:32 +03:00
Carlos Martín Nieto
4a619797ec tree: use git_treebuilder to write the index as a tree
There is no point in reinventing the wheel when using the treebuilder
is much more straightforward and makes the code more readable. There
is no optimisation, and the performance is no worse than when writing
the tree object ourselves.
2011-09-10 02:05:38 +02:00
Kirill A. Shutemov
d568d5856b CMakefile: add -Wmissing-prototypes and fix warnings
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
2011-08-30 23:55:22 +03:00
Kirill A. Shutemov
0b2c406187 CMakefile: add -Wstrict-aliasing=2 and fix warnings
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
2011-08-30 23:06:04 +03:00
Luc Bertrand
8f643ce8e3 Remove duplicated sort 2011-08-03 13:44:28 +02:00
Kirill A. Shutemov
0cbbdc26a9 tree: fix cast warnings
/home/kas/git/public/libgit2/src/tree.c: In function ‘entry_search_cmp’:
/home/kas/git/public/libgit2/src/tree.c:47:36: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/tree.c: In function ‘git_treebuilder_remove’:
/home/kas/git/public/libgit2/src/tree.c:443:31: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
2011-07-25 21:12:47 +02:00
nulltoken
f4ad64c109 tree: fix insertion of entries with invalid filenames 2011-07-13 07:58:17 +02:00
Vicent Marti
e6629d8313 tree: More accurate matching on entries
The old matcher was returning fake matches when given stupid entry
names. E.g.

	`git2` could be matched by `git2   /`, `git2/foobar`, git2/////`
	and other stupid stuff
2011-07-13 03:36:03 +02:00
Vicent Marti
761aa2aa35 tree: Fix wrong sort order when querying entries
Fixes #127 (that was quite an outstanding issue).

Rationale:

The tree objects on Git are stored and read following a very specific
sorting algorithm that places folders before files. That original sort
was the sort we were storing on memory, but this sort was being queried
with a binary search that used a simple `strcmp` for comparison, so
there were many instances where the search was failing.

Obviously, the most straightforward way to fix this is changing the
binary search CB to use the same comparison method as the sorting CB.
The problem with this is that the binary search callback compares a path
and an entry, so there is no way to know if the given path is a folder
or a standard file.

How do we work around this? Instead of splitting the `entry_byname`
method in two (one for searching directories and one for searching
normal files), we just assume that the path we are searching for is of
the same kind as the path it's being compared at the moment.

	return git_futils_cmp_path(
		ksearch->filename, ksearch->filename_len, entry->attr & 040000,
        entry->filename, entry->filename_len, entry->attr & 040000);

Since there cannot be a folder and a regular file with the same name on
the same tree, the most basic equality check will always fail
for all comparsions, until our path is compared with the actual entry we
are looking for; in this case, the matching will succeed with the file
type of the entry -- whatever it was initially.

I hope that makes sense.

PS: While I was at it, I switched the cmp methods to use cached values
for the length of each filename. That makes searches and sorts
retardedly fast -- I was wondering the reason of the performance hiccups
on massive trees; it's because of 2*strlen for each comparsion call.
2011-07-13 02:49:47 +02:00
Vicent Marti
afeecf4f26 odb: Direct writes are back
DIRECT WRITES ARE BACK AND FASTER THAN EVER. The streaming writer to the
ODB was an overkill for the smaller objects like Commit and Tags; most
of the streaming logic was taking too long.

This commit makes Commits, Tags and Trees to be built-up in memory, and
then written to disk in 2 pushes (header + data), instead of streaming
everything.

This is *always* faster, even for big files (since the git_filebuf class
still does streaming writes when the memory cache overflows). This is
also a gazillion lines of code smaller, because we don't have to
precompute the final size of the object before starting the stream (this
was kind of defeating the point of streaming, anyway).

Blobs are still written with full streaming instead of loading them in
memory, since this is still the fastest way.

A new `git_buf` class has been added. It's missing some features, but
it'll get there.
2011-07-09 02:40:16 +02:00
Vicent Marti
de18f27668 vector: Timsort all of the things
Drop the GLibc implementation of Merge Sort and replace it with Timsort.

The algorithm has been tuned to work on arrays of pointers (void **),
so there's no longer a need to abstract the byte-width of each element
in the array.

All the comparison callbacks now take pointers-to-elements, not
pointers-to-pointers, so there's now one less level of dereferencing.

E.g.

	 int index_cmp(const void *a, const void *b)
	 {
	-	const git_index_entry *entry_a = *(const git_index_entry **)(a);
	+	const git_index_entry *entry_a = (const git_index_entry *)(a);

The result is up to a 40% speed-up when sorting vectors. Memory usage
remains lineal.

A new `bsearch` implementation has been added, whose callback also
supplies pointer-to-elements, to uniform the Vector API again.
2011-07-07 02:54:07 +02:00
Vicent Marti
f79026b491 fileops: Cleanup
Cleaned up the structure of the whole OS-abstraction layer.

fileops.c now contains a set of utility methods for file management used
by the library. These are abstractions on top of the original POSIX
calls.

There's a new file called `posix.c` that contains
emulations/reimplementations of all the POSIX calls the library uses.
These are prefixed with `p_`. There's a specific posix file for each
platform (win32 and unix).

All the path-related methods have been moved from `utils.c` to `path.c`
and have their own prefix.
2011-07-05 02:04:03 +02:00
Kirill A. Shutemov
932d1baf29 cleanup: remove trailing spaces
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
2011-07-01 18:02:56 +02:00
Vicent Marti
fa48608ec3 oid: Rename methods
Yeah. Finally. Fuck the old names, this ain't POSIX
and they don't make any sense at all.
2011-06-16 02:36:21 +02:00
Vicent Martí
1097dacd7d Merge pull request #240 from Romain-Geissler/tree-object-type
Tree: Added a function that returns the type of a tree entry.
2011-06-06 18:33:38 -07:00
Romain Geissler
ff9a4c130d Tree: Added a function that returns the type of a tree entry. 2011-06-06 17:14:30 +02:00
Romain Geissler
c5d8745fca Tree: Some more size_t to unsigned int type change. 2011-06-06 10:55:54 +02:00
Romain Geissler
e5c8009731 Tree: API uniformasation: Use unsigned int for all index number. 2011-06-05 21:18:05 +02:00
Jakob Pfender
bc06a4eeec tree.c: Move to new error handling mechanism 2011-05-23 21:38:39 +03:00
schu
d6de92b6fe Move tree.c to the new error handling
Signed-off-by: schu <schu-github@schulog.org>
2011-05-11 12:40:04 +02:00
Sergey Nikishin
555ce56819 Fix tree-entry attribute convertion (fix corrupted trees)
Magic constant replaced by direct to-string covertion because of:
1) with value length 6 (040000 - subtree) final tree will be corrupted;
2) for wrong values length <6 final tree will be corrupted too.
2011-04-26 15:32:11 +04:00
Vicent Marti
c6e65acae6 Properly check strtol for errors
We are now using a custom `strtol` implementation to make sure we're not
missing any overflow errors.
2011-04-09 15:22:11 -07:00
Shuhei Tanuma
98ac678085 fix git_treebuilder_insert probrem.
couldn't add new entry when inserting new one with `git_treebuilder_insert`.
2011-04-08 03:30:47 +03:00
Vicent Marti
0ad6efa110 Build & write custom trees in memory 2011-04-04 19:25:33 +03:00
Vicent Marti
29e1789b34 Fix the git_tree_write implementation 2011-04-04 12:14:43 +03:00
Sarath Lakshman
47d8ec56e9 New external API method: git_tree_create
Creates a tree by scanning the index file. The method handles recursive
creation of trees for subdirectories and adds them to the parent tree.
2011-04-03 17:18:56 +05:30
Vicent Marti
720d5472f8 Change parse methods to const buffer
Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-04-02 12:42:04 +03:00
Vicent Marti
72a3fe42fb I broke your bindings
Hey. Apologies in advance -- I broke your bindings.

This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.

Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!

Major features include:

	- Real caching and refcounting on parsed objects
	- Real caching and refcounting on objects read from the ODB
	- Streaming writes & reads from the ODB
	- Single-method writes for all object types
	- The external API is now partially thread-safe

The speed increases are significant in all aspects, specially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.

Here's a full changelog for the external API:

blob.h
------

	- Remove `git_blob_new`
	- Remove `git_blob_set_rawcontent`
	- Remove `git_blob_set_rawcontent_fromfile`
	- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
	- Change `git_blob_create_fromfile`:
		The `path` argument is now relative to the repository's working dir
	- Add `git_blob_create_frombuffer`

commit.h
--------

	- Remove `git_commit_new`
	- Remove `git_commit_add_parent`
	- Remove `git_commit_set_message`
	- Remove `git_commit_set_committer`
	- Remove `git_commit_set_author`
	- Remove `git_commit_set_tree`

	- Add `git_commit_create`
	- Add `git_commit_create_v`
	- Add `git_commit_create_o`
	- Add `git_commit_create_ov`

tag.h
-----

	- Remove `git_tag_new`
	- Remove `git_tag_set_target`
	- Remove `git_tag_set_name`
	- Remove `git_tag_set_tagger`
	- Remove `git_tag_set_message`

	- Add `git_tag_create`
	- Add `git_tag_create_o`

tree.h
------

	- Change `git_tree_entry_2object`:
		New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`

	- Remove `git_tree_new`
	- Remove `git_tree_add_entry`
	- Remove `git_tree_remove_entry_byindex`
	- Remove `git_tree_remove_entry_byname`
	- Remove `git_tree_clearentries`
	- Remove `git_tree_entry_set_id`
	- Remove `git_tree_entry_set_name`
	- Remove `git_tree_entry_set_attributes`

object.h
------------

	- Remove `git_object_new
	- Remove `git_object_write`

	- Change `git_object_close`:
		This method is now *mandatory*. Not closing an object causes a
		memory leak.

odb.h
-----

	- Remove type `git_rawobj`
	- Remove `git_rawobj_close`
	- Rename `git_rawobj_hash` -> `git_odb_hash`
	- Change `git_odb_hash`:
		New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`

	- Add type `git_odb_object`
	- Add `git_odb_object_close`

	- Change `git_odb_read`:
		New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
	- Change `git_odb_read_header`:
		New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
	- Remove `git_odb_write`
	- Add `git_odb_open_wstream`
	- Add `git_odb_open_rstream`

odb_backend.h
-------------

	- Change type `git_odb_backend`:
		New internal signatures are as follows

			int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
			int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *)

	- Add type `git_odb_stream`
	- Add enum `git_odb_streammode`

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-20 21:45:11 +02:00
Vicent Marti
b5c5f0f808 Fix headers for the new Revision Walker
The "oid.h" header is now included instead of "object.h".

The old "revwalk.h" header has been removed; it was empty.
2011-03-16 23:59:09 +02:00
Vicent Marti
545a6915eb Change interface for Tree Index attr (always unsigned)
Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-05 13:46:37 +02:00
Sakari Jokinen
9de27ad0c4 Check for valid range of attributes for tree entry 2011-03-05 13:46:27 +02:00
Vicent Marti
48c27f86bb Implement reference counting for git_objects
All `git_object` instances looked up from the repository are reference
counted. User is expected to use the new `git_object_close` when an
object is no longer needed to force freeing it.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-03 20:23:52 +02:00
Vicent Marti
86d7e1ca6f Fix searching in git_vector
We now store only one sorting callback that does entry comparison. This
is used when sorting the entries using a quicksort, and when looking for
a specific entry with the new search methods.

The following search methods now exist:

	git_vector_search(vector, entry)
	git_vector_search2(vector, custom_search_callback, key)

	git_vector_bsearch(vector, entry)
	git_vector_bsearch2(vector, custom_search_callback, key)

The sorting state of the vector is now stored internally.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-03 20:23:52 +02:00
Vicent Marti
5de079b86d Change the object creation/lookup API
The methods previously known as

	git_repository_lookup
	git_repository_newobject
	git_repository_lookup_ref

are now part of their respective namespaces:

	git_object_lookup
	git_object_new
	git_reference_lookup

This makes the API more consistent with the new references API.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-03 20:23:52 +02:00