"git_config_backend" have been renamed to "git_config_file", which
implements a generic interface to access a configuration file -- be it
either on disk, from a DB or whatever mumbojumbo.
I think this makes more sense.
Regarding "initialize" vs. "initialise", www.dict.cc says the first is American
English whereas the latter in British English. For consistency, we should
stick to American English.
When I changed it over to use different strings for the variable and
the name, cvar_name_normalize was left behind. Fix this and rename to
cvar_normalize_name to reflect the incompatible change.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Now we use a simple closed-addressing cache. Cuckoo hashing was creating
too many issues with race conditions. Fuck that.
Let's see what happens performance wise, we may have to roll back or
come up with another way to implement an efficient multi-threaded cache.
Configuration options can come from different sources. Currently,
there is only support for reading them from a flat file, but it might
make sense to read it from a database at some point.
Move the parsing code into src/config_file.c and create an include
file include/git2/config_backend.h to allow for other backends to be
developed.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Ok, this is the real deal. Hopefully. Here's how it's going to work:
- One main method, called `git__throw`, that sets the error
code and error message when an error happens.
This method must be called in every single place where an error
code was being returned previously, setting an error message
instead.
Example, instead of:
return GIT_EOBJCORRUPTED;
Use:
return git__throw(GIT_EOBJCORRUPTED,
"The object is missing a finalizing line feed");
And instead of:
[...] {
error = GIT_EOBJCORRUPTED;
goto cleanup;
}
Use:
[...] {
error = git__throw(GIT_EOBJCORRUPTED, "What an error!");
goto cleanup;
}
The **only** exception to this are the allocation methods, which
return NULL on failure but already set the message manually.
/* only place where an error code can be returned directly,
because the error message has already been set by the wrapper */
if (foo == NULL)
return GIT_ENOMEM;
- One secondary method, called `git__rethrow`, which can be used to
fine-grain an error message and build an error stack.
Example, instead of:
if ((error = foobar(baz)) < GIT_SUCCESS)
return error;
You can now do:
if ((error = foobar(baz)) < GIT_SUCCESS)
return git__rethrow(error, "Failed to do a major operation");
The return of the `git_lasterror` method will be a string in the
shape of:
"Failed to do a major operation. (Failed to do an internal
operation)"
E.g.
"Failed to open the index. (Not enough permissions to access
'/path/to/index')."
NOTE: do not abuse this method. Try to write all `git__throw`
messages in a descriptive manner, to avoid having to rethrow them to
clarify their meaning.
This method should only be used in the places where the original
error message set by a subroutine is not specific enough.
It is encouraged to continue using this style as much possible to
enforce error propagation:
if ((error = foobar(baz)) < GIT_SUCCESS)
return error; /* `foobar` has set an error message, and
we are just propagating it */
The error handling revamp will take place in two phases:
- Phase 1: Replace all pieces of code that return direct error codes
with calls to `git__throw`. This can be done semi-automatically
using `ack` to locate all the error codes that must be replaced.
- Phase 2: Add some `git__rethrow` calls in those cases where the
original error messages are not specific enough.
Phase 1 is the main goal. A minor libgit2 release will be shipped once
Phase 1 is ready, and the work will start on gradually improving the
error handling mechanism by refining specific error messages.
OTHER NOTES:
- When writing error messages, please refrain from using weasel
words. They add verbosity to the message without giving any real
information. (<3 Emeric)
E.g.
"The reference file appears to be missing a carriage return"
Nope.
"The reference file is missing a carriage return"
Yes.
- When calling `git__throw`, please try to use more generic error
codes so we can eventually reduce the list of error codes to
something more reasonable. Feel free to add new, more generic error
codes if these are going to replace several of the old ones.
E.g.
return GIT_EREFCORRUPTED;
Can be turned into:
return git__throw(GIT_EOBJCORRUPTED,
"The reference is corrupted");
Win32 critical section objects (CRITICAL_SECTION) are not kernel objects.
Only kernel objects are destroyed by using CloseHandle. Critical sections
are supposed to be deleted with the DeleteCriticalSection API
(http://msdn.microsoft.com/en-us/library/ms682552(VS.85).aspx).
The section and variable names use different rules, so store them as
two different variables internally.
This will simplify the configuration-writing code as well later on,
but even with parsing, the code is simpler.
Take this opportunity to add a variable to the list directly when
parsing instead of passing through config_set.
A root commit is a commit whose branch (usually what HEAD points to)
doesn't exist (yet). This situation can happen when the commit is the
first after 1) a repository is initialized or 2) a orphan checkout has
been performed.
Take this opportunity to remove the symbolic link check, as
git_reference_resolve works on OID refs as well.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Typical use is git_reference_resolve(&ref, ref). Currently, if there is
an error, ref will point to NULL, causing the user to lose that
reference.
Always update resolved_ref instead of just on finding an OID ref,
storing the last valid reference in it.
This change helps simplify the code for allowing root commits.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Magic constant replaced by direct to-string covertion because of:
1) with value length 6 (040000 - subtree) final tree will be corrupted;
2) for wrong values length <6 final tree will be corrupted too.
Removed the optional `replace` argument, we now have 4 add methods:
`git_index_add`: add or update from path
`git_index_add2`: add or update from struct
`git_index_append`: add without replacing from path
`git_index_append2`: add without replacing from struct
Yes, this breaks the bindings.
New external functions:
- git_index_unmerged_entrycount: Counts the unmerged entries in
the index
- git_index_get_unmerged: Gets an unmerged entry from the index
by name
New internal functions:
- read_unmerged: Wrapper for read_unmerged_internal
- read_unmerged_internal: Reads unmerged entries from the index
if the index has the INDEX_EXT_UNMERGED_SIG set
- unmerged_srch: Search function for unmerged vector
- unmerged_cmp: Compare function for unmerged vector
New data structures:
- git_index now contains a git_vector unmerged that stores
unmerged entries
- git_index_entry_unmerged: Representation of an unmerged file
entry. It represents all three versions of the file at the
same time, with one name, three modes and three OIDs
When in the middle of a merge, the index needs to contain several files
with the same name. git_index_insert() used to prevent this by not adding a new entry if an entry with the same name already existed.
Most tags will have a timestamp of whenever the code is running and
dealing with time and timezones is error-prone. Optimize for this case
by adding a function which causes the signature to be created with a
current timestamp.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
git_repository_path() and git_repository_workdir() respectively return the path to the git repository and the working directory. Those paths are absolute and normalized.
Don't blindly pass the target type to git_tag_type2string as it will
give an empty string on GIT_OBJ_ANY which would cause us to create an
invalid tag object.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
We cannot assume that Redis is never going to return an error code; when
Reddit fails, we cannot crash our library, we need to handle the crash
gracefully.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
long int is a safer type than int unless the user knows that the
variable is going to be quite small.
The code has been reworked to use strtol instead of the more
complicated sscanf.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Many error paths freed their local data althought it is freed later on
when the end of the function notices that there was an error. This can
cause double frees and invalid memory access.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Make cvar_free return void instad of the next element, as it was
mostly a hack to make cvar_list_free shorter but it's now using the
list macros.
Also check if the input is NULL and return immediately in that case.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
There is no need to keep config file in memory until the the
configuration is freed. Free the buffer immediately after the
configuration has been parsed.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If a variable name appears on its own in a line, it's assumed the
value is true. Store the variable name as NULL in that case.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If a line ends at EOF there is no need to check for the newline
character and doing so will cause us to read memory beyond the
allocatd memory as we check for the Windows-style new-line, which is
two bytes long.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If a variable value has the traditional continuation character (\) as
the last non-space character in the line, then we continue reading the
value on the next line.
Using more than two lines is also supported.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Make header and variable parse functions use their own buffers instead
of giving them the line they need to read as a parameter which they
mostly ignore.
This is in preparation for multiline configuration variables.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Streaming writes will no longer fail when writing to a backend that
doesn't support streaming writes but supports direct ones.
Now we create a fake stream on memory and then write it as a single
block using the backend `write` callback.
A variable name is stored internally with its section the way it
appeared in the configuration file in order to have the information
about what parts are case-sensitive inline.
Really implement parse_section_header_ext and move the assignment of
variables to config_parse.
The variable name matching is now done in a case-away way by
cvar_name_match and cvar_section_match. Before the user sees it, it's
normalized to the two- or three-dot version.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Such a list preserves the order the variables were first read in which
will be useful later for merging different data-sets. Furthermore,
reading and writing out the same configuration should not reorganize
the variables, which could happen when iterating through all the items
in a hash table.
A hash table is overkill for this small a data-set anyway.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
git_config_open shouldn't have to initialise variables that are only
used inside config_parse and its callees.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Config variables should be interpreted at run-time, as we don't know if a
zero means false or zero, or if yes means true or "yes".
As a variable has no intrinsic type, git_cvtype is gone and the public
API takes care of enforcing a few rules.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Allow any well-formed reference name to live under refs/ removing the
condition that they be under refs/{heads,tags,remotes}/ as was the
design of git.
An exception is made for HEAD which is allowed to contain an OID
reference in detached HEAD state.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Add internal reference create and rename functions which take a force
parameter, telling them to overwrite an existing reference if it
exists.
These functions try to update the reference if it's of the same type
as the one it's going to be replaced by. Otherwise the old reference
becomes invalid.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
These functions can be used to query or modify the variables in a
given configuration. No sanity checking is done on the variable names.
This is mostly meant as an API preview.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If cfg_readline consumes the line, then parse_section_header will read
past it and if we read a character, parse_variable won't have the full
name.
This solution is a bit hackish, but it's the simplest way to get the
code to parse correctly.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Save the location of the name in section_out instead of returning it
as an int. Use the return code to signal success or failure.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Expose the tag parsing capabilities already present in the
library.
Exporting this function makes it possible to implement the
mktag command without duplicating this functionality.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
List all the references in the repository, calling a custom
callback for each one.
The listed references may be filtered by type, or using
a bitwise OR of several types. Use the magic value
`GIT_REF_LISTALL` to obtain all references, including
packed ones.
The `callback` function will be called for each of the references
in the repository, and will receive the name of the reference and
the `payload` value passed to this method.
The current behaviour of git_index_open{bare,inrepo}() is unexpected.
When an index is opened, an in-memory index object is created that is
linked to the index discovered by git_repository_open(). However, this
index object is empty, as the on-disk index is not read. To fully open
the on-disk index file, git_index_read() has to be called. This leads to
confusing behaviour. Consider the following code:
git_index *idx;
git_index_open_inrepo(&idx, repo);
git_index_write(idx);
You would expect this to have no effect, as the index is never
ostensibly manipulated. However, what actually happens is that the index
entries are removed from the on-disk index because the empty in-memory
index object created by open_inrepo() is written back to the disk.
This patch reads the index after opening it.
Temporary files when doing streaming writes are now stored inside the
Objects folder, to prevent issues when moving files between
disks/partitions.
Add support for block writes to the ODB again (for those backends that
cannot implement streaming).
When the system temporary folder is located on a different volume than the working directory into which libgit2 is executing, MoveFileEx() requires an additional flag.
Hey. Apologies in advance -- I broke your bindings.
This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.
Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!
Major features include:
- Real caching and refcounting on parsed objects
- Real caching and refcounting on objects read from the ODB
- Streaming writes & reads from the ODB
- Single-method writes for all object types
- The external API is now partially thread-safe
The speed increases are significant in all aspects, specially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.
Here's a full changelog for the external API:
blob.h
------
- Remove `git_blob_new`
- Remove `git_blob_set_rawcontent`
- Remove `git_blob_set_rawcontent_fromfile`
- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
- Change `git_blob_create_fromfile`:
The `path` argument is now relative to the repository's working dir
- Add `git_blob_create_frombuffer`
commit.h
--------
- Remove `git_commit_new`
- Remove `git_commit_add_parent`
- Remove `git_commit_set_message`
- Remove `git_commit_set_committer`
- Remove `git_commit_set_author`
- Remove `git_commit_set_tree`
- Add `git_commit_create`
- Add `git_commit_create_v`
- Add `git_commit_create_o`
- Add `git_commit_create_ov`
tag.h
-----
- Remove `git_tag_new`
- Remove `git_tag_set_target`
- Remove `git_tag_set_name`
- Remove `git_tag_set_tagger`
- Remove `git_tag_set_message`
- Add `git_tag_create`
- Add `git_tag_create_o`
tree.h
------
- Change `git_tree_entry_2object`:
New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`
- Remove `git_tree_new`
- Remove `git_tree_add_entry`
- Remove `git_tree_remove_entry_byindex`
- Remove `git_tree_remove_entry_byname`
- Remove `git_tree_clearentries`
- Remove `git_tree_entry_set_id`
- Remove `git_tree_entry_set_name`
- Remove `git_tree_entry_set_attributes`
object.h
------------
- Remove `git_object_new
- Remove `git_object_write`
- Change `git_object_close`:
This method is now *mandatory*. Not closing an object causes a
memory leak.
odb.h
-----
- Remove type `git_rawobj`
- Remove `git_rawobj_close`
- Rename `git_rawobj_hash` -> `git_odb_hash`
- Change `git_odb_hash`:
New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`
- Add type `git_odb_object`
- Add `git_odb_object_close`
- Change `git_odb_read`:
New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
- Change `git_odb_read_header`:
New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
- Remove `git_odb_write`
- Add `git_odb_open_wstream`
- Add `git_odb_open_rstream`
odb_backend.h
-------------
- Change type `git_odb_backend`:
New internal signatures are as follows
int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *)
- Add type `git_odb_stream`
- Add enum `git_odb_streammode`
Signed-off-by: Vicent Marti <tanoku@gmail.com>
We now depend on libpthread on all Unix platforms (should be installed
by default) and use a simple wrapper for Windows threads under Win32.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
It's no longer retarded. All object interdependencies are stored as OIDs
instead of actual objects. This should be hundreds of times faster,
specially on big repositories. Heck, who knows, maye it doesn't even
segfault -- wouldn't that be awesome?
What has changed on the API?
`git_commit_parent`, `git_commit_tree`, `git_tag_target` now return
their values through a pointer-to-pointer, and have an error code.
`git_commit_set_tree` and `git_tag_set_target` now return an error
code and may fail.
`git_repository_free__no_gc` has been deprecated because it's
stupid. Since there are no longer any interdependencies between
objects, we don't need internal reference counting, and GC
never fails or double-free's pointers.
`git_object_close` now does a very sane thing: marks an object
as unused. Closed objects will be eventually free'd from the
object cache based on LRU. Please use `git_object_close` from
the garbage collector `destroy` method on your bindings. It's
100% safe.
`git_repository_gc` is a new method that forces a garbage collector
pass through the repo, to free as many LRU objects as possible.
This is useful if we are running out of memory.
The new pack backend is an adaptation of the original git.git code in
`sha1_file.c`. It's slightly faster than the previous version and
severely less memory-hungry.
The call-stack of a normal pack backend query has been properly
documented in the top of the header for future reference. And by
properly I mean with ASCII diagrams 'n shit.
The new revision walker uses an internal Commit object storage system,
custom memory allocator and much improved topological and time sorting
algorithms. It's about 20x times faster than the previous implementation
when browsing big repositories.
The following external API calls have changed:
`git_revwalk_next` returns an OID instead of a full commit object.
The initial call to `git_revwalk_next` is no longer blocking when
iterating through a repo with a time-sorting mode.
Iterating with Topological or inverted modes still makes the initial
call blocking to preprocess the commit list, but this block should be
mostly unnoticeable on most repositories (topological preprocessing
times at 0.3s on the git.git repo).
`git_revwalk_push` and `git_revwalk_hide` now take an OID instead
of a full commit object.
Set of methods to find the minimal-length to uniquely identify every OID
in a list. Useful for GUI applications, commit logs and so on.
Includes stress test.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Set of methods to find the minimal-length to uniquely identify every OID
in a list.
Includes stress test.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
We cannot make sure that the user doesn't use the same buffer as source
and destination, so write to it using memmove.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Disable garbage collection of cross-references to prevent
double-freeing. Internal reference management is now done
with a separate method.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
- Added several missing reference increases
- Add new destructor to the repository that does not GC the objects
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All `git_object` instances looked up from the repository are reference
counted. User is expected to use the new `git_object_close` when an
object is no longer needed to force freeing it.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
We now store only one sorting callback that does entry comparison. This
is used when sorting the entries using a quicksort, and when looking for
a specific entry with the new search methods.
The following search methods now exist:
git_vector_search(vector, entry)
git_vector_search2(vector, custom_search_callback, key)
git_vector_bsearch(vector, entry)
git_vector_bsearch2(vector, custom_search_callback, key)
The sorting state of the vector is now stored internally.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The methods previously known as
git_repository_lookup
git_repository_newobject
git_repository_lookup_ref
are now part of their respective namespaces:
git_object_lookup
git_object_new
git_reference_lookup
This makes the API more consistent with the new references API.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The following methods have been implemented:
git_reference_packall
git_reference_rename
git_reference_delete
The library now has full support for packed references, including
partial and total writing. Internal documentation has been updated with
the details.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
These two reference types are now stored separately to eventually allow
the removal/renaming of loose references and rewriting of the refs
packfile.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
We now use MoveFileEx, which is not assured to be atomic but works for
always (both if the destination exists, or if it doesn't) and is
available in MinGW.
Since this is a Win32 API call, complaint about lost or overwritten files
should be forwarded at Steve Ballmer.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The `rename` call doesn't quite work on Win32: expects the destination
file to not exist. We're using a native Win32 call in those cases --
that should do the trick.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The old hash table with chained buckets has been replaced by a new one
using Cuckoo hashing, which offers guaranteed constant lookup times.
This should improve speeds on most use cases, since hash tables in
libgit2 are usually used as caches where the objects are stored once and
queried several times.
The Cuckoo hash implementation is based off the one in the Basekit
library [1] for the IO language, but rewritten to support an arbritrary
number of hashes. We currently use 3 to maximize the usage of the nodes pool.
[1]: https://github.com/stevedekorte/basekit/blob/master/source/CHash.c
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The new `git_filebuf` structure provides atomic high-performance writes
to disk by using a write cache, and optionally a double-buffered scheme
through a worker thread (not enabled yet).
Writes can be done 3-layered, like in git.git (user code -> write cache
-> disk), or 2-layered, by writing directly on the cache. This makes
index writing considerably faster.
The `git_filebuf` structure contains all the old functionality of
`git_filelock` for atomic file writes and reads. The `git_filelock`
structure has been removed.
Additionally, the `git_filebuf` API allows to automatically hash (SHA1)
all the data as it is written to disk (hashing is done smartly on big
chunks to improve performance).
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The interlocking on the write threads was not being done properly (index
entries were sometimes written out of order). With proper interlocking,
the threaded write is only marginally faster on big index files, and
slower on the smaller ones because of the overhead when creating
threads.
The threaded index writing has been temporarily disabled; after more
accurate benchmarks, if might be possible to enable it again only when
writing very large index files (> 1000 entries).
Signed-off-by: Vicent Marti <tanoku@gmail.com>
64-bit types stored in memory have to be truncated into 32 bits when
writing to disk. Was causing warnings in MSVC.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
In response to issue #60 (git_index_write really slow), the write_index
function has been rewritten to improve its performance -- it should now
be in par with the performance of git.git.
On top of that, if Posix Threads are available when compiling libgit2, a
new threaded writing system will be used (3 separate threads take care
of solving byte-endianness, hashing the contents of the index and
writing to disk, respectively). For very long Index files, this method
is up to 3x times faster than git.git.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The priority value for different backends has been removed from the
public `git_odb_backend` struct. We handle that internally. The priority
value is specified on the `git_odb_add_alternate`.
This is convenient because it allows us to poll a backend twice with
different priorities without having to instantiate it twice.
We also differentiate between main backends and alternates; alternates have
lower priority and cannot be written to.
These changes come with some unit tests to make sure that the backend
sorting is consistent.
The libgit2 version has been bumped to 0.4.0.
This commit changes the external API:
CHANGED:
struct git_odb_backend
No longer has a `priority` attribute; priority for the backend
in managed internally by the library.
git_odb_add_backend(git_odb *odb, git_odb_backend *backend, int priority)
Now takes an additional priority parameter, the priority that
will be given to the backend.
ADDED:
git_odb_add_alternate(git_odb *odb, git_odb_backend *backend, int priority)
Add a backend as an alternate. Alternate backends have always
lower priority than main backends, and writing is disabled on
them.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The alternates file is now parsed, and the alternate ODB folders are
added as separate backends. This allows the library to efficiently query
the alternate folders.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The `git__joinpath` function has been changed to use a statically
allocated buffer; we assume the buffer to be 4096 bytes, because fuck
you.
The new method also supports an arbritrary number of paths to join,
which may come in handy in the future.
Some methods which were manually joining paths with `strcpy` now use the
new function, namely those in `index.c` and `refs.c`.
Based on Emeric Fermas' original patch, which was using the old
`git__joinpath` because I'm stupid. Thanks!
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Removed `git_tree_add_entry_unsorted`. Now the `git_tree_add_entry`
method doesn't sort the entries array by default; entries are only
sorted lazily when required. This is done automatically by the library
(the `git_tree_sort_entries` call has been removed).
This should improve performance. No point on sorting entries all the time, anyway.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
We now have proper sonames in Mac OS X and Linux, proper versioning on
the pkg-config file and proper DLL naming in Windows.
The version of the library is defined exclusively in 'src/git2.h'; the build scripts
read it from there automatically.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Configure again the build system to look for SQLite3. If the library is
found, the SQLite backend will be automatically compiled.
Enjoy *very* fast reads and writes.
MASTER PROTIP: Initialize the backend with ":memory" as the path to the
SQLite database for fully-hosted in-memory repositories. Rejoice.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The `dirname` and `dirbase` methods have been replaced with the Android
implementation, which is actually compilant to some kind of standard.
A new method `topdir` has been added, which returns the topmost
directory in a path.
These changes fix issue #49:
`gitfo_prettify_dir_path` converts "./.git/" to ".git/", so
the code at src/repository.c:190 goes out of bounds when
trying to find the topmost directory.
The new `git__topdir` method handles this gracefully, and the
fixed `git__dirname` now returns the proper value for the
repository's working dir.
E.g.
/repo/.git/ ==> working dir '/repo/'
.git/ ==> working dir '.'
Signed-off-by: Vicent Marti <tanoku@gmail.com>
git_revwalk_next now returns an error code when the iteration is over.
git_repository_index now returns an error code when the index file could
not be opened.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Don't allow access to any tree entries whilst the entries array is
unsorted. We keep track on when the array is unsorted, and any methods
that access the array while it is unsorted now sort the array before
accessing it.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
If plain strcmp is used, as this code did before, the final sorting may
end up different from what git-add would do (for example, 'boost'
appearing before 'boost-build.jam', because Git sorts as if it were
spelled 'boost/').
If the sorting is incorrect like this, Git 1.7.4 insists that unmodified
files have been modified. For example, my test repository has these
four entries:
drwxr-xr-x 199 johnw wheel 6766 Feb 2 17:21 boost
-rw-r--r-- 1 johnw wheel 849 Feb 2 17:22 boost-build.jam
-rw-r--r-- 1 johnw wheel 989 Feb 2 17:21 boost.css
-rw-r--r-- 1 johnw wheel 6308 Feb 2 17:21 boost.png
Here is the output from git-ls-tree for these files, in a commit tree
created using git-add and git-commit:
100644 blob 8b8775433aef73e9e12609610ae2e35cf1e7ec2c boost-build.jam
100644 blob 986c4050fa96d825a1311c8e871cdcc9a3e0d2c3 boost.css
100644 blob b4d51fcd5c9149fd77f5ca6ed2b6b1b70e8fe24f boost.png
040000 tree 46537eeaa4d577010f19b1c9e940cae9a670ff5c boost
Here is the output for the same commit produced using libgit2:
040000 tree c27c0fd1436f28a6ba99acd0a6c17d178ed58288 boost
100644 blob 8b8775433aef73e9e12609610ae2e35cf1e7ec2c boost-build.jam
100644 blob 986c4050fa96d825a1311c8e871cdcc9a3e0d2c3 boost.css
100644 blob b4d51fcd5c9149fd77f5ca6ed2b6b1b70e8fe24f boost.png
Due to this reordering, git-status claims the three blobs are always
modified, no matter what I do using git-read-tree or git-reset or
git-checkout to update the index.
Several changes have been committed to allow the user to create
in-memory references and write back to disk. Peeling of symbolic
references has been made explicit. Added getter and setter methods for
all attributes on a reference. Added corresponding documentation.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All the commits have been squashed into a single one before refactoring
the final code, to keep everything tidy.
Individual commit messages are as follows:
Added repository reference looking up functionality placeholder.
Added basic reference database definition and caching infrastructure.
Removed useless constant.
Added GIT_EINVALIDREFNAME error and description. Added missing description for GIT_EBAREINDEX.
Added GIT_EREFCORRUPTED error and description.
Added GIT_ETOONESTEDSYMREF error and description.
Added resolving of direct and symbolic references.
Prepared the packed-refs parsing.
Added parsing of the packed-refs file content.
When no loose reference has been found, the full content of the packed-refs file is parsed. All of the new (i.e. not previously parsed as a loose reference) references are eagerly stored in the cached references storage.
The method packed_reference_file__parse() is in deer need of some refactoring. :-)
Extracted to a method the parsing of the peeled target of a tag.
Extracted to a method the parsing of a standard packed ref.
Fixed leaky removal of the cached references.
Ensured that a previously parsed packed reference isn't returned if a more up-to-date loose reference exists.
Enhanced documentation of git_repository_reference_lookup().
Moved some refs related constants from repository.c to refs.h.
Made parsing of a packed tag reference more robust.
Updated git_repository_reference_lookup() documentation.
Added some references to the test repository.
Added some tests covering tag references looking up.
Added some tests covering symbolic and head references looking up.
Added some tests covering packed references looking up.
Yes, we are breaking the API. Alpha software, deal with it.
We need a way of getting a pointer to each newly added entry to the
index, because manually looking up the entry after creation is
outrageously expensive.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
In-memory tree objects were not being properly initialized, because the
internal entries vector was created on the 'parse' method.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Don't need a brand new header for two typedefs when we already have a
types.h header.
Change comment style to ANSI C.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Clean up a provided absolute or relative directory path.
This prettification relies on basic operations such as coalescing multiple forward slashes into a single slash, removing '.' and './' current directory segments, and removing parent directory whenever '..' is encountered. If not empty, the returned path ends with a forward slash.
For instance, this will turn "d1/s1///s2/..//../s3" into "d1/s3/".
This only performs a string based analysis of the path. No checks are done to make sure the path actually makes sense from the file system perspective.