The tree array wasn't being initialized when instantiating a tree object
in memory instead of loading it from disk.
New unit tests added to check for the problem.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Basic write-back & in-memory editing for objects is now tested in t0403
(commits), t0802 (tags) and t0902 (trees).
Add new helper functions in test_helpers.c to remove the loose objects
created when doing write-back tests.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Tag files can now be created and modified in-memory (all the setter
methods have been implemented), and written back to disk using the
generic git_object_write() method.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
git_tree_entry_byname was dereferencing a NULL pointer when the searched
file couldn't be found on the tree.
New test cases have been added to check for entry access methods.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All the setter methods for git_tree have been added, including the
setters for attributes on each git_tree_entry and methods to add/remove
entries of the tree.
Modified trees and trees created in-memory from scratch can be written
back to the repository using git_object_write().
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All repository objects can now be created from scratch in memory using
either the git_object_new() method, or the corresponding git_XXX_new()
for each object.
So far, only git_commits can be written back to disk once created in
memory.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All the required git_commit_set_XXX methods have been implemented; all
the attributes of a commit object can now be modified in-memory.
The new method git_object_write() automatically writes back the
in-memory changes of any object to the repository. So far it only
supports git_commit objects.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The new 'git__source_printf' does an overflow-safe printf on a source
bfufer.
The new 'git__source_write' does an overflow-safe byte write on a source
buffer.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The 'git_obj' structure is now called 'git_rawobj', since
it represents a raw object read from the ODB.
The 'git_repository_object' structure is now called 'git_object',
since it's the base object class for all objects.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
git_repository_object has now several internal methods to write back the
object information in the repository.
- git_repository__dbo_prepare_write()
Prepares the DBO object to be modified
- git_repository__dbo_write()
Writes new bytes to the DBO object
- git_repository__dbo_writeback()
Writes back the changes to the repository
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Added several methods to access:
- The ODB behind a repo
- The SHA1 id behind a generic repo object
- The type of a generic repo object
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Some compilers give linking problems when exporting 'uint32_t' as a
return type in the external API. Use generic types instead.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
A new method 'git_repository_object_free' allows to manually force the
freeing of a repository object, even though they are still automatically
managed by the repository and don't need to be freed by the user.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The interface for loading and parsing tree objects from a repository has
been completed with all the required accesor methods for attributes,
support for manipulating individual tree entries and a new unit test
t0901-readtree which tries to load and parse a tree object from a
repository.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Three new unit tests, t06XX files have been added.
t0601-read: tests for loading index files from disk,
for creating in-memory indexes and for accessing
index entries.
t0602-write: tests for writing index files back to disk
t0603-sort: tests for properly sorting the entries array of an index
Two test indexes have been added in 'tests/resources/':
test/resources/index: a sample index from a libgit2 repository
test/resources/gitgit.index: a sample index from a git.git
repository (includes TREE extension data)
Signed-off-by: Vicent Marti <tanoku@gmail.com>
All the external resources used by the tests are now placed inside the
common 'tests/resources' directory.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The new 'git_index' structure is an in-memory representation
of a git index on disk; the 'git_index_entry' structures represent
each one of the file entries on the index.
The following calls for index instantiation have been added:
git_index_alloc(): instantiate a new index structure
git_index_free(): free an existing index
git_index_clear(): clear all the entires in an existing file
The following calls for index reading and writing have been added:
git_index_read(): update the contents of the index structure from
its file on disk.
Internally implemented through:
git_index__parse()
Index files are stored on disk in network byte order; all integer fields
inside them are properly converted to the machine's byte order when
loading them in memory. The parsing engine also distinguishes
between normal index entries and extended entries with 2 extra bytes
of flags.
The 'TREE' extension for index entries is also loaded into memory:
Tree caches stored in Index files are loaded into the
'git_index_tree' structure pointed by the 'tree' pointer inside
'git_index'.
'index->tree' points to the root node of the tree cache; the full tree
can be traversed through each of the node's 'tree->children'.
Index files can be written back to disk through:
git_index_write(): atomic writing of existing index objects
backed by internal method git_index__write()
The following calls for entry manipulation have been added:
git_index_add(): insert an empty entry to the index
git_index_find(): search an entry by its path name
git_index__append(): appends a new index entry to the end of the
list, resizing the entries array if required
New index entries are always inserted at the end of the array; since the
index entries must be sorted for it to be internally consistent, the
index object is only sorted once, and if required, before accessing the
whole entriea array (e.g. before writing to disk, before traversing,
etc).
git_index__remove_pos(): remove an index entry in a specific position
git_index__sort(): sort the entries in the array by path name
The entries array is sorted stably and in place using an
insertion sort, which ought to be the most efficient approach
since the entries array is always mostly-sorted.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The struct 'git_filelock' represents an atomically-locked
file, git-style.
Locked files can be modified atomically through the new file lock
interface:
int git_filelock_init(git_filelock *lock, const char *path);
int git_filelock_lock(git_filelock *lock, int append);
void git_filelock_unlock(git_filelock *lock);
int git_filelock_commit(git_filelock *lock);
int git_filelock_write(git_filelock *lock, const char *buffer, size_t length);
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The old 'git_revpool' object has been removed and
split into two distinct objects with separate
functionality, in order to have separate methods for
object management and object walking.
* A new object 'git_repository' does the high-level
management of a repository's objects (commits, trees,
tags, etc) on top of a 'git_odb'.
Eventually, it will also manage other repository
attributes (e.g. tag resolution, references, etc).
See: src/git/repository.h
* A new external method
'git_repository_lookup(repo, oid, type)'
has been added to the 'git_repository' API.
All object lookups (git_XXX_lookup()) are now
wrappers to this method, and duplicated code
has been removed. The method does automatic type
checking and returns a generic 'git_revpool_object'
that can be cast to any specific object.
See: src/git/repository.h
* The external methods for object parsing of repository
objects (git_XXX_parse()) have been removed.
Loading objects from the repository is now managed
through the 'lookup' functions. These objects are
loaded with minimal information, and the relevant
parsing is done automatically when the user requests
any of the parsed attributes through accessor methods.
An attribute has been added to 'git_repository' in
order to force the parsing of all the repository objects
immediately after lookup.
See: src/git/commit.h
See: src/git/tag.h
See: src/git/tree.h
* The previous walking functionality of the revpool
is now found in 'git_revwalk', which does the actual
revision walking on a repository; the attributes
when walking through commits in a database have been
decoupled from the actual commit objects.
This increases performance when accessing commits
during the walk and allows to have several
'git_revwalk' instances working at the same time on
top of the same repository, without having to load
commits in memory several times.
See: src/git/revwalk.h
* The old 'git_revpool_table' has been renamed to
'git_hashtable' and now works as a generic hashtable
with support for any kind of object and custom hash
functions.
See: src/hashtable.h
* All the relevant unit tests have been updated, renamed
and grouped accordingly.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Tag objects are now properly loaded from the revision pool.
New test t0801 checks for loading a parsing a series of tags, including
the tag of a tag.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The 'parse_oid' and 'parse_person' methods which were used by the commit
parser are now global so they can be used when parsing other objects.
The 'git_commit_person' struct has been changed to a generic
'git_person'.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Packed objects inside packfiles are now properly unpacked when calling
the git_odb__read_packed() method; delta'ed objects are also properly
generated when needed.
A new unit test 0204-readpack tries to read a couple hundred packed
objects from a standard packed repository.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The basic information (pointed trees and blobs) of each tree object in a
revision pool can now be parsed and queried.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The following new external methods have been added:
GIT_EXTERN(const char *) git_commit_message_short(git_commit *commit);
GIT_EXTERN(const char *) git_commit_message(git_commit *commit);
GIT_EXTERN(time_t) git_commit_time(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_committer(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_author(git_commit *commit);
GIT_EXTERN(const git_tree *) git_commit_tree(git_commit *commit);
A new structure, git_commit_person has been added to represent a
commit's author or committer.
The parsing of a commit has been split in two phases.
When adding a commit to the revision pool:
- the commit's ODB object is opened
- its raw contents are parsed for commit TIME, PARENTS and TREE
(the minimal amount of data required to traverse the pool)
- the commit's ODB object is closed
When querying for extended information on a commit:
- the commit's ODB object is reopened
- its raw contents are parsed for the requested information
- the commit's ODB object remains open to handle additional queries
New unit tests have been added for the new functionality:
In t0401-parse: parse_person_test
In t0402-details: query_details_test
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Commits now store pointers to their tree objects.
Tree objects now work as separate git_revpool_object
entities.
Tree objects can be loaded and parsed inedependently
from commits.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
git_revpool_object now has a type identifier for each object
type in a revpool (commits, trees, blobs, etc).
Trees can now be stored in the revision pool.
git_revpool_tableit now supports filtering objects by their
type when iterating through the object table.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
"t0502-table" tests for basic functionality of the objects
table:
table_create (creating a new object table)
table_populate (fill & lookup on the object table)
table_resize (dynamically resize the table)
"t0503-tableit" tests the iterator for object tables:
table_iterator (make sure the iterator reaches all objects)
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Created commit objects in t0401-parse weren't being freed properly.
Updated the API documentation to note that commit objects are owned
by the revision pool and should not be freed manually.
The parents list of each commit was being freed twice after each test.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Previously the objects table was being freed, but not
the actuall commits. All git_commit objects are freed
and hence invalidated when freeing the git_rp object
they belong to.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
This fix had been delayed by Ramsay because on 32-bit systems it
highlights the fact that off_t is set to an invalid value.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
This reduces the global namespace pollution. These functions
were the only remaining external symbols (with the exception
of an PPC_SHA1 build) which did not start with 'git', and
since these are private library symbols the 'git__' prefix is
appropriate.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Given that the sha1.h header file should never be included into
any other file, since it represents an implementation detail of
hash.c, we remove the header and inline it's content.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
On Intel machines, the msvc compiler defines the CPU architecture
macros _M_IX86 and _M_X64 (equivalent to __i386__ and __x86_64__
respectively). Use these macros in the pre-processor expression
to select the "fast" definition of the {get,put}_be32() macros.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
When git_oid_to_string() was passed a buffer size larger than
GIT_OID_HEXSZ+1, the function placed the c-string NUL char at
the wrong position. Fix the code to place the NUL at the end
of the (possibly truncated) oid string.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
In order to avoid inconsistent definitions of type off_t, all
compilation units should include the "common.h" header file
before certain system headers (those which directly or indirectly
lead to the definition of off_t). The "common.h" header contains
the definition of _FILE_OFFSET_BITS to select 64-bit file offsets.
The symptom of this inconsistency, while compiling with -Wextra, is
the following warning:
In file included from src/common.h:50,
from src/commit.c:28:
src/util.h: In function git__is_sizet:
src/util.h:41: warning: comparison between signed and unsigned
In order to fix the problem, we simply remove the #include <time.h>
statement at the head of src/commit.c. Note that src/commit.h also
includes <time.h>.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
gcc (4.4.0) issues the following warning:
src/revobject.c:33: warning: dereferencing type-punned pointer \
will break strict-aliasing rules
We suppress the warning by copying the first 4 bytes from the oid
structure into an 'unsigned int' using memcpy(). This will also
fix any potential alignment issues on certain platforms.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, doxygen issues the following warning:
.../src/git/revwalk.h:86: Warning: The following parameters of \
gitrp_sorting(git_revpool *pool, unsigned int sort_mode) are \
not documented:
parameter 'pool'
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, sparse issues the following warnings:
src/revobject.c:29:14: warning: symbol 'max_load_factor' was \
not declared. Should it be static?
src/revobject.c:31:14: warning: symbol 'git_revpool_table__hash' was \
not declared. Should it be static?
In order to suppress these warnings, we simply declare them as
static, since they are not (currently) referenced outside of this
file.
In the case of max_load_factor, this is probably correct. However,
this may not be appropriate for git_revpool_table__hash(), given
how it is named. So, this should either be re-named to reflect it's
non-external status, or a declaration needs to be added to the
revobject.h header file.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In order to suppress this warning, we could simply replace the
constant 0 with NULL. However, in this case, replacing the
comparison with 0 by !buffer is more idiomatic.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
For more recent versions of msvc, the time_t type, as returned by
the time() function, is a 64-bit type. The srand() function, however,
expects an 'unsigned int' input parameter, leading to the warning.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, the compiler issues the following warning:
src/revwalk.c(61) : warning C4244: '=' : conversion from \
'unsigned int' to 'unsigned char', possible loss of data
In order to suppress the warning, we change the type of the
sorting "enum" field of the git_revpool structure to be consistent
with the sort_mode parameter of the gitrp_sorting() function.
Note that if the size of the git_revpool structure is an issue,
then we could change the type of the sort_mode parameter instead.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>