The basic information (pointed trees and blobs) of each tree object in a
revision pool can now be parsed and queried.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
The following new external methods have been added:
GIT_EXTERN(const char *) git_commit_message_short(git_commit *commit);
GIT_EXTERN(const char *) git_commit_message(git_commit *commit);
GIT_EXTERN(time_t) git_commit_time(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_committer(git_commit *commit);
GIT_EXTERN(const git_commit_person *) git_commit_author(git_commit *commit);
GIT_EXTERN(const git_tree *) git_commit_tree(git_commit *commit);
A new structure, git_commit_person has been added to represent a
commit's author or committer.
The parsing of a commit has been split in two phases.
When adding a commit to the revision pool:
- the commit's ODB object is opened
- its raw contents are parsed for commit TIME, PARENTS and TREE
(the minimal amount of data required to traverse the pool)
- the commit's ODB object is closed
When querying for extended information on a commit:
- the commit's ODB object is reopened
- its raw contents are parsed for the requested information
- the commit's ODB object remains open to handle additional queries
New unit tests have been added for the new functionality:
In t0401-parse: parse_person_test
In t0402-details: query_details_test
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Commits now store pointers to their tree objects.
Tree objects now work as separate git_revpool_object
entities.
Tree objects can be loaded and parsed inedependently
from commits.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
git_revpool_object now has a type identifier for each object
type in a revpool (commits, trees, blobs, etc).
Trees can now be stored in the revision pool.
git_revpool_tableit now supports filtering objects by their
type when iterating through the object table.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Created commit objects in t0401-parse weren't being freed properly.
Updated the API documentation to note that commit objects are owned
by the revision pool and should not be freed manually.
The parents list of each commit was being freed twice after each test.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Previously the objects table was being freed, but not
the actuall commits. All git_commit objects are freed
and hence invalidated when freeing the git_rp object
they belong to.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
This fix had been delayed by Ramsay because on 32-bit systems it
highlights the fact that off_t is set to an invalid value.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
This reduces the global namespace pollution. These functions
were the only remaining external symbols (with the exception
of an PPC_SHA1 build) which did not start with 'git', and
since these are private library symbols the 'git__' prefix is
appropriate.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Given that the sha1.h header file should never be included into
any other file, since it represents an implementation detail of
hash.c, we remove the header and inline it's content.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
On Intel machines, the msvc compiler defines the CPU architecture
macros _M_IX86 and _M_X64 (equivalent to __i386__ and __x86_64__
respectively). Use these macros in the pre-processor expression
to select the "fast" definition of the {get,put}_be32() macros.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
When git_oid_to_string() was passed a buffer size larger than
GIT_OID_HEXSZ+1, the function placed the c-string NUL char at
the wrong position. Fix the code to place the NUL at the end
of the (possibly truncated) oid string.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
In order to avoid inconsistent definitions of type off_t, all
compilation units should include the "common.h" header file
before certain system headers (those which directly or indirectly
lead to the definition of off_t). The "common.h" header contains
the definition of _FILE_OFFSET_BITS to select 64-bit file offsets.
The symptom of this inconsistency, while compiling with -Wextra, is
the following warning:
In file included from src/common.h:50,
from src/commit.c:28:
src/util.h: In function git__is_sizet:
src/util.h:41: warning: comparison between signed and unsigned
In order to fix the problem, we simply remove the #include <time.h>
statement at the head of src/commit.c. Note that src/commit.h also
includes <time.h>.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
gcc (4.4.0) issues the following warning:
src/revobject.c:33: warning: dereferencing type-punned pointer \
will break strict-aliasing rules
We suppress the warning by copying the first 4 bytes from the oid
structure into an 'unsigned int' using memcpy(). This will also
fix any potential alignment issues on certain platforms.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, doxygen issues the following warning:
.../src/git/revwalk.h:86: Warning: The following parameters of \
gitrp_sorting(git_revpool *pool, unsigned int sort_mode) are \
not documented:
parameter 'pool'
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, sparse issues the following warnings:
src/revobject.c:29:14: warning: symbol 'max_load_factor' was \
not declared. Should it be static?
src/revobject.c:31:14: warning: symbol 'git_revpool_table__hash' was \
not declared. Should it be static?
In order to suppress these warnings, we simply declare them as
static, since they are not (currently) referenced outside of this
file.
In the case of max_load_factor, this is probably correct. However,
this may not be appropriate for git_revpool_table__hash(), given
how it is named. So, this should either be re-named to reflect it's
non-external status, or a declaration needs to be added to the
revobject.h header file.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In order to suppress this warning, we could simply replace the
constant 0 with NULL. However, in this case, replacing the
comparison with 0 by !buffer is more idiomatic.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, the compiler issues the following warning:
src/revwalk.c(61) : warning C4244: '=' : conversion from \
'unsigned int' to 'unsigned char', possible loss of data
In order to suppress the warning, we change the type of the
sorting "enum" field of the git_revpool structure to be consistent
with the sort_mode parameter of the gitrp_sorting() function.
Note that if the size of the git_revpool structure is an issue,
then we could change the type of the sort_mode parameter instead.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, the compiler issues the following warnings:
src/revobject.c(29) : warning C4305: 'initializing' : truncation \
from 'double' to 'const float'
src/revobject.c(56) : warning C4244: '=' : conversion from \
'const float' to 'unsigned int', possible loss of data
src/revobject.c(149) : warning C4244: '=' : conversion from \
'const float' to 'unsigned int', possible loss of data
In order to suppress the warnings we change the type of max_load_factor
to double, rather than change the initialiser to 0.65f, and cast the
result type of the expressions to 'unsigned int' as expected by the
assignment operators. Note that double should be able to represent all
unsigned int values without loss.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
These warnings are issued by both gcc (-Wextra) and msvc (-W3).
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
On the msvc build, the tests t0401-parse and t0501-walk both
crash with a runtime error (ACCESS_VIOLATION). This is caused
by writing to un-allocated memory due to an under-allocation
of a git_revpool_table data structure.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
sorting ('prev' pointers in the linked list are no longer lost).
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
The GIT_RPSORT_XXX flags have been moved to the external API,
and a new method 'gitrp_sorting(...)' has been added to safely
change the sorting method of a revision pool.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
'git_commit_list_toposort()' and 'git_commit_list_timesort()' now
sort a commit list by topological and time order respectively.
Both sorts are stable and in place.
'git_commit_list_append' has been replaced by 'git_commit_list_push_back'
and 'git_commit_list_push_front'.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
Fixed issue when generating pending commits list during iteration.
The 'git_commit_lookup' function will now check the pool's cache
for commits which have been previously loaded/parsed; there can only
be a single 'git_commit' structure for each commit on the same pool.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
All the objects which will will be eventually transversable from
a revision pool (commits, trees, etc) now inherit from the
'git_revpool_object' structure which identifies them with their
own OID.
Furthermore, the 'git_revpool_table' and related functions have
been added, which allow for constant time lookup (hash table)
of the loaded revpool objects based on their OID.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
The 'gitrp_next()' method now correctly does a revision walking
of all the pushed revisions in arbritary ordering.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
git_commit_lookup() now creates commit references
without loading them from the ODB.
git_commit_parse() creates a commit reference, loads
it and parses it from the ODB.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
Basic support for iterating the revpool.
The following functions of the revwalk API have been partially
implemented:
void gitrp_reset(git_revpool *pool);
void gitrp_push(git_revpool *pool, git_commit *commit);
void gitrp_prepare_walk(git_revpool *pool);
git_commit *gitrp_next(git_revpool *pool);
Parsed commits' parents are now also parsed and stored in a
"git_commit_list" structure (linked list).
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
A few initial tests for commit parsing:
"parse_buffer_test" tests git_commit__parse_buffer() with
several malformed commit messages and a few corner cases
which should pass.
"parse_oid_test" tests git_commit__parse_oid() with several
malformed commit lines containing broken SHA1 OIDs.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
The external API function "git_commit_parse" has been renamed
to "git_commit_lookup" and has been partially implemented with
support for loading commits straight from the ODB. It still lacks
the functionality to lookup cached commits in the revpool and to
resolve tags to commits.
The following internal functions have been partially implemented:
int git_commit__parse_buffer(...);
int git_commit__parse_time(...);
int git_commit__parse_oid(...);
Commits are now fully parsed but the generated parent and tree
references are not handled yet.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, using the normal (or production) compiler
warning level (-W3), msvc complains as follows:
.../sha1.c(244) : warning C4018: '<' : signed/unsigned mismatch
.../sha1.c(270) : warning C4244: 'function' : conversion from \
'unsigned __int64' to 'unsigned long', possible loss of data
.../sha1.c(271) : warning C4244: 'function' : conversion from \
'unsigned __int64' to 'unsigned long', possible loss of data
Note that gcc issues a similar complaint about line 244 when
compiling with -Wextra.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Commit 5dddf7c (Add block-sha1 in favour of the mozilla routines
2010-04-14) introduced the "bswap.h" header file which contains
an inline function (default_swab32()). The msvc compiler does
not support the inline keyword which causes the build to fail
with a syntax error.
However, msvc does support inline functions using the __inline
keyword language extension. We already have the GIT_INLINE()
macro that allows us to hide this syntatic difference. In order
to fix the build, we simply use GIT_INLINE() in the definition
of the default_swab32() function.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
* ramsay/dev:
Add a pack index 'virtual function' to fetch an index entry
Add a pack index 'virtual function' to search by file offset
Change the interface of the pack index search function
Add an 64-bit offset table index bounds check for v2 pack index
Add a minimum size check when opening an v2 pack index file
win32: Add separate MinGW and MSVC compatability header files
Makefile: Add support for custom build options in config.mak file
Fix some coding style issues
Since block-sha1 from git.git has such excellent performance, we
can also get rid of the openssl dependency. It's rather simple
to add it back later as an optional extra, but we really needn't
bother to pull in the entire ssl library and have to deal with
linking issues now that we have the portable and, performance-wise,
truly excellent block-sha1 code to fall back on.
Since this requires a slight revamp of the build rules anyway, we
take the opportunity to fix including EXTRA_OBJS in the final build
as well.
The block-sha1 code was originally implemented for git.git by
Linus Torvalds <torvalds@linux-foundation.org> and was later
polished by Nicolas Pitre <nico@cam.org>.
Signed-off-by: Andreas Ericsson <ae@op5.se>
We don't use it yet, but now we have it there at least.
All the non-trivial parts of it appears to have been written
and contributed to git.git by some anonymous genius. The original
implementation was done by Paul Mackerras <paulus@samba.org>.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Given an index entry number, the idx_get() function returns an
(version agnostic) index_entry structure containing all of the
information required to unpack the corresponding object from
the '.pack' file.
Since the v1 and v2 file formats differ in the layout of the
object records, we provide two implementations of the get
function and initialise the function pointer appropriately.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
In addition to searching the index by oid, we need to search by
'.pack' file offset, particularly when processing OBJ_OFS_DELTA
objects. Since the v1 and v2 file formats differ in the layout
of the object records, we provide two implementations of the
search function and initialise the (virtual) function pointer
appropriately.
Note that, as part of the creation of the 'offset index', we also
add a check that the offset data in the index is within the bounds
of the '.pack' file. Having sorted the file offsets, while creating
the index, we only need to check the smallest and largest values.
The offset index consists of the im_off_idx array, which contains
the index entry numbers sorted into file offset order, and the
im_off_next mapping array. The im_off_next array maps an index
entry number to the 'next' index entry in file offset order.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
In particular, on a successful search, we now return the index
entry number of the object rather than the '.pack' file offset.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
This reduces the global namespace pollution and allows for
a win32 compiler (eg. Open Watcom) to provide these routines
in a header other than <dirent.h> (eg in <io.h>).
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Some win32 compilers define the SSIZE_T type, with the same
meaning and intent as ssize_t. If available, make ssize_t a
synonym of SSIZE_T.
At present, the Digital-Mars compiler is known not to define
SSIZE_T, so we provide an SSIZE_T macro to use in the typedef.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
In addition to removing the inline #define, commit 209849a also
removed a #pragma to disable msvc deprecated function warnings.
Without this #pragma, msvc currently issues 19 warnings related
to "deprecated insecure c-library functions", such as strcpy()
and 22 warnings related to "deprecated POSIX function names",
such as open().
In order to supress these warnings, re-instate the #pragma.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
No need to define inline as __inline because libgit2 code
should be using GIT_INLINE instead.
Signed-off-by: Julio Espinoza-Sokal <julioes@gmail.com>
Signed-off-by: Andreas Ericsson <ae@op5.se>
For information on FlushFileBuffers(), see the msdn document
at msdn.microsoft.com/en-us/library/aa364439(VS.85).aspx
Note that Windows 2000 is shown as the minimum windows version
to support FlushFileBuffers(), so if we wish to support Win9X
and NT4, we will need to add code to dynamically check if
kernel32.dll contains the function.
The only error return mentioned in the msdn document is
ERROR_INVALID_HANDLE, which is returned if the file/device
(eg console) is not buffered. The fsync(2) manpage says that
EINVAL is returned in errno, if "fd is bound to a special
file which does not support synchronization".
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
When setting the default value, the macro name was specified
as GIT_FLEX_ARRAY, which is inconsistent with it's earlier
usage in the file. This caused a compilation error, using the
MS Visual C/C++ compiler, when compiling the git_packlist
struct definition in src/odb.c.
In addition to changing the spelling of the FLEX_ARRAY macro
to GIT_FLEX_ARRAY, including it's use in src/odb.c, we also
rename the TYPEOF macro to GIT_TYPEOF.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
This supresses some "conversion from 'size_t' to 'unsigned int',
possible loss of data" warning messages from the MS Visual C/C++
compiler with -Wp64.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In particular, in standard C, a struct or union must have at
least one member declared (ie. structs and unions cannot be
empty). Some compilers allow empty structs as an extension
and won't even issue a warning unless asked for it (eg, gcc
requires -pedantic). Some compilers allow empty structs as
an extension and will only treat it as an error if asked for
strict checking (eg Digital-Mars with -A). Some compilers
simply treat it as an error (eg MS Visual C/C++).
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
In 82324ac, the new static function exists_loose() called
object_file_name() and, in order to detect an error return,
tested for a negative value. This usage is incorrect, as
the error return is indicated by a positive return value.
(A successful call is indicated by a zero return value)
The only error return from object_file_name() relates to
insufficient buffer space and the return value gives the
required minimum buffer size (which will always be >0).
If the caller requires a dynamically allocated buffer,
this allows something like the following call sequence:
size_t len = object_file_name(NULL, 0, db->object_dir, id);
char *buf = git__malloc(len);
if (!buf)
error(...);
object_file_name(buf, len, db->object_dir,id);
...
No current callers take advantage of this capability.
Fix up the call site and change the return type of the
function, from int to size_t, which more accurately
reflects the implementation.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Andreas Ericsson <ae@op5.se>
This test assumed that it was invoked in an empty directory,
which is true when run from the Makefile, and so would fail
if run standalone. In order to allow the test to work when
run from any directory, create a sub directory "dir-walk"
and chdir() into this directory while running the tests.
Also, add some additional tests.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In particular, the git__mmap() and git__munmap() routines provide
the interface to platform specific memory-mapped file facilities.
We provide implementations for unix and win32, which can be found
in their own sub-directories.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
On windows, unless we use the O_BINARY flag in the open()
call, the file I/O routines will perform line ending
conversion (\r\n => \n on input, \n => \r\n on output).
In addition to the performance penalty, most files in the
object database are binary and will, therefore, become
corrupted by this conversion.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In particular, conditional expressions which contain an
assignment statement, where the expression type is not
explicitly made to be boolean, elicits the following
message:
warning 2: possible unintended assignment
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In particular, using pointer arithmetic on void pointers,
despite being quite useful, is not legal in standard C.
Avoiding non-standard C constructs will help in porting
the library to other compilers/platforms.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>