git_commit_create is supposed to update the given reference
"update_ref", but segfaulted in case of a yet to be born
reference. Fix it.
Signed-off-by: schu <schu-github@schulog.org>
This fixes an issue which was detected while using one of the libgit2 bindings [0]. The lack of the trailing forward slash led the name of references returned by git_reference_listall() to be prefixed with a forward slash.
[0]: https://github.com/libgit2/libgit2sharp/pull/108
Add unit tests to confirm ignore directory pattern matches and
to confirm that ignore and attribute files are loaded properly
into the attribute file cache.
Now that is_dir is calculated correctly for attr/ignore paths,
it is possible to use it so that ignoring "dir/" will properly
match the directory name and ignore the entire directory.
When building an attr path object, the code that checks if the
file is a directory was evaluating the file as a relative path
to the current working directory, instead of using the repo root.
This lead to inconsistent behavior.
git_refspec_transform_r assumed that the reference name passed would
be only a branch or tag name. This is not the case, and we need to
take into consideration what's in the refspec's source to know how
much of the prefix to ignore.
This had been left over from a time when I believed what the git
documentation had to say about case-sensitivity. The rest of the code
doesn't recognize this form and we hadn't noticed because most tests
don't try to get a recently-set variable but free and reload the
configuration, causing the right format to be used.
This takes all of the functions that look up simple data about
paths (such as `git_futils_isdir`) and moves them over to path.h
(becoming `git_path_isdir`). This leaves fileops.h just with
functions that actually manipulate the filesystem or look at
the file contents in some way.
As part of this, the dir.h header which is really just for win32
support was moved into win32 (with some minor changes).
Going back over this, the git__removechar function was not
needed (only invoked once) and is actually mislabeled. As
implemented, it really only made sense for removing backslash
characters, since two of the "removed" characters in a row
would include the second one -- i.e. it really implements
stripping backslash-escaped strings where a backslash allows
internal whitespace in a word.
Per issue #533, the handling of relative paths in attribute
and ignore files was not right. Fixed this by pre-joining
the relative path of the attribute/ignore file onto the match
string when a full path match is required.
Unfortunately, fixing this required a bit more code than I
would have liked because I had to juggle things around so that
the fnmatch parser would have sufficient information to prepend
the relative path when it was needed.
After reviewing the gitignore support with Vicent, we came up
with a list of minor cleanups to prepare for merge, including:
* checking git_repository_config error returns
* renaming git_ignore_is_ignored and moving to status.h
* fixing next_line skipping to include \r skips
* commenting on where ignores are and are not included
off_t is always 32 bits in Windows, which is beyond stupid, but we just
don't care anymore because we're using `git_off_t` which is assured to
be 64 bits on all platforms, regardless of compilation mode. Just
ensure that no casts to `off_t` are performed.
Also, the check for `off_t` overflows has been dropped, once again,
because the size of our offsets is always 64 bits on all platforms.
Fixes#534
In the main loop we peek to see what kind of line the next one is. If
there are multiple newlines before the end of the file, the eof marker
won't be set after we read the last line with data and we'll try to
peek again. This peek will return LF (as it pretends that we have a
newline at EOF so other function don't need any special handling).
Fix cfg_getchar so it doesn't try to read past the last character in
the file and config_parse so it considers LF as EOF on peek (as we're
ignoring spaces) and sets the reader's EOF flag to exit the parsing
loop.
This contains fixes for several issues discovered by MSVC and
by valgrind, including some bad data access, some memory
leakage (in where certain files were not being successfully
added to the cache), and some code simplification.
This gets rid of the crazy macro version of git_path_walk_up
and makes it into a normal function that takes a callback
parameter. This turned out not to be too messy.
This fixes issue 532 that attributes (and gitignores) could not
be checked for files that don't exist. It should be possible to
query such things regardless of the existence of the file.
Adds support for .gitignore files to git_status_foreach() and
git_status_file(). This includes refactoring the gitattributes
code to share logic where possible. The GIT_STATUS_IGNORED flag
will now be passed in for files that are ignored (provided they
are not already in the index or the head of repo).
It turns out that passing NULL for the second parameter of realpath(3)
is not as portable as one might like. Notably, Mac OS 10.5 and earlier
does not support it. So this moves us back to a large buffer to get
the realpath info.
This updates to implementation of gitattribute macros to be much more
similar to core git (albeit not 100%) and to handle expansion of
macros within macros, etc. It also cleans up the refcounting usage
with macros to be much cleaner.
Also, this adds a new vector function `git_vector_insert_sorted()`
which allows you to maintain a sorted list as you go. In order to
write that function, this changes the function `git__bsearch()` to
take a somewhat different set of parameters, although the core
functionality is still the same.
Currently, diff_index passes the full relative path from the
repository root to the callback. In case of an addition, it passes
the tree entry instead of the index entry.
This change fixes the path used for addition, and it passes only
the basename of the path. This mimics the current behavior of
git_tree_diff.
Add support for git attribute macro definitions. Also, add
support for cache flush API to clear the attribute file content
cache when needed.
Additionally, improved the handling of global and system files,
making common utility functions in fileops and converting config
and attr to both use the common functions.
Adds a bunch more tests and fixed some memory leaks. Note that
adding macros required me to use refcounted attribute assignment
definitions, which complicated, but probably improved memory usage.
This makes libgit2 compliant with the following scenario
$ git ls-remote file:///d:/temp/dwm%20tinou
732d790b702db4b8985f5104fc44642654f6a6b6 HEAD
732d790b702db4b8985f5104fc44642654f6a6b6 refs/heads/master
732d790b702db4b8985f5104fc44642654f6a6b6 refs/remotes/origin/HEAD
732d790b702db4b8985f5104fc44642654f6a6b6 refs/remotes/origin/master
$ mv "/d/temp/dwm tinou" /d/temp/dwm+tinou
$ git ls-remote file:///d:/temp/dwm%20tinou
fatal: 'd:/temp/dwm tinou' does not appear to be a git repository
fatal: The remote end hung up unexpectedly
$ git ls-remote file:///d:/temp/dwm+tinou
732d790b702db4b8985f5104fc44642654f6a6b6 HEAD
732d790b702db4b8985f5104fc44642654f6a6b6 refs/heads/master
732d790b702db4b8985f5104fc44642654f6a6b6 refs/remotes/origin/HEAD
732d790b702db4b8985f5104fc44642654f6a6b6 refs/remotes/origin/master
This adds APIs for querying git attributes. In addition to
the new API in include/git2/attr.h, most of the action is in
src/attr_file.[hc] which contains utilities for dealing with
a single attributes file, and src/attr.[hc] which contains
the implementation of the APIs that merge all applicable
attributes files.
Return an error if we can't write an updated version of the config file
after config_delete.
Along with that, fix an uninitialized warning.
Signed-off-by: schu <schu-github@schulog.org>
Instead of just setting the value to NULL, which gives unwanted
results when asking for that variable after deleting it, delete the
variable from the list and re-write the file.
It was not safe for git_buf_joinpath to be used with a pointer
into the buf itself because a reallocation could invalidate
the input parameter that pointed into the buffer. This patch
makes it safe to self join, at least for the leading input to
the join, which is the common "append" case for self joins.
Also added unit tests to explicitly cover this case.
This should actually fix#511
This converts virtually all of the places that allocate GIT_PATH_MAX
buffers on the stack for manipulating paths to use git_buf objects
instead. The patch is pretty careful not to touch the public API
for libgit2, so there are a few places that still use GIT_PATH_MAX.
This extends and changes some details of the git_buf implementation
to add a couple of extra functions and to make error handling easier.
This includes serious alterations to all the path.c functions, and
several of the fileops.c ones, too. Also, there are a number of new
functions that parallel existing ones except that use a git_buf
instead of a stack-based buffer (such as git_config_find_global_r
that exists alongsize git_config_find_global).
This also modifies the win32 version of p_realpath to allocate whatever
buffer size is needed to accommodate the realpath instead of hardcoding
a GIT_PATH_MAX limit, but that change needs to be tested still.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Signed-off-by: Vicent Marti <tanoku@gmail.com>
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Author: Carlos Martín Nieto <carlos@cmartin.tk>
#
# On branch development
# Your branch is ahead of 'origin/development' by 11 commits.
#
# Changes to be committed:
# (use "git reset HEAD^1 <file>..." to unstage)
#
# modified: include/git2/tree.h
# modified: src/tree.c
# modified: tests-clay/clay_main.c
# modified: tests-clay/object/tree/diff.c
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# 0001-remote-Cleanup-the-remotes-code.patch
# 466.patch
# 466.patch.1
# 488.patch
# Makefile
# libgit2.0.15.0.dylib
# libgit2.0.dylib
# libgit2.dylib
# libgit2_clay
# libgit2_test
# tests-clay/object/tree/
For each difference in the trees, the callback gets called with the
relevant information so the user can fill in their own data
structures.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This streamlines git_buf_join and removes the join-append behavior,
opting instead for a very compact join-replace of the git_buf contents.
The unit tests had to be updated to remove the join-append tests and
have a bunch more exhaustive tests added.
Taking a page from core git's strbuf, this introduces git_buf_initbuf
which is an empty string that is used to initialize the git_buf ptr
value even for new buffers. Now the git_buf ptr will always point to
a valid NUL-terminated string.
This change required jumping through a few hoops for git_buf_grow
and git_buf_free to distinguish between a actual allocated buffer
and the global initial value. Also, this moves the allocation
related functions to be next to each other near the top of buffer.c.
At a tiny cost of 1 extra byte per allocation, this makes
git_buf_cstr into basically a noop, which simplifies error
checking when trying to convert things to use dynamic allocation.
This patch also adds a new function (git_buf_copy_cstr) for copying
the cstr data directly into an external buffer.
This commit addresses two of the comments:
* renamed existing n-input git_buf_join to git_buf_join_n
* added new git_buf_join that always takes two inputs
* moved some parameter error checking to asserts
* extended unit tests to cover new version of git_buf_join
Add new functions to git_buf for:
* initializing a buffer from a string
* joining one or more strings onto a buffer with separators
* swapping two buffers in place
* extracting data from a git_buf (leaving it empty)
Also, make git_buf_free leave a git_buf back in its initted state,
and slightly tweak buffer allocation sizes and thresholds.
Finally, port unit tests to clay and extend with lots of new tests
for the various git_buf functions.
The ownership semantics have been changed all over the library to be
consistent. There are no more "borrowed" or duplicated references.
Main changes:
- `git_repository_open2` and `3` have been dropped.
- Added setters and getters to hotswap all the repository owned
objects:
`git_repository_index`
`git_repository_set_index`
`git_repository_odb`
`git_repository_set_odb`
`git_repository_config`
`git_repository_set_config`
`git_repository_workdir`
`git_repository_set_workdir`
Now working directories/index files/ODBs and so on can be
hot-swapped after creating a repository and between operations.
- All these objects now have proper ownership semantics with
refcounting: they all require freeing after they are no longer
needed (the repository always keeps its internal reference).
- Repository open and initialization has been updated to keep in
mind the configuration files. Bare repositories are now always
detected, and a default config file is created on init.
- All the tests affected by these changes have been dropped from the
old test suite and ported to the new one.
Update all stack allocations of git_filebuf to use GIT_FILEBUF_INIT
and make git_filebuf_open and git_filebuf_cleanup safe to be called
multiple times on the same buffer.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
reference_rename used to delete an old reflog file when renaming a
reference to not confuse git.git. Don't do this anymore but let the user
take care of writing a reflog entry.
Signed-off-by: schu <schu-github@schulog.org>
There is no good reason to expose the negotiation as a different step
to downloading the packfile.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
It's redundant to do this (git doesn't) and Windows doesn't allow us
to overwrite a read-only file (which objects are).
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
See `global.c` for a description of what we're doing.
When libgit2 is built with GIT_THREADS support, the threading system
must be explicitly initialized with `git_threads_init()`.
This new version of the references code is significantly faster and
hopefully easier to read.
External API stays the same. A new method `git_reference_reload()` has
been added to force updating a memory reference from disk. In-memory
references are no longer updated automagically -- this was killing us.
If a reference is deleted externally and the user doesn't reload the
memory object, nothing critical happens: any functions using that
reference should fail gracefully (e.g. deletion, renaming, and so on).
All generated references from the API are read only and must be free'd
by the user. There is no reference counting and no traces of generated
references are kept in the library.
There is no longer an internal representation for references. There is
only one reference struct `git_reference`, and symbolic/oid targets are
stored inside an union.
Packfile references are stored using an optimized struct with flex array
for reference names. This should significantly reduce the memory cost of
loading the packfile from disk.
git_reference_rename() didn't properly cleanup old references given by
the user to not break some ugly old tests. Since references don't point
to libgit's internal cache anymore we can cleanup git_reference_rename()
to be somewhat less messy.
Signed-off-by: schu <schu-github@schulog.org>
Currently libgit2 shares pointers to its internal reference cache with
the user. This leads to several problems like invalidation of reference
pointers when reordering the cache or manipulation of the cache from
user side.
Give each user its own git_reference instead of leaking the internal
representation (struct reference).
Add the following new API functions:
* git_reference_free
* git_reference_is_packed
Signed-off-by: schu <schu-github@schulog.org>
This ensures that entries from the working directory are retrieved according to the following rules:
- The file "subdir" should appear before the file "subdir.txt"
- The folder "subdir" should appear after the file "subdir.txt"
The only caller has been changed to treat a NULL tree as a special
case and use the existing git_tree_entry_byindex.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This function is already implemented (better) as git_index_get. Change
the only caller to use that function.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Our previous assumption that all paths in Windows are encoded in UTF-8
is rather weak, specially when considering that Git is
encoding-agnostic.
These set of functions allow the user to change the library's active
codepage globally, so it is possible to access paths and files on all
international versions of Windows.
Note that the default encoding here is UTF-8 because we assume that 99%
of all Git repositories will be in UTF-8.
Also, if you use non-ascii characters in paths, anywhere, please burn on
a fire.
libgit2 currently identifies loose objects as corrupt if they've been
deflated using a window size less than 32Kb, because the
is_zlib_compressed_data() function doesn't recognise the header
byte as a zlib header. This patch makes the method tolerant of
all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice
it's accuracy in distingushing the standard loose-object format
from the experimental (now abandoned) format. It's based on a patch
which has been merged into C-Git master branch:
https://github.com/git/git/commit/7f684a2aff636f44a506
On memory constrained systems zlib may use a much smaller window
size - working on Agit, I found that Android uses a 4KB window;
giving a header byte of 0x48, not 0x78. Consequently all loose
objects generated by the Android platform appear 'corrupt' :(
It might appear that this patch changes isStandardFormat() to the
point where it could incorrectly identify the experimental format as
the standard one, but the two criteria (bitmask & checksum) can only
give a false result for an experimental object where both of the
following are true:
1) object size is exactly 8 bytes when uncompressed (bitmask)
2) [single-byte in-pack git type&size header] * 256
+ [1st byte of the following zlib header] % 31 = 0 (checksum)
As it happens, for all possible combinations of valid object type
(1-4) and window bits (0-7), the only time when the checksum will be
divisible by 31 is for 0x1838 - ie object type *1*, a Commit - which,
due the fields all Commit objects must contain, could never be as
small as 8 bytes in size.
Given this, the combination of the two criteria (bitmask & checksum)
always correctly determines the buffer format, and is more tolerant
than the previous version.
References:
Android uses a 4KB window for deflation:
http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb258
Code snippet searching for false positives with the zlib checksum:
https://gist.github.com/1118177
Change-Id: Ifd84cd2bd6b46f087c9984fb4cbd8309f483dec0
Remove a wrong call to git_mwindow_close which caused a segfault if it
ever did run. In that same piece of code, if the LRU was from the
first wiindow in the list in a different file, we didn't update that
list, so the first element had been freed.
Fix these two issues.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
The following files now have 0444 permissions:
- loose objects
- pack indexes
- pack files
- packs downloaded by fetch
- packs downloaded by the HTTP transport
And the following files now have 0666 permissions:
- config files
- repository indexes
- reflogs
- refs
This brings libgit2 more in line with Git.
Note that git_filebuf_commit() and git_filebuf_commit_at() have both
gained a new mode parameter.
The latter change fixes an important issue where filebufs created with
GIT_FILEBUF_TEMPORARY received 0600 permissions (due to mkstemp(3)
usage). Now we chmod() the file before renaming it into place.
Tests have been added to confirm that new commit, tag, and tree
objects are created with the right permissions. I don't have access to
Windows, so for now I've guarded the tests with "#ifndef GIT_WIN32".
To further match how Git behaves, this change makes most of the
directories libgit2 creates in a git repo have a file mode of
0777. Specifically:
- Intermediate directories created with git_futils_mkpath2file() have
0777 permissions. This affects odb_loose, reflog, and refs.
- The top level folder for bare repos is created with 0777
permissions.
- The top level folder for non-bare repos is created with 0755
permissions.
- /objects/info/, /objects/pack/, /refs/heads/, and /refs/tags/ are
created with 0777 permissions.
Additionally, the following changes have been made:
- fileops functions that create intermediate directories have grown a
new dirmode parameter. The only exception to this is filebuf's
lock_file(), which unconditionally creates intermediate directories
with 0777 permissions when GIT_FILEBUF_FORCE is set.
- The test runner now sets the umask to 0 before running any
tests. This ensurses all file mode checks are consistent across
systems.
- t09-tree.c now does a directory permissions check. I've avoided
adding this check to other tests that might reuse existing
directories from the prefabricated test repos. Because they're
checked into the repo, they have 0755 permissions.
- Other assorted directories created by tests have 0777 permissions.
This makes libgit2 more closely match Git, which only checks for
ambiguous pack entries when given short hashes.
Note that the only time this is ever relevant is when a pack has the
same object more than once (it's happened in the wild, I promise).
When trying to find the end of an email, instead of starting at the
beginning of the signature, we start at the end of the name (after the
first '<').
This brings libgit2 more in line with Git's behavior when reading out
existing signatures.
However, note that Git does not allow names like these through the
usual porcelain; instead, it silently strips any '>' characters it
sees.
The v0.99 tag in the Git repo triggers this behavior:
http://git.kernel.org/?p=git/git.git;a=tag;h=d6602ec5194c87b0fc87103ca4d67251c76f233a
Ideally, we'd allow the tag to be instantiated even though the tagger
field is missing, but this at the very least prevents libgit2 from
crashing.
To test this bug, a new repository has been added based on the test
branch in testrepo.git. It contains a "e90810b" tag that looks like
this:
object e90810b8df3e80c413d903f631643c716887138d
type commit
tag e90810b
This is a very simple tag.
This ensures commit->message is always non-NULL, even if the commit
message is empty or consists of only a newline.
One such commit can be found in the wild in the jQuery repository:
25b424134f
Unfortunately, we can't use the function in fetch.c due to chunked
encoding and keep-alive connections.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Using a different buffer in each function means that some data might
get lost. Store all the data in a buffer in the transport object.
Take this opportunity to use the generic download-pack function.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Taken mostly from the git transport's version, this can be used by any
transport that takes its pack data from the network.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
It's a bit awkward to run it as an extra step, and HTTP may need to
send the wants list several times.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
As we don't know the length of the message we want to send to the
other end, we send a chunk size before each message. In later
versions, sending the wants might benefit from batching the lines
together.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Not every request needs a new connection if we're using a keep-alive
connection. Store the HTTP parser, host and port in the transport in
order to have it available in later calls.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
It's rare for a configured remote, but for one given as an URL on the
command line, it's more often than not the case.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
As we no longer use the STRLEN macro, the NUL-terminator in the string
was not copied over. Fix this.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
The documentation is a bit misleading. The subsection name is always
case-sensitive, but with a [section.subsection] header, the subsection
is transformed to lowercase when the configuration is parsed.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Winsock wants us to use closesocket() instead of close(), so introduce
the gitno_close function, which does the right thing.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
memset the structure on initialisation and don't try to dereference
the vector with the heads if we didn't find a repository.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Windows wants us to initialise the networking DLL before we're allowed
to send data through a socket. Call WSASetup and WSACleanup if
GIT_WIN32 is defined.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This is clearer and sidesteps the issue of what the return value of
snprintf is on the particular OS we're running on.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
If ExpandEnvironmentStringsW is successful, it returns the amount of
characters written, including the NUL terminator.
Thanks to Emeric for reading the MSDN documentation correctly.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
6c8b458 removed an "unused" variable needed for git_hashtable_insert2(),
causing a segfault in reference_rename(). Instead, use
git_hashtable_insert().
Signed-off-by: schu <schu-github@schulog.org>
In libgit2: Move an enum out of an int bitfield in the HTTP transport.
In the parser: Use int bitfields and change some variable sizes to
better fit thir use. Variables that count the size of the data chunk
can only ever be as large as off_t. Warning 4127 can be ignored, as
nobody takes it seriously anyway.
From Emeric: change some variable declarations to keep MSVC happy.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
git_repository_config wants to take the global and system paths again
so that one can be explicit if needed.
The git_repository_config_autoload function is provided for the cases
when it's good enough for the library to guess where those files are
located.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Taking advantage of the tree cache, git_tree_create_fromindex becomes
comparable in speed to git write-tree when the cache is available.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
When a file is updated in the index, it's path needs to be invalidated
in the tree cache as the hash is no longer correct.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Whenever a file is updated in the index, each tree leading towards it
needs to be invalidated. Provide the supporting function.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
The fix introduced in a02fc2cd1 (2011-05-24; index: correctly parse
invalidated TREE extensions) threw out the rest of the data in the
extension if it found an invalidated entry. This was the result of
incorrect reading of the documentation.
Insted, keep reading the extension, as there may be cached data we can
use.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
reference_rename() recently failed when renaming an existing reference
refs/heads/foo/bar -> refs/heads/foo because of a change in the
underlying functions / error codes. Fixes#412.
Signed-off-by: schu <schu-github@schulog.org>
GIT_EOVERFLOW means something different. Use GIT_ESHORTBUFFER. On the
way, remove a redundant sizeof(char).
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
There were quite a few places were spaces were being used instead of
tabs. Try to catch them all. This should hopefully not break anything.
Except for `git blame`. Oh well.
1. The license header is technically not valid if it doesn't have a
copyright signature.
2. The COPYING file has been updated with the different licenses used in
the project.
3. The full GPLv2 header in each file annoys me.
For unsigned types, the comparison >= 0 is always true, so avoid it by using
a post-decrement and integrating the initial assigment into the loop body.
No change in behavior is intended.
When Content-Type is the last field, we only know when we can store it
when we reach on_headers_complete.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Depending on what we want to do, we expect the Content-Type field to
have different contents. Store which service to expect instead of
hard-coding the string.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
There is no point in reinventing the wheel when using the treebuilder
is much more straightforward and makes the code more readable. There
is no optimisation, and the performance is no worse than when writing
the tree object ourselves.
write_deflate() used to ignore errors by zlib's deflate function when
not compiling in DEBUG mode. Always read $result and throw an error
instead.
Signed-off-by: schu <schu-github@schulog.org>
As we no longer expose the transport functions, this is now the only
way to connect to a remote when given an URL instead of a remote name
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Sockets on Windows are unsigned, so define a type GIT_SOCKET which is
signed or unsigned depending on the platform.
Thanks to Em for his patience with this.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
GCC produces several -Wuninitialized warnings. Most of them can be fixed
if we make visible for gcc that git__throw() and git__rethrow() always
return first argument.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Some servers take a long time to answer and expect us to keep sending
want lines; otherwise they close the connection. Avoid this by waiting
for one second for the server to answer. If the timeout runs out,
treat is as a NAK and keep sending want lines.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Only signal that we need a pack if we do need it and don't send a want
just because it's the first. If we don't need to download the pack,
then we can skip all of the negotiation and just return success.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
There are many ways how a transport might negotiate with the server,
so instead of making it fit into the smart protocol model, let the
transport do its thing. For now, the git protocol limits itself to
send only 160 "have" lines so we don't flood the server.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
The original was written before any code was written and had nothing
to do with the way things are actually done.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
There is no need to inspect what the local repository is like. Only
check whether the objects exist locally.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Make sure we only try to download the pack if we find the pack header
in the stream, and not if the server takes a bit longer to send us the
last NAK.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This function updates the references in the local reference storage to
match the ones in the remote.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
When indexing a file with ref deltas, a temporary cache for the
offsets has to be built, as we don't have an index file yet. If the
user takes the responsiblity for filling the cache, the packing code
will look there first when it finds a ref delta.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Provide the git_remote_download function to instruct the library to
downlad the packfile and let the user know the temporary location.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Move the generation of the want-list to be done from the negotiate
function, and keep the filtered references inside the remote
structure.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Configurations when taken from a repository and remotes should be
identifiable as coming from a particular repository. This allows us to
reduce the amount of variables that the user has to keep track of.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
We need to do an unsigned comparison, as otherwise UTF-8 characters
might look like they have the sign bit set and the check will fail.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
git_signature_new() and git_signature_now() currently don't return error
codes. Change the API to return error codes and not pointers to let the
user handle errors properly.
Signed-off-by: schu <schu-github@schulog.org>
The callers immediately throw away the offset, so we don't need any
logical changes in any of them. This will be useful for the indexer,
as it does need to know where the compressed data ends.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This code is useful for more things than just the packfile handling
code. Factor it out so it can be reused.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
z_stream.next_in is non-const. Although currently Zlib doesn't modify
buffer content on deflate(), it might be change in the future. gzwrite()
already modify it.
To avoid this let's change signature of git_filebuf.write and rework
git_filebuf_write() accordingly.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
index_init_entry() renamed to index_entry_init(). Now it allocates entry
on its own.
git_index_add() and git_index_append() reworked accordingly.
This commit fixes warning:
/home/kas/git/public/libgit2/src/index.c: In function ‘index_init_entry’:
/home/kas/git/public/libgit2/src/index.c:452:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/index.c: In function ‘git_index_clear’:
/home/kas/git/public/libgit2/src/index.c:228:8: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/index.c:235:8: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/index.c: In function ‘index_insert’:
/home/kas/git/public/libgit2/src/index.c:392:7: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/index.c:399:7: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/index.c: In function ‘read_unmerged’:
/home/kas/git/public/libgit2/src/index.c:681:35: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/index.c: In function ‘read_entry’:
/home/kas/git/public/libgit2/src/index.c:716:33: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/refs.c: In function ‘normalize_name’:
/home/kas/git/public/libgit2/src/refs.c:1681:12: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/tree.c: In function ‘entry_search_cmp’:
/home/kas/git/public/libgit2/src/tree.c:47:36: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/tree.c: In function ‘git_treebuilder_remove’:
/home/kas/git/public/libgit2/src/tree.c:443:31: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/revwalk.c: In function ‘object_table_hash’:
/home/kas/git/public/libgit2/src/revwalk.c:120:7: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
home/kas/git/public/libgit2/src/transport_local.c: In function ‘cmp_refs’:
/home/kas/git/public/libgit2/src/transport_local.c:19:22: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/transport_local.c:20:22: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/reflog.c: In function ‘reflog_parse’:
/home/kas/git/public/libgit2/src/reflog.c:148:17: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
/home/kas/git/public/libgit2/src/commit.c: In function ‘commit_parse_buffer’:
/home/kas/git/public/libgit2/src/commit.c:186:23: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
/home/kas/git/public/libgit2/src/commit.c:187:27: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
In reference_read we stat a file and then call futils which stats it
again. Use git_futils_readbuffer_updated to avoid the extra stat
call. This introduces another parameter which is used to tell the
caller whether the file was read or not.
Modify the callers to take advantage of this new feature. This change
removes ~140 stat calls from the test suite.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
This extends the git_fuitls_readbuffer function to only read in if the
file's modification date is later than the given one. Some code paths
want to check a file's modification date in order to decide whether
they should read it or not. If they do want to read it, another stat
call is done by futils. This function combines these two operations so
we avoid one stat call each time we read a new or updated file.
The git_futils_readbuffer functions is now a wrapper around the new
function.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Current implementation of git_reference_rename() removes 'ref' from
loose cache, but not frees it. In result 'ref' is not reachable any more
and we have got memory leak.
Let's re-add 'ref' with corrected name to loose cache instead of
'new_ref' and free 'new_ref' properly.
'rollback' path seems leak too. git_reference_rename() need to be rewritten
for proper resource management.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
It's not obvious that recurse_tree_entries or recurse_tree_entry
should free a resource that wasn't allocated by them. Do this
explicitely and plug a leak while we're at it.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
The old matcher was returning fake matches when given stupid entry
names. E.g.
`git2` could be matched by `git2 /`, `git2/foobar`, git2/////`
and other stupid stuff
Fixes#127 (that was quite an outstanding issue).
Rationale:
The tree objects on Git are stored and read following a very specific
sorting algorithm that places folders before files. That original sort
was the sort we were storing on memory, but this sort was being queried
with a binary search that used a simple `strcmp` for comparison, so
there were many instances where the search was failing.
Obviously, the most straightforward way to fix this is changing the
binary search CB to use the same comparison method as the sorting CB.
The problem with this is that the binary search callback compares a path
and an entry, so there is no way to know if the given path is a folder
or a standard file.
How do we work around this? Instead of splitting the `entry_byname`
method in two (one for searching directories and one for searching
normal files), we just assume that the path we are searching for is of
the same kind as the path it's being compared at the moment.
return git_futils_cmp_path(
ksearch->filename, ksearch->filename_len, entry->attr & 040000,
entry->filename, entry->filename_len, entry->attr & 040000);
Since there cannot be a folder and a regular file with the same name on
the same tree, the most basic equality check will always fail
for all comparsions, until our path is compared with the actual entry we
are looking for; in this case, the matching will succeed with the file
type of the entry -- whatever it was initially.
I hope that makes sense.
PS: While I was at it, I switched the cmp methods to use cached values
for the length of each filename. That makes searches and sorts
retardedly fast -- I was wondering the reason of the performance hiccups
on massive trees; it's because of 2*strlen for each comparsion call.
mode field of git_index_entry_unmerged is array of unsigned ints. It's
unsafe to cast pointer to an element of the array to long int *. It may
cause overflow in git_strtol32().
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Type casting usually points to some trick or bug. It's better not hide
it between useless type castings.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
The `hashfile` function has been moved to ODB, next to `git_odb_hash`.
Global state has been removed from the dirent call in `status.c`,
because global state is killing the rainforest and causing global
warming.
Add git_status_hashfile() to get blob's object id for a file without adding
it to the object database or needing a repository at all.
This functionality is similar to `git hash-object` without '-w'.
The direct-writes commit left some (slow) internals methods that
were no longer needed. These have been removed.
Also, the Reflog code was using the old `git_signature__write`, so
it has been rewritten to use a normal buffer and the new `writebuf`
signature writer. It's now slightly simpler and faster.
DIRECT WRITES ARE BACK AND FASTER THAN EVER. The streaming writer to the
ODB was an overkill for the smaller objects like Commit and Tags; most
of the streaming logic was taking too long.
This commit makes Commits, Tags and Trees to be built-up in memory, and
then written to disk in 2 pushes (header + data), instead of streaming
everything.
This is *always* faster, even for big files (since the git_filebuf class
still does streaming writes when the memory cache overflows). This is
also a gazillion lines of code smaller, because we don't have to
precompute the final size of the object before starting the stream (this
was kind of defeating the point of streaming, anyway).
Blobs are still written with full streaming instead of loading them in
memory, since this is still the fastest way.
A new `git_buf` class has been added. It's missing some features, but
it'll get there.
Our good, lovely folks at Microsoft decided that there was no good
reason to make `vsnprintf` compilant with the C standard, so that
function in Windows returns -1 on overflow, instead of returning the
actual byte count needed to write the full string.
We now handle this situation more gracefully with the POSIX
compatibility layer, by returning the needed byte size using an
auxiliary method instead of blindly resizing the target buffer until it
fits.
This means we can now support `printf`s of any size by allocating a
temporary buffer. That's good.
So far libgit2 didn't support reference logs (reflog). Add a new
git_reflog_* API for basic reading and writing of reflogs:
* git_reflog_read
* git_reflog_write
* git_reflog_free
Signed-off-by: schu <schu-github@schulog.org>
Drop the GLibc implementation of Merge Sort and replace it with Timsort.
The algorithm has been tuned to work on arrays of pointers (void **),
so there's no longer a need to abstract the byte-width of each element
in the array.
All the comparison callbacks now take pointers-to-elements, not
pointers-to-pointers, so there's now one less level of dereferencing.
E.g.
int index_cmp(const void *a, const void *b)
{
- const git_index_entry *entry_a = *(const git_index_entry **)(a);
+ const git_index_entry *entry_a = (const git_index_entry *)(a);
The result is up to a 40% speed-up when sorting vectors. Memory usage
remains lineal.
A new `bsearch` implementation has been added, whose callback also
supplies pointer-to-elements, to uniform the Vector API again.
`git_futils_rmdir_r`: rename, clean up.
`git_reference_rename`: cleanup. Do not use 3x4096 buffers on the stack
or things will get ugly very fast. We can reuse the same buffer.
MSYS/MinGW uses winsock but obviously doesn't set _MSC_VER. Use _WIN32
to decide whether to use winsock or BSD headers. Also remove these
headers from src/transport_git.c altogether, as they are not needed.
MSYS is very conservative, so we have to tell it that we don't care
about versions of Windows lower than WindowsXP. We also need to tell
CMake to add ws2_32 to the libraries list and we shouldn't add the
-fPIC option, to MSYS because it complains that it does it anyway.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
So far libgit2 didn't handle the following scenarios:
* Rename of reference m -> m/m
* Rename of reference n/n -> n
Fixed.
Since we don't write reflogs, we have to delete any old reflog for the
renamed reference. Otherwise git.git will possibly fail when it finds
invalid logs.
Reported-by: nulltoken <emeric.fermas@gmail.com>
Signed-off-by: schu <schu-github@schulog.org>
git_futils_rmdir_recurs() shall remove the given directory and all
subdirectories. This happens only if the directories are empty.
Signed-off-by: schu <schu-github@schulog.org>
It removes all entries with equal path except last added.
On large indexes git_index_append() + git_index_uniq() before writing is
*much* faster, than git_index_add().
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
The routine remove duplictes from the vector. Only the last added element
of elements with equal keys remains in the vector.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Index operation use git_vector_sort() to sort index entries. Since index
support adding duplicates (two or more entries with the same path), it's
important to preserve order of elements. Preserving order of elements
allows to make decisions based on order. For example it's possible to
implement function witch removes all duplicates except last added.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
In some cases it's important to preserve order of elements with equal
keys (stable sort). qsort(3) doesn't define order of elements with
equal keys.
git__msort() implements merge sort which is stable sort.
Implementation taken from git. Function renamed git_qsort() -> git__msort().
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
git_index_find() in index_insert() is useless if replace is not
requested (append). Do not call it in this case.
It speedup git_index_append() *dramatically* on large indexes.
$ cat index_test.c
int main(int argc, char **argv)
{
git_index *index;
git_repository *repo;
git_odb *odb;
struct git_index_entry entry;
git_oid tree_oid;
char tree_hex[41];
int i;
git_repository_init(&repo, "/tmp/myrepo", 0);
odb = git_repository_database(repo);
git_repository_index(&index, repo);
memset(&entry, 0, sizeof(entry));
git_odb_write(&entry.oid, odb, "", 0, GIT_OBJ_BLOB);
entry.path = "test.file";
for (i = 0; i < 50000; i++)
git_index_append2(index, &entry);
git_tree_create_fromindex(&tree_oid, index);
git_oid_fmt(tree_hex, &tree_oid);
tree_hex[40] = '\0';
printf("tree: %s\n", tree_hex);
git_index_free(index);
git_repository_free(repo);
return 0;
}
Before:
$ time ./index_test
tree: 43f73659c43b651588cc81459d9e25b08721b95d
./index_test 151.19s user 0.05s system 99% cpu 2:31.78 total
After:
$ time ./index_test
tree: 43f73659c43b651588cc81459d9e25b08721b95d
./index_test 0.05s user 0.00s system 94% cpu 0.059 total
About 2573 times speedup on this test :)
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Check if the window structure has actually been allocated before
trying to access it, and don't leak said structure if the map fails.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Without this, hashtable.h doesn't know what uint32_t is and the
compiler thinks that it's a function type.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If the section header is the last line in the file,
parse_section_header would incorrectly decide that the input had been
truncated.
Fix this by checking whether the actual input line is correctly
formatted.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
git_signature__parse used to be very strict about what's a well-formed
signature. Since git_signature__parse is used only when reading already
existing signatures, we should not care about if it's a valid signature
too much but rather show what we got.
Reported-by: nulltoken <emeric.fermas@gmail.com>
Signed-off-by: schu <schu-github@schulog.org>
The `stat` methods were having issues when called with a trailing slash
in Windows platforms.
We now use GetFileAttributes() where possible, which doesn't have this
restriction.
The old `git_fileops_prettify_path` has been replaced with
`git_path_prettify`. This is a much simpler method that uses the OS's
`realpath` call to obtain the full path for directories and resolve
symlinks.
The `realpath` syscall is the original POSIX call in Unix system and
an emulated version under Windows using the Windows API.
Cleaned up the structure of the whole OS-abstraction layer.
fileops.c now contains a set of utility methods for file management used
by the library. These are abstractions on top of the original POSIX
calls.
There's a new file called `posix.c` that contains
emulations/reimplementations of all the POSIX calls the library uses.
These are prefixed with `p_`. There's a specific posix file for each
platform (win32 and unix).
All the path-related methods have been moved from `utils.c` to `path.c`
and have their own prefix.
The assertion in line 360 was there to check that only loose refs were
being written as loose, but there are times when we need to re-write a
packed reference as loose.
Core Git doesn't care when people use spaces in the email address, even
though this kind of foolery receives the capital punishment in several
states of the USA.
We must obey.
A bunch of redundant methods have been removed from the external API.
- All the reference/tag creation methods with `_f` are gone. The force
flag is now passed as an argument to the normal create methods.
- All the different commit creation methods are gone; commit creation
now always requires a `git_commit` pointer for parents and a `git_tree`
pointer for tree, to ensure that corrupted commits cannot be generated.
- All the different tag creation methods are gone; tag creation now
always requires a `git_object` pointer to ensure that tags are not
created to inexisting objects.
Remove the unused repo and private pointers and make the direction a
flag, as it can only have two states. Change the connect signature to
use an int instead of git_net_direction and remove that enum.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Remove call of gitfo_size, since we call gitfo_lstat anyway; remove some
old workaround code for gitfo_read, which is obsolete now.
Signed-off-by: schu <schu-github@schulog.org>
Rather than an 'private' pointer, make the private structures inherit
from the generic git_transport struct. This way, we only have to worry
about one memory allocation instead of two. The structures are so
simple that this may even make the code use less memory overall.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
By pre-sorting the references, they are already in the right order if
we want to peel them.
With this, we get output-parity with git.git's ls-remote.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This makes it easier to send a requqest for an URL. It assumes there
is a socket where the string should go out to.
Make git_pkt_gen_proto accept a command parameter, which defaults to
git-upload-pack
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Add a parameter to git_pkt_parse_line to tell it how much data you
have in your buffer. If the buffer is too short, it returns an error
saying so. Adapt the git transport to use this and fix the offset
calculation.
Add the GIT_ESHORTBUFFER error code.
A pkt-line's length are described in its first four bytes in ASCII
hex. Copy this substring to another string before feeding it to
git__strtol32. Otherwise, it will read the whole hash.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
This are the types I intend to use for pkt-line parsing and (later)
creation. git_pkt serves as a base pointer type and once you know what
type it is you can use the real one (command, tip list, etc.)
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
If the strings match, git__fnmatch returns GIT_SUCCESS and
GIT_ENOMATCH on failure to match.
MSVC fixes: Added a test for _MSC_VER and (in that case) defined
HAVE_STRING_H to 1 so it doesn't try to include <strings.h> which
doesn't exist in the MSVC world. Moved the function declarations to
use the modern inline ones so MSVC doesn't have a fit. Added casts
everywhere so MSVC doesn't crap its pants.
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Move them to their own functions to avoid duplication and to make it
easier to ignore missing configuration.
Not finding 'fetch' is considered fatal, though this might not be
correct behaviour (push-only remotes?)
Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Call gitfo_lstat with the full pathname instead of the relative one,
which fails in case the current working directory is different from
the workdir.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Also allow space for the null-terminator when allocating the buffer in
packfile_unpack_compressed. Up to now, the last newline had served as
a terminator, but 858ef372 searches for a double-newline and exposes
the problem.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
If a config has several files, we need to check all of them before we
can say that a variable doesn't exist.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
This function puts the global and repository configurations in one
git_config object and gives it to the user.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
There is no need to store the format outselves, as the library
provides git_filebuf_printf which takes care of the formatting itself.
Also get rid of an use of strcat + strcpy which is always a nice
thing.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
The better solution would probably be to turn the gitfo_lstat /
gitfo_readlink macros into real functions that wrap either lstat or
gitfo_lstat__w32 (and readlink or gitfo_readlink__w32). However, that
would introduce an indirection unless inlined. For now, this is the less
intrusive change.
Handle Symlinks if they can be handled in Win32. This is not even
compiled. Needs review.
The lstat implementation is modified from core Git.
The readlink implementation is modified from PHP.
In git, the short message of a commit is the part of the commit message before 2 consecutive line breaks. In the short message, line breaks are replaced by space characters.
After each variable gets set, we store it in our list (not completely
in the right position, but the close enough). Then we write out the
new config file in the same way that git.git does it (keep the rest of
the file intact and insert or replace the variable in its line).
Overwriting variables and adding new ones is supported (even on new
sections), though deleting isn't yet.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
The section name should be stored in its case-sensitive variant when
we are adding a new variable. Use the internalize_section function to
do just that.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
The (rather late) early-exit code, which provides a negligible
optimisation causes cvar_match_section to return false negatives when
it's called with a section name instead of a full variable name.
Remove this optimisation.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Add a check for the file descriptor in git_filebuf_cleanup. Without
it, an existing lockfile would be deleted if we tried to acquire it
(but failed, as the lockfile already existed).
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
When calling gitfo_exists() on a symbolic link, sometimes we need to
simply check whether the link exists and sometimes we need to check
whether the file pointed to by the symlink exists.
Introduce a new function gitfo_shallow_exists that only checks if the
link exists and revert gitfo_exists to the original functionality of
checking whether the file pointed to by the link exists.
This reverts commit df1c98ab6d6171ed63729195bd190b54b67fe530.
As 8a27b6b reverts the exposition of struct stat to the external API, we
do not need - indeed, do not want - struct stat to be in the outer
include layer.
00582bc introduced a change that required the caller of
git_blob_create_fromfile() to pass a struct stat with the stat
information for the file. Several developers pointed out that this would
make life hard for the bindings developers as struct stat isn't widely
supported by other languages.
Make git_blob_create_fromfile() stat the path itself, eliminating the
need for the file to be stat'ed by the caller. This makes
index_init_entry() more costly as the file will be stat'ed twice but
makes life easier for everyone else.
00582bcb introduced a change to git_blob_create_fromfile() that required
the caller to pass a stat struct. This means that we need to include
stat.h higher in the hierarchy of includes.
The entry mode flags for an entry created from a path name were not
correctly written if the entry was a symlink. The st_mode of a statted
symlink is 0120777, however git requires the mode to read 0120000,
because it does not care about permissions of symlinks.
Introduce index_create_mode() that correctly writes the mode flags in
the form expected by git.
gitfo_exists() used to error out if the given file was a symbolic link,
due to access() returning an error code. This is not expected behaviour,
as gitfo_exists() should only check whether the file itself exists, not
its link target if it is a symbolic link.
Fix this by calling gitfo_lstat() instead, which is just a wrapper for
lstat().
Also fix the same error in index_init_entry().
In order to be able to write symlinks with git_blob_create_fromfile(),
we need to check whether the file to be written is a symbolic link or
not. Since the calling function of git_blob_create_fromfile() is likely to have
stated the file before calling, we make it pass the stat.
The reason for this is that writing symbolic link blobs is significantly
different from writing ordinary files - we do not want to open the link
destination but instead want to write the link itself, regardless of
whether it exists or not.
Previously, index_init_entry() used to error out if the file to be added
was a symlink that pointed to a nonexistent file. Fix this behaviour to
add the file regardless of whether it exists. This mimics git.git's
behaviour.
As suggested by carlosmn, git_oid_ncmp would probably
be a better name than git_oid_match, for it does the same
as git_oid_cmp but only up to a certain amount of hex digits.
Feature Added: Search an unmerged entry by path (git_index_get_unmerged
renamed to git_index_get_unmerged_bypath) or by index (git_index_get_unmerged_byindex).
Now the ceiling_dirs are compared with their symbolic free version (like base_path).
The ceiling dirs check is now performed after getting the parent directory.
retrieve_device returns the file device for a given path (so that we can detect device change while walking through parent directories).
abspath returns a canonicalized path, symbolic link free.
retrieive_ceiling_directories_offset returns the biggest path offset that path match in the ceiling directory list (so that we can stop at ceiling directories).
Implemented find_unique_short_oid for pack backend, based on git sha1 lookup method;
finding an object given its full oid is just a particular case of searching
the unique object matching an oid prefix (short oid).
Added git_odb_read_unique_short_oid, which iterates over all the backends to
find and read the unique object matching the given oid prefix.
Added a git_object_lookup_short_oid method to find the unique object in
the repository matching a given oid prefix : it generalizes git_object_lookup
which now does nothing but calls git_object_lookup_short_oid.
A TREE extension with an entry count of -1 means that it was
invalidated and we should ignore it. Do so instead of returning an
error.
This fixes issue #202
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
There are two reasons why read_tree_internal might return a NULL
tree. The first one is a corrupt index, but the second one is an
invalidated TREE extension. Up to now, its only way to communicate
with its caller was through the return value being NULL or not.
Allow read_tree_internal to report its exit status independently from
the tree pointer.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Before this commit, malformed tag and signature were considered as
valid by the parser. See the test t3800-mktag.sh of git to see example
of malformed tag and signature.
If parse_section_header{,_ext} return an error, current_section
doesn't get allocated. Set it to NULL after freeing so we don't try to
free it again.
This fixes part 2-2 of Issue #210.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Simplify cfg_readline and at the same time fix it so that it does
really ignore empty lines.
This fixes point 2-1 of Issue #210
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
On error, it would free the configuration object even though it didn't
own that memory, which would cause a double-free.
This fixes the first part of Issue #210
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
"git_config_backend" have been renamed to "git_config_file", which
implements a generic interface to access a configuration file -- be it
either on disk, from a DB or whatever mumbojumbo.
I think this makes more sense.
Regarding "initialize" vs. "initialise", www.dict.cc says the first is American
English whereas the latter in British English. For consistency, we should
stick to American English.
When I changed it over to use different strings for the variable and
the name, cvar_name_normalize was left behind. Fix this and rename to
cvar_normalize_name to reflect the incompatible change.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Now we use a simple closed-addressing cache. Cuckoo hashing was creating
too many issues with race conditions. Fuck that.
Let's see what happens performance wise, we may have to roll back or
come up with another way to implement an efficient multi-threaded cache.
Configuration options can come from different sources. Currently,
there is only support for reading them from a flat file, but it might
make sense to read it from a database at some point.
Move the parsing code into src/config_file.c and create an include
file include/git2/config_backend.h to allow for other backends to be
developed.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Ok, this is the real deal. Hopefully. Here's how it's going to work:
- One main method, called `git__throw`, that sets the error
code and error message when an error happens.
This method must be called in every single place where an error
code was being returned previously, setting an error message
instead.
Example, instead of:
return GIT_EOBJCORRUPTED;
Use:
return git__throw(GIT_EOBJCORRUPTED,
"The object is missing a finalizing line feed");
And instead of:
[...] {
error = GIT_EOBJCORRUPTED;
goto cleanup;
}
Use:
[...] {
error = git__throw(GIT_EOBJCORRUPTED, "What an error!");
goto cleanup;
}
The **only** exception to this are the allocation methods, which
return NULL on failure but already set the message manually.
/* only place where an error code can be returned directly,
because the error message has already been set by the wrapper */
if (foo == NULL)
return GIT_ENOMEM;
- One secondary method, called `git__rethrow`, which can be used to
fine-grain an error message and build an error stack.
Example, instead of:
if ((error = foobar(baz)) < GIT_SUCCESS)
return error;
You can now do:
if ((error = foobar(baz)) < GIT_SUCCESS)
return git__rethrow(error, "Failed to do a major operation");
The return of the `git_lasterror` method will be a string in the
shape of:
"Failed to do a major operation. (Failed to do an internal
operation)"
E.g.
"Failed to open the index. (Not enough permissions to access
'/path/to/index')."
NOTE: do not abuse this method. Try to write all `git__throw`
messages in a descriptive manner, to avoid having to rethrow them to
clarify their meaning.
This method should only be used in the places where the original
error message set by a subroutine is not specific enough.
It is encouraged to continue using this style as much possible to
enforce error propagation:
if ((error = foobar(baz)) < GIT_SUCCESS)
return error; /* `foobar` has set an error message, and
we are just propagating it */
The error handling revamp will take place in two phases:
- Phase 1: Replace all pieces of code that return direct error codes
with calls to `git__throw`. This can be done semi-automatically
using `ack` to locate all the error codes that must be replaced.
- Phase 2: Add some `git__rethrow` calls in those cases where the
original error messages are not specific enough.
Phase 1 is the main goal. A minor libgit2 release will be shipped once
Phase 1 is ready, and the work will start on gradually improving the
error handling mechanism by refining specific error messages.
OTHER NOTES:
- When writing error messages, please refrain from using weasel
words. They add verbosity to the message without giving any real
information. (<3 Emeric)
E.g.
"The reference file appears to be missing a carriage return"
Nope.
"The reference file is missing a carriage return"
Yes.
- When calling `git__throw`, please try to use more generic error
codes so we can eventually reduce the list of error codes to
something more reasonable. Feel free to add new, more generic error
codes if these are going to replace several of the old ones.
E.g.
return GIT_EREFCORRUPTED;
Can be turned into:
return git__throw(GIT_EOBJCORRUPTED,
"The reference is corrupted");
Win32 critical section objects (CRITICAL_SECTION) are not kernel objects.
Only kernel objects are destroyed by using CloseHandle. Critical sections
are supposed to be deleted with the DeleteCriticalSection API
(http://msdn.microsoft.com/en-us/library/ms682552(VS.85).aspx).
The section and variable names use different rules, so store them as
two different variables internally.
This will simplify the configuration-writing code as well later on,
but even with parsing, the code is simpler.
Take this opportunity to add a variable to the list directly when
parsing instead of passing through config_set.
A root commit is a commit whose branch (usually what HEAD points to)
doesn't exist (yet). This situation can happen when the commit is the
first after 1) a repository is initialized or 2) a orphan checkout has
been performed.
Take this opportunity to remove the symbolic link check, as
git_reference_resolve works on OID refs as well.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Typical use is git_reference_resolve(&ref, ref). Currently, if there is
an error, ref will point to NULL, causing the user to lose that
reference.
Always update resolved_ref instead of just on finding an OID ref,
storing the last valid reference in it.
This change helps simplify the code for allowing root commits.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Magic constant replaced by direct to-string covertion because of:
1) with value length 6 (040000 - subtree) final tree will be corrupted;
2) for wrong values length <6 final tree will be corrupted too.
Removed the optional `replace` argument, we now have 4 add methods:
`git_index_add`: add or update from path
`git_index_add2`: add or update from struct
`git_index_append`: add without replacing from path
`git_index_append2`: add without replacing from struct
Yes, this breaks the bindings.