* gen_pktline() in smart_protocol.c was skipping refspecs that deleted
refs that were not advertised by the server. The new behavior is to
send a delete command with an old-id of zero, which matches the behavior
of the official git client.
* Update test_network_push__delete() in reaction to above fix.
* Obviate messy logic that handles missing push_spec rrefs by canonicalizing
push_spec. After calculate_work(), loid, roid, and rref, are filled in with
exactly what is sent to the server
The original libpqueue file were licensed under Apache 2.0 so
therefore should retain their copyrights and header as per the
license terms at http://www.apache.org/licenses/LICENSE-2.0
When normalizing a reference name, if there is an error because
the name is invalid, then the memory allocated for storing the
name could be leaked if the caller was not careful and assumed
that the error return code meant that no allocation had occurred.
This fixes that by explicitly deallocating the reference name
buffer if there is an error in normalizing the name.
An earlier change to `git_diff_from_iterators` introduced a
memory leak where the allocated spoolandsort iterator was not
returned to the caller and thus not freed.
One proposal changes all iterator APIs to use git_iterator** so
we can reallocate the iterator at will, but that seems unexpected.
This commit makes it so that an iterator can be changed in place.
The callbacks are isolated in a separate structure and a pointer
to that structure can be reassigned by the spoolandsort extension.
This means that spoolandsort doesn't create a new iterator; it
just allocates a new block of callbacks (along with space for its
own extra data) and swaps that into the iterator.
Additionally, since spoolandsort is only needed to switch the
case sensitivity of an iterator, this simplifies the API to only
take the ignore_case boolean and to be a no-op if the iterator
already matches the requested case sensitivity.
The diff constructor functions had some confusing names, where the
"old" side of the diff was coming after the "new" side. This
reverses the order in the function name to make it less confusing.
Specifically...
* git_diff_index_to_tree becomes git_diff_tree_to_index
* git_diff_workdir_to_index becomes git_diff_index_to_workdir
* git_diff_workdir_to_tree becomes git_diff_tree_to_workdir
According to man 3 SSL_shutdown / TLS, "If a unidirectional shutdown is
enough (the underlying connection shall be closed anyway), this first
call to SSL_shutdown() is sufficient."
Currently, an unidirectional shutdown is enough, since
gitno_ssl_teardown is called by gitno_close only. Do so to avoid further
errors (by misbehaving peers for example).
Fixes#1129.
While C Git has been writing entry count -1 (ie. never other negative
numbers) as invalid since day 1, it accepts all negative entry counts
as invalid. JGit follows the same rule. libgit2 should also follow, or
the index that works with C Git or JGit may someday be rejected by
libgit2.
Other reimplementations like dulwich and grit have not bothered with
parsing or writing tree cache.
The `git_iterator_reset` command has not been working in all cases
particularly when there is a start and end range. This fixes it
and adds tests for it, and also extends it with the ability to
update the start/end range strings when an iterator is reset.
This removes the need to explicitly pass the repo into iterators
where the repo is implied by the other parameters. This moves
the repo to be owned by the parent struct. Also, this has some
iterator related updates to the internal diff API to lay the
groundwork for checkout improvements.
If commit timestamps are off, we're more likely to hit a traversal
where the first path ends up traversing past the root commit of the tree.
If that happens, it's possible that the loop will complete before the second
path marks some of those final parents. This fix keeps track of the root
nodes that are encountered in the traversal, and verify that they are
properly marked.
In the best case, with accurate timestamps, the traversal will continue
to terminate when all the commits are STALE (parents of a merge-base), as
it did before. In the worst case, where one path makes a complete traversal
past a root commit, we will continue the loop until the root commit itself
is marked.
This could also use PTHREAD_MUTEX_INITIALIZER, but a dynamic initializer seems like a more portable concept, and we won't need another #define on top of git_mutex_init()
Storing 4kB or 8kB in the stack is not very gentle. As this part has
to be linear, put the buffer into the indexer object so we allocate it
once in the heap.
There are many different broken filemodes in the wild so we need to
protect against them and give something useful up the chain. Don't
fail when reading a tree from the ODB but normalize the mode as best
we can.
As 664 is no longer a mode that we consider to be valid and gets
normalized to 644, we can stop accepting it in the treebuilder. The
library won't expose it to the user, so any invalid modes are a bug.
To paraphrase @peff:
You can get both size and type from a packed object reasonably cheaply.
If you have:
* An object that is not a delta; both type and size are available in the
packfile header.
* An object that is a delta. The packfile type will be OBJ_*_DELTA, and
you have to resolve back to the base to find the real type. That means
potentially a lot of packfile index lookups, but each one is
relatively cheap. For the size, you inflate the first few bytes of the
delta, whose header will tell you the resulting size of applying the
delta to the base.
For simplicity, we just decompress the whole delta for now.
A mmap-window is not guaranteed to give you the whole object, but the
indexer currently assumes so.
Loop asking for more data until we've successfully CRC'd all of the
packed data.
Up to now, deltas needed to be enterily in the packfile, and we tried
to decompress then in their entirety over and over again.
Adjust the logic so we read them as they come, just as we do for full
objects. This also allows us to simplify the logic and have less
nested code. The delta resolving phase still needs to decompress the
whole object into memory, as there is not yet any streaming
delta-apply support, but it helps in speeding up the downloading
process and reduces the amount of memory allocations we need to do.
The new API allows us to read the object bit by bit from the packfile,
instead of needing it all at once in the packfile. This also allows us
to hash the object as it comes in from the network instead of having
to try to read it all and failing repeatedly for larger objects.
This is only the first step, but it already shows huge improvements
when dealing with objects over a few megabytes in size. It reduces the
memory needs in some cases, but delta objects still need to be
completely in memory and the old inefficent method is still used for
that.
`revwalk.h:commit_lookup()` -> `git_revwalk__commit_lookup()`
and make `git_commit_list_parse()` do real error checking that
the item in the list is an actual commit object. Also fixed an
apparent typo in a test name.
Moved it into graph.{c,h} which i created for the new "graph"
functions namespace. Also adjusted the function prototype
to use `size_t` and `const git_oid *`.
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc). This groups all those functions together into a
new file and converts the code to use that.
This has two enhancements to existing functionality. The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).
clang-SVN HEAD kindly provided my the info, that sm_repo maybe
uninitialized when we want to free it (If the expression in line 358 or
359/360 evaluate to true, we jump to "cleanup", where we'd use sm_repo
uninitialized).
This fixes some missed places where we can apply const-ness to
various public APIs.
There are still some index and tree APIs that cannot take const
pointers because we sort our `git_vectors` lazily and so we can't
reliably bsearch the index and tree content without applying a
`git_vector_sort()` first.
This also fixes some missed places where size_t can be used and
where const can be applied to a couple internal functions.
This makes the diff functions that take callbacks both take
the payload parameter after the callback function pointers and
pass the payload as the last argument to the callback function
instead of the first. This should make them consistent with
other callbacks across the API.
3f9eb1e introduced support for SSL certificates issued for IP
addresses, making use of in_addr and in_addr6 structs. On FreeBSD
these are defined in (a file included in) <netinet/in.h>, so include
that file on FreeBSD and get the build working again.
The workdir iterator has always tried to ignore .git files, but
it turns out there were some bugs. This makes it more robust at
ignoring .git files.
This also makes iterators always check ".git" case insensitively
regardless of the properties of the system. This will make libgit2
skip ".GIT" and the like. This is different from core git, but on
systems with case insensitive but case preserving file systems,
allowing ".GIT" to be added is problematic.
This checks for a leading '.' before looking for the invalid
tree entry names. Even on pretty high levels of optimization,
this seems to make a measurable improvement.
I accidentally used && in the check initially instead of || and
while debugging ended up improving the error reporting of issues
with adding tree entries. I thought I'd leave those changes, too.
A number of diff APIs and the `git_checkout_index` API take a
`git_repository` object an operate on the index. This updates
them to take a `git_index` pointer explicitly and only fall back
on the `git_repository` index if the index input is NULL. This
makes it easier to operate on a temporary index.
The index iterator could previously only be created from a repo
object, but this allows creating an iterator from a `git_index`
object instead (while keeping, though renaming, the old function).
The existing p_lstat implementation on win32 is not quite POSIX
compliant when setting errno to ENOTDIR. This adds an option to
make is be compliant so that code (such as checkout) that cares
to have separate behavior for ENOTDIR can use it portably.
This also contains a couple of other minor cleanups in the
posix_w32.c implementations to avoid unnecessary work.
Using the builtin strcmp and strcasecmp as function pointers is
problematic on win32. This adds internal implementations and
divorces us from the platform linkage.
Returning NULL for the string when we haven't signaled an error
condition is counter-intuitive and causes unnecessary edge
cases. Return an empty string when asking for a string value for a
configuration variable such as '[section] var' to avoid these edge
cases.
If the distinction between no value and an empty value is needed, this
can be retrieved from the entry directly. As a side-effect, this
change stops the int parsing functions from segfaulting on such a
variable.