It is not legal inside our `p_mmap` function to mmap a zero length
file. This adds a test that exercises that case inside diff and
fixes the code path where we would try to do that.
The fix turns out not to be a lot of code since our default file
content is already initialized to "" which works in this case.
Fixes#1210
This moves the implementation of these two APIs into common code
that will be shared between the two. Also, this adds tests for
the `git_diff_blob_to_buffer` API. Lastly, this adds some extra
`const` to a few places that can use it.
Before this, we error out from `reference_matches_remote_head` if the
reference we're searching for does not exist.
Since we explicitly check if master is existing in `update_head_to_remote`
and error out if it doesn't, a repository without master branch could
not be cloned.
In fact this was later clobbered by what is fixed in #1194.
However, this patch introduces a `found` member in `head_info` and sets
it accordingly. That also saves us from checking the string length of
`branchname` a few times.
As a function that appears to only be called on error paths, I don't
think it makes sense for it to return an error, or clobber the global
giterr. Note that no existing callsites actually check the return
code.
In my own application, there are errors where the real error ends
up being hidden, as git_mwindow_file_deregister() clobbers the
global giterr. I'm not sure this error is even relevant?
I saw a repo in the wild today which had a master branch ref which was packed, but had no trailing newline. Git handled it fine, but libgit2 choked on it. Fix seems simple enough. If we don't see a newline, assume the end of the buffer is the end of the ref line.
There are a couple of checkout bugs fixed here. One is with
untracked working directory entries that are prefixes of tree
entries but not in a meaningful way (i.e. "read" is a prefix of
"readme.txt" but doesn't interfere in any way). The second bug
is actually a redo of 07edfa0fc640f85f95507c3101e77accd7d2bf0d
where directory entries in the index that are not in the diff
were not being removed correctly. That fix remedied one case
but broke another.
When checking out with the GIT_CHECKOUT_REMOVE_UNTRACKED option
and there was an entire tree in the working directory and in the
index that is not in the baseline nor target commit, the tree was
correctly(?) removed from the working directory but was not
successfully removed from the index. This fixes that and adds a
test of the functionality.
This moves a lot of the detailed checkout documentation into a new
file (docs/checkout-internals.md) and simplifies the public docs
for the checkout API.
There were a bunch of small bugs in the checkout code where I was
assuming that a typechange was always from a tree to a blob or
vice versa. This fixes up most of those cases. Also, there were
circumstances where the submodule definitions were changed by the
checkout and the submodule data was not getting reloaded properly
before the new submodules were checked out.
The notifications were broken from the various iterations over
this code and were not returning working dir item data correctly.
Also, workdir items that were alphabetically after the last item
in diff were not being processed.
The spoolandsort iterator changes got sort-of cherry picked out of
this branch and so I dropped the commit when rebasing; however,
there were a few small changes that got dropped as well (since the
version merged upstream wasn't quite the same as what I dropped).
This adds a new API to the submodule interface that just returns
where information about the submodule was found (e.g. config file
only or in the HEAD, index, or working directory).
Also, the old "refresh" call was potentially keeping some stale
submodule data around, so this simplfies that code and literally
discards the old cache, then reallocates.
Stash was sometimes obscuring the actual error code, replacing it
with a -1 when there was more descriptive value. This updates
stash to preserve the original error code more reliably along
with a variety of other error handling tweaks.
I believe this is an improvement, but arguably, preserving the
underlying error code may result in values that are harder to
interpret by the caller who does not understand the internals.
Discussion is welcome!
Previously a NULL oid was handled like an empty buffer and
returned a status empty string. This makes git_oid_tostr()
set the output buffer to the empty string instead.
Make checkout update entries in the index for all files that are
updated and/or removed, unless flag GIT_CHECKOUT_DONT_UPDATE_INDEX
is given. To do this, iterators were extended to allow a little
more introspection into the index being iterated over, etc.
This flips checkout back to be driven off the changes between
the baseline and the target trees. This reinstates the complex
code for tracking the contents of the working directory, but
overall, I think the resulting logic is easier to follow.
I've tried to map out the detailed behaviors of checkout and make
sure that we're handling the various cases correctly, along with
providing options to allow us to emulate "git checkout" and "git
checkout-index" with the various flags. I've thrown away flags
in the checkout API that seemed like clutter and added some new
ones. Also, I've converted the conflict callback to a general
notification callback so we can emulate "git checkout" output and
display "dirty" files.
As of this commit, the new behavior is not working 100% but some
of that is probably baked into tests that are not testing the
right thing. This is a decent snapshot point, I think, along the
way to getting the update done.
This corrects the order of operations in git reset so that the
checkout to reset the working directory content is done before
the HEAD is moved. This allows us to use the HEAD and the index
content to know what files can / should safely be reset.
Unfortunately, there are still some cases where the behavior of
this revision differs from core git. Notable, a file which has
been added to the index but is not present in the HEAD is
considered to be tracked by core git (and thus removable by a
reset command) whereas since this loads the target state into
the index prior to resetting, it will consider such a file to be
untracked and won't touch it. That is a larger fix that I'll
defer to a future commit.
* gen_pktline() in smart_protocol.c was skipping refspecs that deleted
refs that were not advertised by the server. The new behavior is to
send a delete command with an old-id of zero, which matches the behavior
of the official git client.
* Update test_network_push__delete() in reaction to above fix.
* Obviate messy logic that handles missing push_spec rrefs by canonicalizing
push_spec. After calculate_work(), loid, roid, and rref, are filled in with
exactly what is sent to the server
The original libpqueue file were licensed under Apache 2.0 so
therefore should retain their copyrights and header as per the
license terms at http://www.apache.org/licenses/LICENSE-2.0
When normalizing a reference name, if there is an error because
the name is invalid, then the memory allocated for storing the
name could be leaked if the caller was not careful and assumed
that the error return code meant that no allocation had occurred.
This fixes that by explicitly deallocating the reference name
buffer if there is an error in normalizing the name.
An earlier change to `git_diff_from_iterators` introduced a
memory leak where the allocated spoolandsort iterator was not
returned to the caller and thus not freed.
One proposal changes all iterator APIs to use git_iterator** so
we can reallocate the iterator at will, but that seems unexpected.
This commit makes it so that an iterator can be changed in place.
The callbacks are isolated in a separate structure and a pointer
to that structure can be reassigned by the spoolandsort extension.
This means that spoolandsort doesn't create a new iterator; it
just allocates a new block of callbacks (along with space for its
own extra data) and swaps that into the iterator.
Additionally, since spoolandsort is only needed to switch the
case sensitivity of an iterator, this simplifies the API to only
take the ignore_case boolean and to be a no-op if the iterator
already matches the requested case sensitivity.
The diff constructor functions had some confusing names, where the
"old" side of the diff was coming after the "new" side. This
reverses the order in the function name to make it less confusing.
Specifically...
* git_diff_index_to_tree becomes git_diff_tree_to_index
* git_diff_workdir_to_index becomes git_diff_index_to_workdir
* git_diff_workdir_to_tree becomes git_diff_tree_to_workdir
According to man 3 SSL_shutdown / TLS, "If a unidirectional shutdown is
enough (the underlying connection shall be closed anyway), this first
call to SSL_shutdown() is sufficient."
Currently, an unidirectional shutdown is enough, since
gitno_ssl_teardown is called by gitno_close only. Do so to avoid further
errors (by misbehaving peers for example).
Fixes#1129.
While C Git has been writing entry count -1 (ie. never other negative
numbers) as invalid since day 1, it accepts all negative entry counts
as invalid. JGit follows the same rule. libgit2 should also follow, or
the index that works with C Git or JGit may someday be rejected by
libgit2.
Other reimplementations like dulwich and grit have not bothered with
parsing or writing tree cache.
The `git_iterator_reset` command has not been working in all cases
particularly when there is a start and end range. This fixes it
and adds tests for it, and also extends it with the ability to
update the start/end range strings when an iterator is reset.
This removes the need to explicitly pass the repo into iterators
where the repo is implied by the other parameters. This moves
the repo to be owned by the parent struct. Also, this has some
iterator related updates to the internal diff API to lay the
groundwork for checkout improvements.
If commit timestamps are off, we're more likely to hit a traversal
where the first path ends up traversing past the root commit of the tree.
If that happens, it's possible that the loop will complete before the second
path marks some of those final parents. This fix keeps track of the root
nodes that are encountered in the traversal, and verify that they are
properly marked.
In the best case, with accurate timestamps, the traversal will continue
to terminate when all the commits are STALE (parents of a merge-base), as
it did before. In the worst case, where one path makes a complete traversal
past a root commit, we will continue the loop until the root commit itself
is marked.
This could also use PTHREAD_MUTEX_INITIALIZER, but a dynamic initializer seems like a more portable concept, and we won't need another #define on top of git_mutex_init()
Storing 4kB or 8kB in the stack is not very gentle. As this part has
to be linear, put the buffer into the indexer object so we allocate it
once in the heap.
There are many different broken filemodes in the wild so we need to
protect against them and give something useful up the chain. Don't
fail when reading a tree from the ODB but normalize the mode as best
we can.
As 664 is no longer a mode that we consider to be valid and gets
normalized to 644, we can stop accepting it in the treebuilder. The
library won't expose it to the user, so any invalid modes are a bug.
To paraphrase @peff:
You can get both size and type from a packed object reasonably cheaply.
If you have:
* An object that is not a delta; both type and size are available in the
packfile header.
* An object that is a delta. The packfile type will be OBJ_*_DELTA, and
you have to resolve back to the base to find the real type. That means
potentially a lot of packfile index lookups, but each one is
relatively cheap. For the size, you inflate the first few bytes of the
delta, whose header will tell you the resulting size of applying the
delta to the base.
For simplicity, we just decompress the whole delta for now.