This lock/unlock pair allows for the cller to lock a configuration file
to avoid concurrent operations.
It also allows for a transactional approach to updating a configuration
file. If multiple updates must be made atomically, they can be done
while the config is locked.
When a configuration file is locked, any updates made to it will be done
to the in-memory copy of the file. This allows for multiple updates to
happen while we hold the lock, preventing races during complex
config-file manipulation.
Instead of writing into the filebuf directly, make the functions to
write the modified config file write into a buffer which can then be
dumped into the lockfile for committing.
This allows us to re-use the same code for modifying a locked
configuration, as we can simply skip the last step of dumping the data
to disk.
When we're looking to update a tag, we can't stop if the tag auto-follow
rules don't say to update it. The tag might still match the refspec we
were given.
When curl uses a proxy, it will only use Basic unless we prompt it to
try to use the most secure on it has available.
This is something which git did recently, and it seems like a good idea.
With Visual Studio versions 2008 and older they ignore the full path to files and only check
the basename of the file to find a collision. Additionally, having duplicate basenames can break
other build tools like GYP.
This fixes https://github.com/libgit2/libgit2/issues/3356
Allow restoring a previously captured oom error, by
detecting when the captured message pointer points to the
static oom error message. This means there is no need
to strdup the message in giterr_detach.
Error messages that are detached are assumed to be dynamically
allocated. Passing a pointer to the static oom error message
can cause an attempt to free the static buffer later. This change
checks if the oom error message is about to be detached and detaches
a copy instead.
Instead of allocating a brand new buffer for each error string we want
to store, we can use a per-thread buffer to store the error string and
re-use the underlying storage. We already use the buffer to format the
string, so this mostly makes that more direct.
An error here will typically mean that the directory was removed between
the time we iterated the parent and the time we wanted to visit it in
which case we should ignore it.
Other kinds of errors such as permissions (or transient errors) also
better dealt with by pretending we didn't see it.
When we have an error renaming the lockfile, we need to make sure
that we remove it upon cleanup. For this, we need to keep track of
whether we opened the file and whether the rename succeeded.
If we did create the lockfile but the rename did not succeed, we
remove the lockfile. This won't protect against all errors, but
the most common ones (target file is open) does get handled.
The header src/cc-compat.h defines portable format specifiers PRIuZ, PRIdZ, and PRIxZ. The original report highlighted the need to use these specifiers in examples/network/fetch.c. For this commit, I checked all C source and header files not in deps/ and transitioned to the appropriate format specifier where appropriate.
Removing a reflog upon ref deletion is something which only some
backends might wish to do. Backends which are database-backed may wish
to archive a reflog, log-based ones may not need to do anything.
If we get the path from the gitmodules file, look up the submodule we're
interested in by path, rather then by name. Otherwise we might get
duplicate results.
Upgrade xdiff to version used in core git 2.4.5 (0df0541).
Corrects an issue where an LF is added at EOF while applying
an unrelated change (ba31180), cleans up some unused code (be89977 and
e5b0662), and provides an improved callback to avoid leaking internal
(to xdiff) structures (467d348).
This also adds some additional functionality that we do not yet take
advantage of, namely the ability to ignore changes whose lines are
all blank (36617af).
Versions prior to 0.9.8f did not have this function, rhel/centos5 are still on a
heavily backported version of 0.9.8e and theoretically supported until March 2017
Without this ifdef, I get the following link failure:
```
CMakeFiles/libgit2_clar.dir/src/openssl_stream.c.o: In function `openssl_connect':
openssl_stream.c:(.text+0x45a): undefined reference to `SSL_set_tlsext_host_name'
collect2: error: ld returned 1 exit status
make[6]: *** [libgit2_clar] Error 1
```
Introduce `git__getenv` which is a UTF-8 aware `getenv` everywhere.
Make `cl_getenv` use this to keep consistent memory handling around
return values (free everywhere, as opposed to only some platforms).
The regex we use to look at the gitmodules file does not correctly
delimit the name of submodule which we want to look up and puts '.*'
straight after the name, maching on any submodule which has the seeked
submodule as a prefix of its name.
Add the missing '\.' in the regex so we want a full stop to exist both
before and after the submodule name.
Allow custom filters with wildcard attributes, so that clients
can support some random `filter=foo` in a .gitattributes and look
up the corresponding smudge/clean commands in the configuration file.
Function was added in commit 2c982daa2e on October 5, 2011,
and removed in commit 41fb1ca0ec on October 29, 2012.
Given the length of time it's gone unused, it's safe to remove now.
This reverts commit 969d4b703c.
This was a fluke from Coverity. The length to all the APIs in the
library is supposed to be passed in as nibbles, not bytes. Passing it as
bytes would prevent us from parsing uneven-sized SHA1 strings.
Also, the rest of the library was still using nibbles (including
revparse and the odb_prefix APIs), so this change was seriously breaking
things in unexpected ways. ^^
When diffing the index with the workdir and GIT_DIFF_UPDATE_INDEX has been passed,
the previous implementation was always writing the index to disk even if it wasn't
modified.
When a refspec contains no rhs and thus won't cause an explicit update,
we skip all the logic, but that means that we don't update FETCH_HEAD
with it, which is what the implicit rhs is.
Add another bit of logic which puts those remote heads in the list of
updates so we put them into FETCH_HEAD.
When we don't own a buffer (asize=0) we currently allow the usage of
grow to copy the memory into a buffer we do own. This muddles the
meaning of grow, and lets us be a bit cavalier with ownership semantics.
Don't allow this any more. Usage of grow should be restricted to buffers
which we know own their own memory. If unsure, we must not attempt to
modify it.
Always set `GIT_DIFF_PATCH_DIFFABLE` for all files, regardless of
binary-ness, so that the binary callback is invoked to either
show the binary contents, or just print the standard "Binary files
differ" message. We may need to do deeper inspection for binary
files where we have avoided loading the contents into a file map.
If the libcurl stream is available, use that as the underlying stream
instead of the socket stream. This allows us to set a proxy for HTTPS
connections.
The TLS streams talk over the curl stream themselves, so we don't need
to ask for it explicitly. Do so in the case of the non-encrypted one so
we can still make use proxies in that case.
When linking against libcurl, use it as the underlying transport instead
of straight sockets. We can't quite just give over the file descriptor,
as curl puts it into non-blocking mode, so we build a custom BIO so
OpenSSL sends the data through our stream, be it the socket or curl
streams.
If the stream claims to support this feature, we can let the transport
set the proxy.
We also set HTTPPROXYTUNNEL option so curl can create a tunnel through
the proxy which lets us create our own TLS session (if needed).
cURL has a mode in which it acts a lot like our streams, providing send
and recv functions and taking care of the TLS and proxy setup for us.
Implement a new stream which uses libcurl instead of raw sockets or the
TLS libraries directly. This version does not support reporting
certificates or proxies yet.
When stashing the workdir tree, examine the index as well. Using
a mechanism similar to `git_diff_tree_to_workdir_with_index`
allows us to determine that a file was added in the index and
subsequently modified in the working directory. Without examining
the index, we would erroneously believe that this file was
untracked and fail to include it in the working directory tree.
Use a slightly modified `git_diff_tree_to_workdir_with_index` in
order to avoid some of the behavior custom to `git diff`. In
particular, be sure to include the working directory side of a
file when it was deleted in the index.
This is something we do on re-init but not when opening a
repository. This hasn't particularly mattered up to now as the version
has been 0 ever since the first release of git, but the times, they're
a-changing and we will soon see version 1 in the wild. We need to make
sure we don't open those.
If an index entry for a file that is not in HEAD is in conflicted state,
when diffing HEAD with the index, the status field of the corresponding git_diff_delta was incorrectly reported as GIT_DELTA_ADDED instead of GIT_DELTA_CONFLICTED.
This was due to handle_unmatched_new_item() initially setting the status
to GIT_DELTA_CONFLICTED but then overriding it later with GIT_DELTA_ADDED.
We currently do not handle those enum values which require us to set
"true" or unset variables in all cases. Use a common function which does
understand this by looking at our mapping directly.
Similarly to the other ones. In this test we copy over testing
`RECURSE_YES` which shows an error in our handling of the `YES` variant
which we may have to port to the rest.
During the cache deletion, the check for whether we consider a submodule
to exist got changed regarding submodules which are in the worktree but
not configured.
Instead of checking for the url field to be populated, check the
location where we've found it.
This lets us specify in the status call which ignore rules we want to
use (optionally falling back to whatever the submodule has in its
configuration).
This removes one of the reasons for having `_set_ignore()` set the value
in-memory. We re-use the `IGNORE_RESET` value for this as it is no
longer relevant but has a similar purpose to `IGNORE_FALLBACK`.
Similarly, we remove `IGNORE_DEFAULT` which does not have use outside of
initializers and move that to fall back to the configuration as well.
As submodules are becomes more like values, we should not let a status
check to update its properties. Instead of taking a submodule, have
status take a repo and submodule name.
Having this cache and giving them out goes against our multithreading
guarantees and it makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
This allows the user to look up fields which we don't parse in libgit2,
and allows them to access gpgsig or mergetag fields if they wish to
check the signature.
When an entry has a racy timestamp, we need to check whether the file
itself has changed since we put its entry in the index. Only then do we
smudge the size field to force a check the next time around.
When a file on the workdir has the same or a newer timestamp than the
index, we need to perform a full check of the contents, as the update of
the file may have happened just after we wrote the index.
The iterator changes are such that we can reach inside the workdir
iterator from the diff, though it may be better to have an accessor
instead of moving these structs into the header.
When updating the index during a diff, preserve the original mode,
which prevents us from dropping the mode to what we have interpreted
as on our system (eg, what the working directory claims it to be,
which may be a lie on some systems.)
This is used by the submodule in order to figure out if the index has
changed since it last read it. Using a timestamp is racy, so let's make
it use the checksum, just like we now do for reloading the index itself.
We currently use a timetamp to check whether an index file has been
modified since we last read it, but this is racy. If two updates happen
in the same second and we read after the first one, we won't detect the
second one.
Instead read the SHA-1 checksum of the file, which are its last 20 bytes which
gives us a sure-fire way to detect whether the file has changed since we
last read it.
As we're now keeping track of it, expose an accessor to this data.
When checking out some file 'foo' that has been modified in the
working directory, allow the checkout to proceed (do not conflict)
if 'foo' is identical to the target of the checkout.
In order to avoid racy-git, we zero out the file size for entries with
the same timestamp as the index (or during the initial checkout). This
is the case in a couple of crlf tests, as the code is fast enough to do
everything in the same second.
As we know that we do not perform the modification just after writing
out the index, which is what this is designed to work around, tick the
mtime of the index file such that it doesn't agree with the files
anymore, and we do not zero out these entries.
If a file entry has the same timestamp as the index itself, it is
considered racily-clean, as it may have been modified after the index
was written, but during the same second. We take extra steps to check
the contents, but this is just one part of avoiding races.
For files which do have changes but have not been updated in the index,
updating the on-disk index means updating its timestamp, which means we
would no longer recognise these entries as racy and we would trust the
timestamp to tell us whether they have changed.
In order to work around this, git zeroes out the file-size field in
entries with the same timestamp as the index in order to force the next
diff to check the contents. Do so in libgit2 as well.
Arguably all uses of readdir_r are unnecessary, but in this case
especially so, as the directory handle only exists within this function,
so we don't race with anybody.
Introduce a new binary diff callback to provide the actual binary
delta contents to callers. Create this data from the diff contents
(instead of directly from the ODB) to support binary diffs including
the workdir, not just things coming out of the ODB.
The read and write callbacks passed to SSLSetIOFuncs() have been
rewritten to match the implementation used on opensource.apple.com and
other open source projects like VLC.
This change also fixes a bug where the read callback could get into
an infinite loop when 0 bytes were read.
Some tools create multiple author fields. git is rather lax when parsing
them, although fsck does complain about them. This means that they exist
in the wild.
As it's not too taxing to check for them, and there shouldn't be a
noticeable slowdown when dealing with correct commits, add logic to skip
over these extra fields when parsing the commit.
When we hit an error writing to the next stream from a file, we jump to
'done' which currently skips over closing the file descriptor.
Make sure to close the descriptor if it has been set to a valid value.
We take in a possibly partial ID by taking a length and working off of
that to figure out whether to just look up the object or ask the
backends for a prefix lookup.
Unfortunately we've been checking the size against `GIT_OID_HEXSZ` which
is the size of a *string* containing a full ID, whereas we need to check
against the size we can have when it's a 20-byte array.
Change the checks and comment to use `GIT_OID_RAWSZ` which is the
correct size of a git_oid to have when full.
The way we currently do it depends on the subtlety of strlen vs sizeof
and the fact that .pack is one longer than .idx. Let's use a git_buf so
we can express the manipulation we want much more clearly.
`merge_diff_list_count_candidates()` takes pointers to the source and
target counts, but when it comes time to increase them, we're increasing
the pointer, rather than the value it's pointing to.
Dereference the value to increase.
Coverity complains about the git_rawobj ones because we use a loop in
which we keep remembering the old version, and we end up copying our
object as the base, so we want to have the data pointer be NULL.
Let `ssh_stream_free()` take a NULL stream, as free functions should,
and remove the check from the connection setup.
The connection setup would not need the check anyhow, as we always have
a stream by the time we reach this code.
When the callback returns an error, we should stop immediately. This
broke when trying to make sure we pass specific errors up the chain.
This broke cancelling out of the loose backend's foreach.
We've been using `p_ftruncate()` to extend the packfile in order to mmap
it and write the new data into it. This works well in the general case,
but as truncation does not allocate space in the filesystem, it must do
so when we write data to it.
The only way the OS has to indicate a failure to allocate space is via
SIGBUS which means we tried to write outside the file. This will cause
everyone to crash as they don't expect to handle this signal.
Switch to using `p_lseek()` and `p_write()` to extend the file in a way
which tells the filesystem to allocate the space for the missing
data. We can then be sure that we have space to write into.
We use heuristics to make a decent guess at when we can save time and
space by linking object files during a clone. Unfortunately checking the
device id isn't enough, as those would be the same during e.g. a bind-mount,
but the OS still does not allow us to link between mounts of the same
filesystem.
If we fail to perform the links, fall back to copying the contents into
a new file as a last attempt.
A remote's URLs are now modified according to the url.*.insteadOf
and url.*.pushInsteadOf configurations. This allows a user to
replace URL prefixes by setting the corresponding keys. E.g.
"url.foo.insteadOf = bar" would replace the prefix "bar" with the
new prefix "foo".
Some brain damaged tolower() implementations appear to want to
take the locale into account, and this may require taking some
insanely aggressive lock on the locale and slowing down what should
be the most trivial of trivial calls for people who just want to
downcase ASCII.
Treat input bytes as unsigned before doing arithmetic on them,
lest we look at some non-ASCII byte (like a UTF-8 character) as a
negative value and perform the comparison incorrectly.
We do not error on "merge conflicts"; on the contrary, merge conflicts
are a normal part of merging. We only error on "checkout conflicts",
where a change exists in the index or the working directory that would
otherwise be overwritten by performing the checkout.
This *may* happen during merge (after the production of the new index
that we're going to checkout) but it could happen during any checkout.
If there exists a conflict in the index, but no file in the working
directory, this implies that the user wants to accept the resolution
by removing the file. Thus, remove the conflict entry from the
index, instead of trying to add a (nonexistent) file.
Mark the `old_file` and `new_file` sides of a delta with a new bit,
`GIT_DIFF_FLAG_EXISTS`, that introduces that a particular side of
the delta exists in the diff.
This is useful for indicating whether a working directory item exists
or not, in the presence of a conflict. Diff users may have previously
used DELETED to determine this information.
It's not always obvious the mapping between stage level and
conflict-ness. More importantly, this can lead otherwise sane
people to write constructs like `if (!git_index_entry_stage(entry))`,
which (while technically correct) is unreadable.
Provide a nice method to help avoid such messy thinking.
Since a diff entry only concerns a single entry, zero the information
for the index side of a conflict. (The index entry would otherwise
erroneously include the lowest-stage index entry - generally the
ancestor of a conflict.)
Test that during status, the index side of the conflict is empty.
When diffing against an index, return a new `GIT_DELTA_CONFLICTED`
delta type for items that are conflicted. For a single file path,
only one delta will be produced (despite the fact that there are
multiple entries in the index).
Index iterators now have the (optional) ability to return conflicts
in the index. Prior to this change, they would be omitted, and callers
(like diff) would omit conflicted index entries entirely.
Wrap the iterator current / advance functions so that we can extend
them, but also handle GIT_ITEROVER cases in the iterator funcs
instead of the callers.
When we moved from acting on the instance to acting on the
configuration, we dropped the validation of the passed refspec, which
can lead to writing an invalid refspec to the configuration. Bring that
validation back.
The public key field is optional and as such can take NULL. Account for
that and do not call strlen() on NULL values. Also assert() for non-NULL
values of username & private key.
Declare GIT_CREDTYPE_SSH_MEMORY to have consistent API independently of
whether libgit2 was built with or without in-memory key passing support.
Or rather, to have it at all since build-time definitions are not stored
in headers.
When we look for which remote corresponds to a remote-tracking branch,
we look in the refspecs to see which ones matches. If none do, we should
abort. We currently ignore the error message from this operation, so
let's not do that anymore.
As part of the test we're writing, let's test for the expected behaviour
if we cannot find a refspec which tells us what the remote-tracking
branch for a remote would look like.
When we find out that we're dealing with a matching refspec, we set the
flag and return immediately. This leaves the strings as NULL, which
breaks the contract.
Assign these pointers to a string with the correct values.
When we discover that we want to keep a negative rule, make sure to
clear the error variable, as it we otherwise return whatever was left by
the previous loop iteration.
This can be used by tools to show mesages about failing to communicate
with the server. The error message in this case will often contain the
server's error message, as far as it managed to send anything.
When we fail to read from stdout, it's typically because the URL was
wrong and the server process has sent some output over its stderr
output.
Read that output and set the error message to whatever we read from it.
We set an error if we get an error when reading, but we don't bother
setting an error message for write failing. This causes a cryptic error
to be shown to the user when the target filesystem is full.
These were left over from the culling as it's not clear which use-cases
might benefit from this. It is not clear that we want to support any
use-case which depends on changing the remote's idea of the base
refspecs rather than passing in different per-operation refspec list, so
remove these functions.
This function deals with functions doing IO which means the amount of
errors that can happen is quit large. It does not help if it always
ovewrites the underlying error message with a less understandable
version of "something went wrong".
Instead, only use this generic message if there was no error set by the
callback.
Instead of going through each entry we have and re-adding, which may not
even be correct for certain crlf options and has bad performance, use
the function which performs a diff against the worktree and try to add
and remove files from that list.
We currently iterate over all the entries and re-add them to the
index. While this provides correctness, it is wasteful as we try to
re-insert files which have not changed.
Instead, take a diff between the index and the worktree and only re-add
those which we already know have changed.
This is useful to send to the client while we're performing the work.
The reporting function has a force parameter which makes sure that we
do send out the message of 100% completed, even if this comes before the
next udpate window.
Instead of copying each object individually, as we'd been doing, use the
packbuilder which should be faster and give us some feedback.
While performing this change, we can hook up the packbuilder's writing
to the push progress so the caller knows how far along we are.
We currently first look in the loose object dir and then in the packs
for objects. When performing operations on recent history this has a
higher likelihood of hitting, but when we deal with operations which
look further back into the past, we start spending a large amount of
time getting ENOTENT from `access`.
Reversing the priorities means that long-running operations can get to
their objects faster, as we can look at the index data we have in memory
(or rather mapped) to figure out whether we have an object, which is
faster than going out to the filesystem.
The packed backend already implements an optimistic read algorithm by
first looking at the packs we know about and only going out to disk to
referesh if the object is not found which means that in the case where
we do have the object (which will be in the majority for anything that
traverses the graph) we can avoid going to to disk entirely to determine
whether an object exists.
Operations which look at recent history may take a slight impact, but
these would be operations which look a lot less at object and thus take
less time regardless.
The base refspecs changing can be a cause of confusion as to what is the
current base refspec set and complicate saving the remote's
configuration.
Change `git_remote_add_{fetch,push}()` to update the configuration
instead of an instance.
This finally makes `git_remote_save()` a no-op, it will be removed in a
later commit.
While this will rarely be different from the default, having it in the
remote adds yet another setting it has to keep around and can affect its
behaviour. Move it to the options.
Instead of having it set in a different place from every other callback,
put it the main structure. This removes some state from the remote and
makes it behave more like clone, where the constructors are passed via
the options.
As a first step in removing the repository-saving logic, don't allow
chaning the url or push url from a remote object, but change the
configuration on the configuration immediately.
Having the setting be different from calling its actions was not a great
idea and made for the sake of the wrong convenience.
Instead of that, accept either fetch options, push options or the
callbacks when dealing with the remote. The fetch options are currently
only the callbacks, but more options will be moved from setters and
getters on the remote to the options.
This does mean passing the same struct along the different functions but
the typical use-case will only call git_remote_fetch() or
git_remote_push() and so won't notice much difference.
The push object knows which remote it's associated with, and therefore
does not need to keep its own copy of the callbacks stored in the
remote.
Remove the copy and simply access the callbacks struct within the
remote.
Restricting files to size_t is a silly limitation. The loose backend
writes to a file directly, so there is no issue in using 63 bits for the
size.
We still assume that the header is going to fit in 64 bytes, which does
mean quite a bit smaller files due to the run-length encoding, but it's
still a much larger size than you would want Git to handle.
When handling attr matching, simply compare the directory path where the
attribute file resides to the path being matched. Skip over commonality
to allow us to compare the contents of the attribute file to the remainder
of the path.
This allows us to more easily compare the pattern directly to the path,
instead of trying to guess whether we want to compare the path's basename
or the full path based on whether the match was inside a containing
directory or not.
This also allows us to do fewer translations on the pattern (trying to
re-prefix it.)
When determining whether some file matches an attr pattern, do
not try to truncate the path to pass to fnmatch. When there is
no containing directory for an item (eg, from a .gitignore in the
root) this will cause us to truncate our path, which means that
we cannot do meaningful comparisons on it and we may have false
positives when trying to determine whether a given file is actually
a file or a folder (as we have lost the path's base information.)
This mangling was to allow fnmatch to compare a directory on disk to
the name of a directory, but it is unnecessary as our fnmatch accepts
FNM_LEADING_DIR.
We use a blocking socket and set the mode to AUTO_RETRY which means that
`SSL_write` and `SSL_read` will only return once the read or write has
been completed. We therefore don't need to handle partial writes or
re-try read due to a regenotiation.
While here, consider that a zero also indicates an error condition.
When writing a configuration file, we want to take a lock on the
new file (eg, `config.lock`) before opening the configuration file
(`config`) for reading so that we can prevent somebody from changing
the contents underneath us.
When updating a configuration file, we want to copy the old data
from the file to preserve comments and funny whitespace, instead
of writing it in some "canonical" format. Thus, we keep a
pointer to the start of the line and the line length to preserve
these things we don't care to rewrite.
Previously we would try to be clever when writing the configuration
file and try to stop parsing (and simply copy the rest of the old
file) when we either found the value we were trying to write,
or when we left the section that value was in, the assumption being
that there was no more work to do.
Regrettably, you can have another section with the same name later
in the file, and we must cope with that gracefully, thus we read the
whole file in order to write a new file.
Now, writing a file looks even more than reading. Pull the config
parsing out into its own function that can be used by both reading
and writing the configuration.
When checking out with a case-insensitive working directory, we
want to change the case of items in the working directory to
reflect changes that occured in the checkout target. Diff now
has an option to break case-changing renames into delete/add.
Using FindFirstFile and FindNextFile in win32 allows us to
use the directory information that is returned, instead of
us having to get the file attributes all over again, which
is a distinct cost savings on win32.
The _next method shouldn't take a path pointer (and a path_len
pointer) as 100% of current users use the full path and ignore
the filename.
Plus let's add some docs and a unit test.
Changed win32/path_w32.c to utilize NTFS' FindFirst..FindNext data instead of doing an lstat per file. Avoiding unnecessary directory opens and file scans reduces IO, improving overall performance. Effect is magnified due to NTFS being a kernel mode file system (as opposed to user mode).
In checkout.c and filter.c we were casting a sub struct
to a parent struct which breaks the strict aliasing rules
in C. However we can use .parent or .base to access the
parent struct to avoid the build warnings.
In remote.c the local variable error was not initialized
or updated in some cases. For unintialized error a build
warning will be generated. So always keep error variable
up-to-date.
Do not automatically fail on a bad certificate, but let the caller
decide. This means we don't need our switch on errors anymore but can
return a string representation from the security framework.
If git_config_delete is to work properly in the presence of duplicate section
headers, it cannot stop searching at the end of the first matching section, as
there may be another matching section later.
When config_write is used for deletion (value = NULL), it may only terminate
when the desired key is found or there are no sections left to parse.
The idea...sometimes, a filemode is user-specified via an
explicit git_index_entry. In this case, believe the user, always.
Sometimes, it is instead built up by statting the file system. In
those cases, go with the existing logic we have to determine
whether the file system supports all filemodes and symlinks, and
make the best guess.
On file systems which have full filemode and symlink support, this
commit should make no difference. On others (most notably Windows),
this will fix problems things like:
* git_index_add and git_index_add_frombuffer() should be believed.
* As a consequence, git_checkout_tree should make the filemodes in
the index match the ones in the tree.
* And diffs with GIT_DIFF_UPDATE_INDEX don't write the wrong filemodes.
* And merges, and probably other downstream stuff now fixed, too.
This makes my previous changes to checkout.c unnecessary,
so they are now reverted.
Also, added a test for index_entry permissions from git_index_add
and git_index_add_frombuffer, both of which failed before these changes.
In `git_rebase_operation_current()`, indicate when a rebase has not
started (with `GIT_REBASE_NO_OPERATION`) rather than conflating that
with the first operation being in-progress.
It can be useful for the caller to know which update commands will be
sent to the server before the packfile is pushed up. git does this via
the pre-push hook.
We don't have hooks, but as it adds introspection into what is
happening, we can add a callback which performs the same function.
In the prior implementation, enabling GIT_DIFF_UPDATE_INDEX would overwrite
entries in the index with the ones generated from scanning the working if the
OID was the same.
Because this OID comparison ignores file modes, this means an file in the
workdir with only an exec bit difference with the one in the index would end
up being overwritten, resulting in the exec bit being loss. There might be
other related bugs but the fix of comparing OIDs and file modes should
address them all.
When walking backwards and marking parents uninteresting, make sure we
detect when the list of commits we have left has run out of
uninteresting commits so we can stop marking commits as
uninteresting. Failing to do so can mean that we walk the whole history
marking everything uninteresting, which eats up time, CPU and IO for
with useless work.
While pre-marking does look for this, we still need to check during the
main traversal as there are setups for which pre-marking does not leave
enough information in the commits. This can happen if we push a commit
and hide its parent.
The regcomp function returns a non-zero value if compilation of
a regular expression fails. In most places we only check for
negative values, but positive values indicate an error, as well.
Fix this tree-wide, fixing a segmentation fault when calling
git_config_iterator_glob_new with an invalid regexp.
When a commit is first set as unintersting and then pushed, we must take
care that we do not put it into the commit list as that makes us return
at least that commit (but maybe more) as we've inserted it into the list
because we have the assumption that we want anything in the commit list.
When no reference names could be found we did error out when trying to describe
a commit. This is wrong, though, when the option to fall back to a commit's
object ID is set.
git_checkout_tree() has some fallback behaviors for file systems
which don't have full support of filemodes. Generally works fine,
but if a given file had a change of type from a 0644 to 0755 (i.e.,
you add executable permissions), the fallback behavior incorrectly
triggers when writing hte updated index.
This would cause a git_checkout_tree() command, even with the
GIT_CHECKOUT_FORCE option set, to leave a dirty index on Windows.
Also added checks to an existing test to catch this case.
When diffs are generated, the value for the 'nfiles' field of 'git_diff_delta'
will be consistent with the value in the 'status' field. Merging diffs can
modify the 'status' field of some deltas and the 'nfiles' field needs to be
updated accordingly.
When we insert e.g. a tag or tagged object into the packfile, we must
make sure to insert any referenced objects as well, or we will have
broken links.
Use the recursive version of packfile insertion to make sure we send
over not just the tagged object but also the objects it references.
This function recursively inserts the given object and any referenced
ones. It can be thought of as a more general version of the functions to
insert a commit or tree.
When the user has a certificate check callback set, we still have to
check whether the stream we're using is even capable of providing a
certificate.
In the case of an unencrypted certificate, do not ask for it from the
stream, and do not call the callback.
This extra constructor will be useful for the annotated versions of
ref-modifying functions, as it allows us to create a commit with the
extended sha syntax which was used to retrieve it.
We do not always want to put the id directly into the reflog, but we
want to speicfy what a user typed. For this use-case we provide
annotated version of a few functions which let the caller specify what
user-friendly name was used when asking for the operation.
It turns out that erroring out on duplicate commits is the right thing
to do, but git was not hitting the bug on the server-side.
Bring back a descriptive error message in case of duplicate entries and
error out.
If a packfile includes duplicate objects, we can choose to use the
secon copy instead of the first by using the same logic as if it were
the first.
Change the error condition from 0 to -1, which indicates a bad resize,
and set the OOM message in that case.
This does mean we will leak the first copy of the object. We can deal
with that later, but making fetches work is more important.
While this is not even close to a fix, we can at least set an error
message so we know which error we are facing. Up to know we just
returned an error without a message.
Currently git_submodule_sync writes the submodule's URL to the
key 'branch.<REMOTE_NAME>.remote' while the reference
implementation of `git submodule sync` writes to
'remote.<REMOTE_NAME>.url', which is the intended behavior
according to git-submodule(1).
The current implementation does not set 'fire_callback' back to 0 for failed updates so the callback still fires.
Instead of adding yet another condition check to set 'fire_callback' to 0 if needed, considering this function should be a no-op for failed updates anyway, the best fix is to simplify its logic to check upfront if the update is a failed one.
The default behaviour for the packbuilder is to perform the work in a
single thread, which is fine for the public API, but we currently have
no way for a user to determine the number of threads to use when
creating the packfile, which makes our clone behaviour over the
filesystem quite a bit slower than what git offers.
This is a very particular scenario, in which we avoid spawning git by
being ourselves the server-side, so it's probably ok to auto-set the
threading, as the upload-pack process would do if we were talking to
git.
Currently we use the most naïve and inefficient method for figuring out
which objects to send to the remote whereby we end up trying to insert
subdirs which have not changed multiple times.
Instead, make use of the packbuilder's built-in more efficient method
which uses the walk to feed the object list and avoids inserting an
object and its descendants.
Most use-cases for the object packer communicate in terms of commits
which each side has. We already have an object to specify this
relationship between commits, namely git_revwalk.
By knowing which commits we want to pack and which the other side
already has, we can perform similar optimisations to git, by marking
each tree as interesting or uninteresting only once, and not sending
those trees which we know the other side has.
Keep the definitions in the headers, while putting the declarations in
the C files. Putting the function definitions in headers causes
them to be duplicated if you include two headers with them.
If a refspec could not be parsed, the git_refspec__parse function would
return an error value but would not provide additional error information
for the callers. This commit amends that function to set a more useful
error message.
When we rename a reference, we want the old and new ids to be the same
one (as we did not change it). The normal code path looks up the old id
from the current value of the brtanch, but by the time we look it up, it
does not exist anymore and thus we write a zero id.
Pass the old id explicitly instead.
Clear the error message on git_libgit2_shutdown for all versions of
the library (no threads and Win32 threads). Drop the giterr_clear
in clar, as that shouldn't be necessary.
This changes the get_entry() method to return a refcounted version of
the config entry, which you have to free when you're done.
This allows us to avoid freeing the memory in which the entry is stored
on a refresh, which may happen at any time for a live config.
For this reason, get_string() has been forbidden on live configs and a
new function get_string_buf() has been added, which stores the string in
a git_buf which the user then owns.
The functions which parse the string value takea advantage of the
borrowing to parse safely and then release the entry.
The user may decide to return any type of credential, including ones we
did not say we support. Add a check to make sure the user returned an
object of the right type and error out if not.
We want to use the "checkout: moving from ..." message in order to let
git know when a change of branch has happened. Make the convenience
functions for this goal write this message.
This namespace is about behaving like git's branch command, so let's do
exactly that instead of taking a reflog message.
This override is still available via the reference namespace.
The signature for the reflog is not something which changes
dynamically. Almost all uses will be NULL, since we want for the
repository's default identity to be used, making it noise.
In order to allow for changing the identity, we instead provide
git_repository_set_ident() and git_repository_ident() which allow a user
to override the choice of signature.
Win32 DLLs have four fields for the version number (major, minor,
teeny, patch). If a consumer wants to build a custom DLL, it may
be useful to set the patchlevel version number in the DLL.
This value only affects the DLL version number, it does not affect
the resultant "version number", which remains major.minor.teeny.
Windows headers #define some names that openssl uses too. Openssl
headers #undef the offending names before reusing them. But if those
offending Windows headers get included after the openssl headers the
namespace is polluted and nothing good happens.
Fixes issue #2850.
When the repository does not contain an index, emulate git's behavior
and upgrade to `SAFE_CREATE`. This allows us to check out repositories
created with `git clone --no-checkout`.
A repository can have multiple "reserved names" now, not just
a single "short name" for the repository folder itself. Refactor
to include a git_repository__reserved_names that returns all the
reserved names for a repository.
git_index_add_frombuffer enables now to store a memory buffer in the odb
and to store an entry in the index directly if the index is attached to a
repository.
Provide a convenience function that creates a buffer that can be provided
to callers but will not be freed via `git_buf_free`, so the buffer
creator maintains the allocation lifecycle of the buffer's contents.