When we insert e.g. a tag or tagged object into the packfile, we must
make sure to insert any referenced objects as well, or we will have
broken links.
Use the recursive version of packfile insertion to make sure we send
over not just the tagged object but also the objects it references.
This function recursively inserts the given object and any referenced
ones. It can be thought of as a more general version of the functions to
insert a commit or tree.
When the user has a certificate check callback set, we still have to
check whether the stream we're using is even capable of providing a
certificate.
In the case of an unencrypted certificate, do not ask for it from the
stream, and do not call the callback.
This extra constructor will be useful for the annotated versions of
ref-modifying functions, as it allows us to create a commit with the
extended sha syntax which was used to retrieve it.
We do not always want to put the id directly into the reflog, but we
want to speicfy what a user typed. For this use-case we provide
annotated version of a few functions which let the caller specify what
user-friendly name was used when asking for the operation.
It turns out that erroring out on duplicate commits is the right thing
to do, but git was not hitting the bug on the server-side.
Bring back a descriptive error message in case of duplicate entries and
error out.
If a packfile includes duplicate objects, we can choose to use the
secon copy instead of the first by using the same logic as if it were
the first.
Change the error condition from 0 to -1, which indicates a bad resize,
and set the OOM message in that case.
This does mean we will leak the first copy of the object. We can deal
with that later, but making fetches work is more important.
While this is not even close to a fix, we can at least set an error
message so we know which error we are facing. Up to know we just
returned an error without a message.
Currently git_submodule_sync writes the submodule's URL to the
key 'branch.<REMOTE_NAME>.remote' while the reference
implementation of `git submodule sync` writes to
'remote.<REMOTE_NAME>.url', which is the intended behavior
according to git-submodule(1).
The current implementation does not set 'fire_callback' back to 0 for failed updates so the callback still fires.
Instead of adding yet another condition check to set 'fire_callback' to 0 if needed, considering this function should be a no-op for failed updates anyway, the best fix is to simplify its logic to check upfront if the update is a failed one.
The default behaviour for the packbuilder is to perform the work in a
single thread, which is fine for the public API, but we currently have
no way for a user to determine the number of threads to use when
creating the packfile, which makes our clone behaviour over the
filesystem quite a bit slower than what git offers.
This is a very particular scenario, in which we avoid spawning git by
being ourselves the server-side, so it's probably ok to auto-set the
threading, as the upload-pack process would do if we were talking to
git.
Currently we use the most naïve and inefficient method for figuring out
which objects to send to the remote whereby we end up trying to insert
subdirs which have not changed multiple times.
Instead, make use of the packbuilder's built-in more efficient method
which uses the walk to feed the object list and avoids inserting an
object and its descendants.
Most use-cases for the object packer communicate in terms of commits
which each side has. We already have an object to specify this
relationship between commits, namely git_revwalk.
By knowing which commits we want to pack and which the other side
already has, we can perform similar optimisations to git, by marking
each tree as interesting or uninteresting only once, and not sending
those trees which we know the other side has.
Keep the definitions in the headers, while putting the declarations in
the C files. Putting the function definitions in headers causes
them to be duplicated if you include two headers with them.
If a refspec could not be parsed, the git_refspec__parse function would
return an error value but would not provide additional error information
for the callers. This commit amends that function to set a more useful
error message.
When we rename a reference, we want the old and new ids to be the same
one (as we did not change it). The normal code path looks up the old id
from the current value of the brtanch, but by the time we look it up, it
does not exist anymore and thus we write a zero id.
Pass the old id explicitly instead.
Clear the error message on git_libgit2_shutdown for all versions of
the library (no threads and Win32 threads). Drop the giterr_clear
in clar, as that shouldn't be necessary.
This changes the get_entry() method to return a refcounted version of
the config entry, which you have to free when you're done.
This allows us to avoid freeing the memory in which the entry is stored
on a refresh, which may happen at any time for a live config.
For this reason, get_string() has been forbidden on live configs and a
new function get_string_buf() has been added, which stores the string in
a git_buf which the user then owns.
The functions which parse the string value takea advantage of the
borrowing to parse safely and then release the entry.
The user may decide to return any type of credential, including ones we
did not say we support. Add a check to make sure the user returned an
object of the right type and error out if not.
We want to use the "checkout: moving from ..." message in order to let
git know when a change of branch has happened. Make the convenience
functions for this goal write this message.
This namespace is about behaving like git's branch command, so let's do
exactly that instead of taking a reflog message.
This override is still available via the reference namespace.
The signature for the reflog is not something which changes
dynamically. Almost all uses will be NULL, since we want for the
repository's default identity to be used, making it noise.
In order to allow for changing the identity, we instead provide
git_repository_set_ident() and git_repository_ident() which allow a user
to override the choice of signature.
Win32 DLLs have four fields for the version number (major, minor,
teeny, patch). If a consumer wants to build a custom DLL, it may
be useful to set the patchlevel version number in the DLL.
This value only affects the DLL version number, it does not affect
the resultant "version number", which remains major.minor.teeny.
Windows headers #define some names that openssl uses too. Openssl
headers #undef the offending names before reusing them. But if those
offending Windows headers get included after the openssl headers the
namespace is polluted and nothing good happens.
Fixes issue #2850.
When the repository does not contain an index, emulate git's behavior
and upgrade to `SAFE_CREATE`. This allows us to check out repositories
created with `git clone --no-checkout`.
A repository can have multiple "reserved names" now, not just
a single "short name" for the repository folder itself. Refactor
to include a git_repository__reserved_names that returns all the
reserved names for a repository.
git_index_add_frombuffer enables now to store a memory buffer in the odb
and to store an entry in the index directly if the index is attached to a
repository.
Provide a convenience function that creates a buffer that can be provided
to callers but will not be freed via `git_buf_free`, so the buffer
creator maintains the allocation lifecycle of the buffer's contents.
Add structures and preliminary functions to take a buffer, file or
blob and write the contents in chunks through an arbitrary number
of chained filters, finally writing into a user-provided function
accept the contents.
Increment refcount of newly added cache entries just like existing
entries looked up from the cache. Otherwise the new entry can be
evicted from the cache and destroyed while it's still in use.
Always lock the index when we begin the merge, before we write
any of the metdata files. This prevents a race where another
client may run a commit after we have written the MERGE_HEAD but
before we have updated the index, which will produce a merge
commit that is treesame to one parent. The merge will finish and
update the index and the resultant commit would not be a merge at
all.
Introduce `git_indexwriter`, to allow us to lock the index while
performing additional operations, then complete the write (or abort,
unlocking the index).
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler instrinsics on platforms
that support it. This means dropping the ability to pass `NULL` as
an out parameter.
As a result, the macros also get updated to reflect this as well.
Win32 generally ignores Unix-like mode bits that don't make any
sense on the platform (eg `0644` makes no sense to Windows). But
WINE complains loudly when presented with POSIXy bits. Remove them.
(Thanks @phkelley)
Use `size_t` to hold the size of arrays to ease overflow checking,
lest we check for overflow of a `size_t` then promptly truncate
by packing the length into a smaller type.
Introduce git__reallocarray that checks the product of the number
of elements and element size for overflow before allocation. Also
introduce git__mallocarray that behaves like calloc, but without the
`c`. (It does not zero memory, for those truly worried about every
cycle.)
Fixes#2869. If included file includes more files, it may reallocate
cfg_file->readers, hence invalidate not only `r` pointer, but `result`
pointer as well.
`p_stat` calls `git_win32_path_from_utf8`, which canonicalizes the
path. Do not further try to modify the path, else we trim the
trailing slash from a root directory and try to access `C:` instead
of `C:/`.
During checkout, assume that the .gitattributes files aren't
modified during the checkout. Instead, create an "attribute session"
during checkout. Assume that attribute data read in the same
checkout "session" hasn't been modified since the checkout started.
(But allow subsequent checkouts to invalidate the cache.)
Further, cache nonexistent git_attr_file data even when .gitattributes
files are not found to prevent re-scanning for nonexistent files.
pathspec_match_free() should not dereference a NULL passed to it.
I found this issue when I tried to run example log program with
nonexistent branch:
./example/log help
Such call leads to segmentation fault.
This fixes the build at least on FreeBSD, where those types were not
defined indirectly:
src/openssl_stream.c💯18: error: variable has incomplete type 'struct in6_addr'
struct in6_addr addr6;
^
src/openssl_stream.c💯9: note: forward declaration of 'struct in6_addr'
struct in6_addr addr6;
^
src/openssl_stream.c:111:18: error: use of undeclared identifier 'AF_INET'
if (p_inet_pton(AF_INET, host, &addr4)) {
^
src/unix/posix.h:31:40: note: expanded from macro 'p_inet_pton'
^
src/openssl_stream.c:115:18: error: use of undeclared identifier 'AF_INET6'
if(p_inet_pton(AF_INET6, host, &addr6)) {
^
src/unix/posix.h:31:40: note: expanded from macro 'p_inet_pton'
^
This is supported by the underlying set_index() implementation
and setting the repository index to NULL is recommended by the
git_repository_set_bare() documentation.
On case insensitive filesystems, we may have files in the working
directory that case fold to a name we want to write. Remove those
files (by default) so that we will not end up with a filename that
has the unexpected case.
The documentation for `git_path_join_unrooted` states that the base
length will be returned, so that consumers like checkout know where
to start creating directories instead of always creating directories
at the directory root.
The implementation of the hashsig API disallows computing a signature on
small files containing only a few lines. This new flag disables this
behavior.
git_diff_find_similar() sets this flag by default which means that rename
/ copy detection of small files will now work. This in turn affects the
behavior of the git_status and git_blame APIs which will now detect rename
of small files assuming the right options are passed.
git_remote_download() must also clear the internal push state resulting from a possible earlier push operation. Otherwise calling git_remote_update_tips() will execute the push version instead of the fetch version and among other things, tags won't be updated.
The clock_gettime() function is normally not available under
AmigaOS, hence another solution is required. We are using now
GetUpTime() that is present in current versions of this
operating system.
The openssl setup function needs to be GIT_EXPORT'ed, be sure
to include the `sys/openssl.h` header so that it is appropriately
decorated as an export function.
A byte value of 0x85 is not whitespace, we were conflating that with
U+0085 (UTF8: 0xc2 0x85). This caused us to incorrectly treat valid
multibyte characters like U+88C5 (UTF8: 0xe8 0xa3 0x85) as whitespace.
On a case-insensitive filesystem, we need to deal with case-changing
renames (eg, foo -> FOO) by removing the old and adding the new,
exactly as if we were on a case-sensitive filesystem.
Update the `checkout::tree::can_cancel_checkout_from_notify` test, now
that notifications are always sent case sensitively.
For the REUC and NAME entries, we use size_t internally, and we take
size_t for the get_byindex() functions, but the entrycount() functions
strangely cast to an unsigned int instead.
This introduces the functionality of submodule update in
'git_submodule_do_update'. The existing 'git_submodule_update' function is
renamed to 'git_submodule_update_strategy'. The 'git_submodule_update'
function now refers to functionality similar to `git submodule update`,
while `git_submodule_update_strategy` is used to get the configured value
of submodule.<name>.update.
Path validation may be influenced by `core.protectHFS` and
`core.protectNTFS` configuration settings, thus treebuilders
can take a repository to influence their configuration.
HFS filesystems ignore some characters like U+200C. When these
characters are included in a path, they will be ignored for the
purposes of comparison with other paths. Thus, if you have a ".git"
folder, a folder of ".git<U+200C>" will also match. Protect our
".git" folder by ensuring that ".git<U+200C>" and friends do not match it.
Disallow:
1. paths with trailing dot
2. paths with trailing space
3. paths with trailing colon
4. paths that are 8.3 short names of .git folders ("GIT~1")
5. paths that are reserved path names (COM1, LPT1, etc).
6. paths with reserved DOS characters (colons, asterisks, etc)
These paths would (without \\?\ syntax) be elided to other paths - for
example, ".git." would be written as ".git". As a result, writing these
paths literally (using \\?\ syntax) makes them hard to operate with from
the shell, Windows Explorer or other tools. Disallow these.
When turning UTF-8 paths into UCS-2 paths for Windows, always use
the \\?\-prefixed paths. Because this bypasses the system's
path canonicalization, handle the canonicalization functions ourselves.
We must:
1. always use a backslash as a directory separator
2. only use a single backslash between directories
3. not rely on the system to translate "." and ".." in paths
4. remove trailing backslashes, except at the drive root (C:\)
This option does not get persisted to disk, which makes it different
from the rest of the setters. Remove it until we go all the way.
We still respect the configuration option, and it's still possible to
perform a one-time prune by calling the function.
For each remote-tracking branch we want to remove, we need to consider
it against every other refspec in case we have overlapping refspecs,
such as with
refs/heads/*:refs/remotes/origin/*
refs/pull/*/head:refs/remotes/origin/pr/*
as we'd otherwise remove too many refspecs.
Create a list of condidates, which are the references matching the rhs
of any active refspec and then filter that list by removing those
entries for which we find a remove reference with any active
refspec. Those which are left after this are removed.
Most of the network-facing facilities have been copied to the socket and
openssl streams. No code now uses these functions directly anymore, so
we can now remove them.
Having an ssh stream would require extra work for stream capabilities we
don't need anywhere else (oob auth and command execution) so for now
let's move away from the gitno connection to use socket_stream.
We can introduce an ssh stream interface if and as we need it.
This unfortunately isn't as stackable as could be possible, as it
hard-codes the socket stream. This is because the method of using a
custom openssl BIO is not clear, and we do not need this for now. We can
still bring this in if and as we need it.
We currently have gitno for talking over TCP, but this needs to know
about both plaintext and OpenSSL connections and the code has gotten
somewhat messy with ifdefs determining which version of the function
should be called.
In order to clean this up and abstract away the details of sending over
the different types of streams, we can instead use an interface and
stack stream implementations.
We may not be able to use the stackability with all streams, but we
are definitely be able to use the abstraction which is currently spread
between different bits of gitno.