Previous to OpenSSL version 1.1, the user had to initialize at least the error
strings as well as the SSL algorithms by himself. OpenSSL version 1.1 instead
provides a new function `OPENSSL_init_ssl`, which handles initialization of all
subsystems. As the new API call will by default load error strings and
initialize the SSL algorithms, we can safely replace these calls when compiling
against version 1.1 or later.
This fixes a compiler error when compiling against OpenSSL version 1.1 which has
been built without stubs for deprecated syntax.
Up to version 1.0, OpenSSL required us to provide a callback which implements
a locking mechanism. Due to problems in the API design though this mechanism was
inherently broken, especially regarding that the locking callback cannot report
errors in an obvious way. Due to this shortcoming, the locking initialization
has been completely removed in OpenSSL version 1.1. As the library has also been
refactored to not make any use of these callback functions, we can safely remove
all initialization of the locking subsystem if compiling against OpenSSL version
1.1 or higher.
This fixes a compilation error when compiling against OpenSSL version 1.1 which
has been built without stubs for deprecated syntax.
In the function `git_filter_list_stream_data`, we initialize, write and
subesquently close the stream which should receive content processed by
the filter. While we skip writing to the stream if its initialization
failed, we still try to close it unconditionally -- even if the
initialization failed, where the stream might not be set at all, leading
us to segfault.
Semantics in this code is not really clear. The function handling the
same logic for files instead of data seems to do the right thing here in
only closing the stream when initialization succeeded. When stepping
back a bit, this is only reasonable: if a stream cannot be initialized,
the caller would not expect it to be closed again. So actually, both
callers of `stream_list_init` fail to do so. The data streaming function
will always close the stream and the file streaming function will not
close the stream if writing to it has failed.
The fix is thus two-fold:
- callers of `stream_list_init` now close the stream iff it has been
initialized
- `stream_list_init` now closes the lastly initialized stream if
the current stream in the chain failed to initialize
Add a test which segfaulted previous to these changes.
Whenever we rename a branch, we update the repository's symbolic HEAD
reference if it currently points to the branch that is to be renamed.
But with the introduction of worktrees, we also have to iterate over all
HEADs of linked worktrees to adjust them. Do so.
While we already provide functions to get the current repository's HEAD,
it is quite involved to iterate over HEADs of both the repository and
all linked work trees. This commit implements a function
`git_repository_foreach_head`, which accepts a callback which is then
called for all HEAD files.
The functions `git_repository_head_for_worktree` and
`git_repository_detached_head_for_worktree` both implement their
own logic to read the HEAD reference file. Use the new function
`git_reference__read_head` instead to unify the code paths.
The function `read_worktree_head` has the logic embedded to construct
the path to `HEAD` in the work tree's git directory, which is quite
useful for other callers. Extract the logic into its own function to
make it reusable by others.
If trying to set the HEAD of a repository to another reference, we have
to check whether this reference is already checked out in another linked
work tree. If it is, we will refuse setting the HEAD and return an
error, but do not set a meaningful error message. Add one.
Currently, we only provide functions to read references directly from a
repository's reference store via e.g. `git_reference_lookup`. But in
some cases, we may want to read files not connected to the current
repository, e.g. when looking up HEAD of connected work trees. This
commit implements `git_reference__read_head`, which will read out and
allocate a reference at an arbitrary path.
When executing `git_futils_mmap_ro_file`, we first try to guess whether
the file is mmapable at all. Part of this check is whether the file is
too large to be mmaped, which can be true on systems with 32 bit
`size_t` types.
The check is performed by first getting the file size wtih
`git_futils_filesize` and then checking whether the returned size can be
represented as `size_t`, returning an error if so. While this test also
catches the case where the function returned an error (as `-1` is not
representable by `size_t`), we will set the misleading error message
"file too large to mmap". But in fact, a negative return value from
`git_futils_filesize` will be caused by the inability to fstat the file.
Fix the error message by handling negative return values separately and
not overwriting the error message in that case.
Short-circuit the call to `git_path_resolve_relative` in case
`git_buf_joinpath` returns an error. While this does not fix any
immediate errors, the resulting code is easier to read and handles
potential new error conditions raised by `git_buf_joinpath`.
In the `_check_dir_contents` function, we first allocate memory for
joining the directory and subdirectory together and afterwards use
`git_buf_joinpath`. While this function in fact should not fail as
memory is already allocated, err on the safe side and check for returned
errors.
The current code in `parse_section_header_ext` is only prepared to
properly handle out-of-memory conditions for the `git_buf` structure.
While very unlikely and probably caused by a programming error, it is
also possible to run into error conditions other than out-of-memory
previous to reaching the actual parsing loop. In these cases, we will
run into undefined behavior as the `rpos` variable is only initialized
after these triggerable errors, but we use it in the cleanup-routine.
Fix the issue by unifying the function's cleanup code with an
`end_error` section, which will not use the `rpos` variable.
POSIX emulation retries should be configurable so that tests can disable
them. In particular, maniacally threading tests may end up trying to
open locked files and need retries, which will slow continuous
integration tests significantly.
This can prevent FILE_SHARED_VIOLATIONS when used in tools such as TortoiseGit TGitCache and FILE_SHARE_DELETE, because files can be opened w/o being locked any more.
Signed-off-by: Sven Strickroth <email@cs-ware.de>
When `git_repository_set_head` is provided a remote reference, update
the reflog with the tag name, like we do with a branch. This helps
consumers match the semantics of `git checkout remote`.
Provide a macro that will allow us to run a function with posix-like
return values multiple times in a retry loop, with an optional cleanup
function called between invocations.
Introduce mapping from windows error codes to errno values. This
allows us to replace our calls to the Windows posix emulation functions
with calls to the Win32 APIs for more fine-grained control over the
emulation.
These mappings match the Windows CRT's mappings for its posix emulation
as they were described to me.
While writing the tree inside of a buffer, we check whether the buffer
runs out of memory after each tree entry. While we set the error code as
soon as we detect the OOM situation, we happily proceed iterating over
the entries. This is not useful at all, as we will try to write into the
buffer repeatedly, which cannot work.
Fix this by exiting as soon as we are OOM.
The `git_tree_entry *entry` variable is defined twice inside of this
function. While this is not a problem currently, remove the shadowing
variable to avoid future confusion.
While we detect errors in `git_treebuilder_write_with_buffer`, we just
exit directly instead of freeing allocated memory. Fix this by
remembering error codes and skipping forward to the function's cleanup
code.
The recent addition of an error code to `pass_whole_blame` in ff8d2eb15
(blame_git: check return value of object lookup, 2017-03-20) introduced
a spurious goto. Remove it.
The config and attrcache file reading code would attempt to load a file
in a home directory by expanding the `~` and looking for the file, using
`git_sysdir_find_global_file`. If the file was not found, the error
handling would look for the literal path, eg `~/filename.txt`.
Use the new `git_config_expand_global_file` instead, which allows us to
get the path to the file separately, when the path is prefixed with
`~/`, and fail with a not found error without falling back to looking
for the literal path.
Provide a mechanism for callers to expand the full path of a file in the
global configuration directory (that is to say, the home directory) even
if the file doesn't necessarily exist. This lets callers use their own
logic for building paths separate from handling file existence.
`O_EXCL` and `O_TRUNC` are mutually exclusive flags to open(2); you can't
truncate a file if you're asserting that it can't exist in the first place.
Drop `O_TRUNC`.
When `git_repository_set_head` is provided a tag reference, update the
reflog with the tag name, like we do with a branch. This helps
consumers match the semantics of `git checkout tag`.
While parsing patch header lines, we iterate over each line and check if
the line has trailing garbage. What we do not check though is that the
line is actually a line ending with a trailing newline.
Fix this by checking the return code of `parse_advance_expected_str`.
The `pack_entry_find_prefix` function receives a `git_rawobj` structure
as argument. While the function first initializes the structure to a
sensible state, Coverity is unable to correctly detect this, resulting
in a warning.
Fix this warning by initializing the object to all-zeroes before passing
it to the function.
While parsing section headers, we use a buffer to store the actual
section name. We do not check though if the buffer runs out of memory at
any stage. Do so.
The function `pass_whole_blame` performs an object lookup but does not
check if the lookup actually succeeds. Convert the function to return an
error code and check for it in the calling function.
The OpenSSL library may require multiple locks to work correctly, where
it is the caller's responsibility to initialize and release the locks.
While we correctly initialized up to `n` locks, as determined by
`CRYPTO_num_locks`, we were repeatedly freeing the same mutex in our
shutdown procedure.
Fix the issue by freeing locks at the correct index.
The `map_free` functions were not implemented as functions but instead
as macros which also set the map to NULL. While this is most certainly
sensible in most cases, we should prefer the more obvious behavior,
namingly leaving the map pointer intact.
Furthermore, this macro has been refactored incorrectly during the
map-refactorings: the two statements are not actually grouped together
by a `do { ... } while (0)` block, as it is required for macros to
match the behavior of functions more closely. This has led to at least
one subtle nesting error in `pack-objects.c`. The following code block
```
if (pb->object_ix)
git_oidmap_free(pb->object_ix);
```
would be expanded to
```
if (pb->object_ix)
git_oidmap__free(pb->object_ix); pb->object_ix = NULL;
```
which is not what one woudl expect. While it is not a bug here as it
would simply become a no-op, the wrong implementation could lead to bugs
in other occasions.
Fix this by simply removing the macro altogether and replacing it with
real function calls. This leaves the burden of setting the pointer to
NULL afterwards to the caller, but this is actually expected and behaves
like other `free` functions.
We currently call `git_strmap_free` on `checkout_data.mkdir_map` in the
`checkout_data_clear` function. The only thing protecting us from a
double-free is that the `git_strmap_free` function is in fact not a
function, but a macro that also sets the map to NULL.
Remove the second call to `git_strmap_free` and explicitly set the map
member to NULL.
It is possible to specify submodule URLs relative to the repository
location. E.g. having a submodule with URL "../submodule" will look for
the submodule at "repo/../submodule".
With the introduction of worktrees, though, we cannot simply resolve the
URL relative to the repository location itself. If the repository for
which a URL is to be resolved is a working tree, we have to resolve the
URL relative to the parent's repository path. Otherwise, the URL would
change depending on where the working tree is located.
Fix this by special-casing when we have a working tree while getting the
URL base.
References for a repository are usually created inside of its gitdir.
When using worktrees, though, these references are not to be created
inside the worktree gitdir, but instead inside the gitdir of its parent
repository, which is the commondir. Like this, branches will still be
available after the worktree itself has been deleted.
The filesystem refdb currently still creates new references inside of
the gitdir. Fix this and have it create references in commondir.
The three link files "worktree/.git", ".git/worktrees/<name>/commondir"
and ".git/worktrees/<name>/gitdir" should always contain absolute and
resolved paths. Adjust the logic creating new worktrees to first use
`git_path_prettify_dir` before writing out these files, so that paths
are resolved first.
When creating a new worktree, we have to set up the initial data
structures. Next to others, this also includes the HEAD pseudo-ref.
We currently set it to the worktree respectively branch name, which is
actually not fully qualified.
Use the fully qualified branch name instead.
The working tree's parent path should not point to the parent's gitdir,
but to the parent's working directory. Pointing to the gitdir would not
make any sense, as the parent's working directory is actually equal to
both repository's common directory.
Fix the issue.
While we already provide functionality to look up a worktree from a
repository, we cannot do so the other way round. That is given a
repository, we want to look up its worktree if it actually exists.
Getting the worktree of a repository is useful when we want to get
certain meta information like the parent's location, getting the locked
status, etc.
Separate the logic of finding the worktree directory of a repository and
actually opening the working tree's directory. This is a preparatory
step for opening the worktree structure of a repository itself.
When calling `git_submodule_update` on a submodule, we have to retrieve
the ID of the submodule entry in the index. If the function is called on
a submodule which is only partly initialized, the submodule entry may
not be added to the index yet. This leads to an assert when trying to
look up the blob later on.
Fix the issue by checking if the index actually holds the submodule's
ID and erroring out if it does not.
The function `diff_parsed_alloc` allocates and initializes a
`git_diff_parsed` structure. This structure also contains diff options.
While we initialize its flags, we fail to do a real initialization of
its values. This bites us when we want to actually use the generated
diff as we do not se the option's version field, which is required to
operate correctly.
Fix the issue by executing `git_diff_init_options` on the embedded
struct.
In a diff, the shortest possible hunk with a modification (that is, no
deletion) results from a file with only one line with a single character
which is removed. Thus the following hunk
@@ -1 +1 @@
-a
+
is the shortest valid hunk modifying a line. The function parsing the
hunk body though assumes that there must always be at least 4 bytes
present to make up a valid hunk, which is obviously wrong in this case.
The absolute minimum number of bytes required for a modification is
actually 2 bytes, that is the "+" and the following newline. Note: if
there is no trailing newline, the assumption will not be offended as the
diff will have a line "\ No trailing newline" at its end.
This patch fixes the issue by lowering the amount of bytes required.
Now that the `git_diff_foreach` function does not depend on internals of
the `git_patch_generated` structure anymore, we can easily move it to
the actual diff code.
The current logic of `git_diff_foreach` makes the assumption that all
diffs passed in are actually derived from generated diffs. With these
assumptions we try to derive the actual diff by inspecting either the
working directory files or blobs of a repository. This obviously cannot
work for diffs parsed from a file, where we do not necessarily have a
repository at hand.
Since the introduced split of parsed and generated patches, there are
multiple functions which help us to handle patches generically, being
indifferent from where they stem from. Use these functions and remove
the old logic specific to generated patches. This allows re-using the
same code for invoking the callbacks on the deltas.
Under the existing logic, we try to load patch contents differently,
depending on whether the patch files stem from the working directory or
not. But actually, the executed code paths are completely equal to each
other -- so we were always the code despite the condition.
Remove the condition altogether and conflate both code paths.
Don't compute the sha-1 in `git_futils_readbuffer_updated` unless the
checksum was requested. This means that `git_futils_readbuffer` will
not calculate the checksum unnecessarily.
An untracked file in a submodule should not prevent a rebase from
starting. Even if the submodule's SHA is changed, and that file would
conflict with a new tracked file, it's still OK to start the rebase
and discover the conflict later.
Signed-off-by: David Turner <dturner@twosigma.com>
When fsync'ing files, fsync the parent directory in the case where we
rename a file into place, or create a new file, to ensure that the
directory entry is flushed correctly.
Add a custom `O_FSYNC` bit (if it's not been defined by the operating
system`) so that `git_futils_writebuffer` can optionally do an `fsync`
when it's done writing.
We call `fsync` ourselves, even on systems that define `O_FSYNC` because
its definition is no guarantee of its actual support. Mac, for
instance, defines it but doesn't support it in an `open(2)` call.
Introduce a simple counter that `p_fsync` implements. This is useful
for ensuring that `p_fsync` is called when we expect it to be, for
example when we have enabled an odb backend to perform `fsync`s when
writing objects.
Fixes a regression from #4092. This is a crash on 32-bit and I assume that
it doesn't do the right thing on 64-bit either. MSVC emits a warning for this,
but of course, it's easy to get lost among all of the similar 'possible loss
of data' warnings.
Provide a descriptive error message when compiling THREADSAFE on gcc
versions < 4.1. We require the atomic primitives (eg
`__sync_synchronize`) that were introduced in that version.
(Note, clang setes `__GNUC__` but appears to set its version > 4.1.)
Remove useless indirection from `git_attr_cache__init` to
`git_attr_cache__do_init`. The difference is that the
`git_attr_cache__init` macro first checks if the cache is already
initialized and, if so, not call `git_attr_cache__do_init`. But
actually, `git_attr_cache__do_init` already does the same thing and
returns immediately if the cache is already initialized.
Remove the indirection.
When doing an upsert of a file, we used to use `git__compare_and_swap`,
comparing the entry's file which is to be replaced with itself. This can
be more easily formulated by using `git__swap`, which unconditionally
replaces the value.
`snprintf` requires a _format_ but does not require _arguments_ to the
format. eg: `snprintf(buf, 42, "hi")` is perfectly legal. Expand the
macro to match.
Without this, `p_sprintf(buf, 42, "hi")` errors with:
```
error: expected expression
p_snprintf(msg, 42, "hi");
^
src/unix/posix.h:53:34: note: expanded from macro 'p_snprintf'
^
/usr/include/secure/_stdio.h:57:73: note: expanded from macro 'snprintf'
__builtin___snprintf_chk (str, len, 0, __darwin_obsz(str),
__VA_ARGS__)
```
The upstream git.git project currently identifies all references inside
of `refs/bisect/` as well as `HEAD` as per-worktree references. This is
already incorrect and is currently being fixed by an in-flight topic
[1]. The new behavior will be to match all pseudo-references outside of
the `refs/` hierarchy as well as `refs/bisect/`.
Our current behavior is to mark a selection of pseudo-references as
per-worktree, only. This matches more pseudo-references than current
git, but forgets about `refs/bisect/`. Adjust behavior to match the
in-flight topic, that is classify the following references as
per-worktree:
- everything outside of `refs/`
- everything inside of `refs/bisect/`
[1]: <20170213152011.12050-1-pclouds@gmail.com>
When extracting a commit's signature, we first free the object and only
afterwards put its signature contents into the result buffer. This works
in most cases - the free'd object will normally be cached anyway, so we
only end up decrementing its reference count without actually freeing
its contents. But in some more exotic setups, where caching is disabled,
this can definitly be a problem, as we might be the only instance
currently holding a reference to this object.
Fix this issue by first extracting the contents and freeing the object
afterwards only.
The functions `git_commit_header_field` and
`git_commit_extract_signature` both receive buffers used to hand back
the results to the user. While these functions called `git_buf_sanitize`
on these buffers, this is not the right thing to do, as it will simply
initialize or zero-terminate passed buffers. As we want to overwrite
contents, we instead have to call `git_buf_clear` to completely reset
them.
When `git_buf_sanitize` gets called, it converts a buffer with NULL
content to be correctly initialized. This is done by pointing it to
`git_buf__initbuf`. While the method's documentation states this
clearly, it may also lead to the conclusion that it will do the same to
buffers which do _not_ have NULL contents.
Clarify behavior when passing a buffer with non-NULL contents, where
`git_buf_sanitize` will ensure that the contents are `\0`-terminated.
When opening a worktree via the gitdir of its parent repository
we fail to correctly set up the worktree's working directory. The
problem here is two-fold: we first fail to see that the gitdir
actually is a gitdir of a working tree and then subsequently
fail to determine the working tree location from the gitdir.
The first problem of not noticing a gitdir belongs to a worktree
can be solved by checking for the existence of a `gitdir` file in
the gitdir. This file points back to the gitlink file located in
the working tree's working directory. As this file only exists
for worktrees, it should be sufficient indication of the gitdir
belonging to a worktree.
The second problem, that is determining the location of the
worktree's working directory, can then be solved by reading the
`gitdir` file in the working directory's gitdir. When we now
resolve relative paths and strip the final `.git` component, we
have the actual worktree's working directory location.
The `path_repository` variable is actually confusing to think
about, as it is not always clear what the repository actually is.
It may either be the path to the folder containing worktree and
.git directory, the path to .git itself, a worktree or something
entirely different. Actually, the intent of the variable is to
hold the path to the gitdir, which is either the .git directory
or the bare repository.
Rename the variable to `gitdir` to avoid confusion. While at it,
also rename `path_gitlink` to `gitlink` to improve consistency.
If a branch is already checked out in a working tree we are not
allowed to check out that branch in another repository. Introduce
this restriction when setting a repository's HEAD.