Our code to write index entries to disk does not check whether the
entry that is to be written should use prefix compression for the path.
As such, we were overallocating memory and added bogus right-padding
into the resulting index entries. As there is no padding allowed in the
index version 4 format, this should actually result in an invalid index.
Fix this by re-using the newly extracted `index_entry_size` function.
Create a new function `index_entry_size` which encapsulates the logic to
calculate how much space is needed for an index entry, whether it is
simple/extended or compressed/uncompressed. This can later be re-used by
our code writing index entries.
The last written disk entry is currently being written inside of the
function `write_disk_entry`. Make behavior a bit more obviously by
instead setting it inside of `write_entries` while iterating all
entries.
To calculate the path of a compressed index entry, we need to know the
preceding entry's path. While we do actually set the first predecessor
correctly to "", we fail to update this while reading the entries.
Fix the issue by updating `last` inside of the loop. Previously, we've
been passing a double-pointer to `read_entry`, which it didn't update.
As it is more obvious to update the pointer inside the loop itself,
though, we can simply convert it to a normal pointer.
The index version 4 introduced compressed path names for the entries.
From the git.git index-format documentation:
At the beginning of an entry, an integer N in the variable width
encoding [...] is stored, followed by a NUL-terminated string S.
Removing N bytes from the end of the path name for the previous
entry, and replacing it with the string S yields the path name for
this entry.
But instead of stripping N bytes from the previous path's string and
using the remaining prefix, we were instead simply concatenating the
previous path with the current entry path, which is obviously wrong.
Fix the issue by correctly copying the first N bytes of the previous
entry only and concatenating the result with our current entry's path.
When encoding varints to a buffer, we want to remain sure that the
required buffer space does not exceed what is actually available. Our
current check does not do the right thing, though, in that it does not
honor that our `pos` variable counts the position down instead of up. As
such, we will require too much memory for small varints and not enough
memory for big varints.
Fix the issue by correctly calculating the required size as
`(sizeof(varint) - pos)`. Add a test which failed before.
Upstream git.git has changed the way how packfiles are named.
Previously, they were using a hash of the contained object's OIDs, which
has then been changed to use the hash of the complete packfile instead.
See 1190a1acf (pack-objects: name pack files after trailer hash,
2013-12-05) in the git.git repository for more information on this
change.
This commit changes our logic to match the behavior of core git.
To determine if a repository is a worktree or not, we currently check
for the existence of a "gitdir" file inside of the repository's gitdir.
While this is sufficient for non-broken repositories, we have at least
one case of a subtly broken repository where there exists a gitdir file
inside of a gitmodule. This will cause us to misidentify the submodule
as a worktree.
While this is not really a fault of ours, we can do better here by
observing that a repository can only ever be a worktree iff its common
directory and dotgit directory are different. This allows us to make our
check whether a repo is a worktree or not more strict by doing a simple
string comparison of these two directories. This will also allow us to
do the right thing in the above case of a broken repository, as for
submodules these directories will be the same. At the same time, this
allows us to skip the `stat` check for the "gitdir" file for most
repositories.
The check whether a repository is a worktree or not is currently done
inside of `git_repository_open_ext`. As we want to extend this function
later on, pull it out into its own function `repo_is_worktree` to ease
working on it.
The out-parameters of `find_repo` containing found paths of a repository
are a tad confusing, as they are not as obvious as they could be. Rename
them like following to ease reading the code:
- `repo_path` -> `gitdir_path`
- `parent_path` -> `workdir_path`
- `link_path` -> `gitlink_path`
- `common_path` -> `commondir_path`
The `path` out-parameter of `find_repo` is being sanitized initially
such that we do not try to append to existing content. The sanitization
is done via `git_buf_free`, though, which forces us to needlessly
reallocate the buffer later in the function. Fix this by using
`git_buf_clear` instead.
The fields `declared_size` and `received_bytes` of the `git_odb_stream`
are both of type `git_off_t` which is defined as a signed integer. When
passing these values to a printf-style string in
`git_odb_stream__invalid_length`, though, we format these as PRIuZ,
which is unsigned.
Fix the issue by using PRIdZ instead, silencing warnings on macOS.
The credentials callback reads the username and password via scanf into
fixed-length arrays. While these are simply examples and as such not as
interesting, the unchecked return value of scanf causes GCC to emit
warnings. So while we're busy to shut up GCC, we also fix the possible
overflow of scanf by using getline instead.
The `error` variable is used as a return value in the out-section of
both `odb_read_1` and `read_prefix_1`. While the value will actually
always be initialized inside of this section, GCC fails to realize this
due to interactions with the `found` variable: if `found` is set, the
error will always be initialized. If it is not, we return early without
reaching the out-statements.
Shut up the warnings by initializing the error variable, even though it
is unnecessary.
The current signature of `git_worktree_prune` accepts a flags field to
alter its behavior. This is not as flexible as we'd like it to be when
we want to enable passing additional options in the future. As the
function has not been part of any release yet, we are still free to
alter its current signature. This commit does so by using our usual
pattern of an options structure, which is easily extendable without
breaking the API.
When creating a new worktree, we do have a potential race with us
creating the worktree and another process trying to delete the same
worktree as it is being created. As such, the upstream git project has
introduced a flag `git worktree add --locked`, which will cause the
newly created worktree to be locked immediately after its creation. This
mitigates the race condition.
We want to be able to mirror the same behavior. As such, a new flag
`locked` is added to the options structure of `git_worktree_add` which
allows the user to enable this behavior.
Debian and Ubuntu often use schroot to build their DEB packages in a
controlled environment. Depending on how schroot is configured, our
tests regarding repository discovery break due to not being able to find
the repositories anymore. It turns out that these errors occur when the
schroot is configured to use an overlayfs on the directory structures.
The reason for this failure is that we usually refrain from discovering
repositories across devices. But unfortunately, overlayfs does not have
consistent device identifiers for all its files but will instead use the
device number of the filesystem the file stems from. So whenever we
cross boundaries between the upper and lower layer of the overlay, we
will fail to properly detect the repository and bail out.
This commit fixes the issue by enabling cross-device discovery in our
tests. While it would be preferable to have this turned off, it probably
won't do much harm anyway as we set up our tests in a temporary location
outside of the parent repository.
After calling `libssh2_init`, we need to clean up after the library by
executing `libssh2_exit` as soon as we exit. Register a shutdown handler
to do so which simply calls `libssh2_exit`. This fixes several memory
leaks.
We unconditionally return success when initializing libssh2, regardless
of whether `libgssh2_init` signals success or an error. Fix this by
checking its return code.
The `git_worktree_add` function currently accepts only a path and name
for the new work tree. As we may want to expand these parameters in
future versions without adding additional parameters to the function for
every option, this commit introduces our typical pattern of an options
struct. Right now, this structure is still empty, which will change with
the next commit.