git hardocodes these as objects which exist regardless of whether they
are in the odb and uses them in the shell interface as a way of
expressing the lack of a blob or tree for one side of e.g. a diff.
In the library we use each language's natural way of declaring a lack of
value which makes a workaround like this unnecessary. Since git uses it,
it does however mean each shell application would need to perform this
check themselves.
This makes it common work across a range of applications and an issue
with compatibility with git, which fits right into what the library aims
to provide.
Thus we introduce the hard-coded empty blob and tree in the odb
frontend. These hard-coded objects are checked for before going to the
backends, but after the cache check, which means the second time they're
used, they will be treated as normal cached objects instead of creating
new ones.
If the remote is anonymous, then we cannot check for any configuration,
as there is no name. Check for this before we try to use the name, which
may be a NULL pointer.
This fixes#2697.
This function has one output but can match multiple files, which can be
unexpected for the user, which would usually path the exact path of the
file he wants the status of.
We cannot know from looking at .gitmodules whether a directory is a
submodule or not. We need the index or tree we are comparing against to
tell us. Otherwise we have to assume the entry in .gitmodules is stale
or otherwise invalid.
Thus we pass the index of the repository into the workdir iterator, even
if we do not want to compare against it. This follows what git does,
which even for `git diff <tree>`, it will consider staged submodules as
such.
We consider an entry in .gitmodules to mean that we have a submodule at
a particular path, even if HEAD^{tree} and the index do not contain any
reference to it.
We should ignore that submodule entry and simply consider that path to
be a regular directory.
We currently consider CR to start the end of the line, but that means
that we miss cases with CR CR LF which can be used with git to match
files whose names have CR at the end of their names.
The fix from the patch comes from Russell's comment in the issue.
This fixes#2536.
* Error-handling is cleaned up to only let a file-not-found error
through, not other sorts of errors. And when a file-not-found
error happens, we clean up the error.
* Test now checks that file-not-found introduces no error. And
other minor cleanups.
An anonymous remote wouldn't create remote-tracking branches, so testing
we don't create them for TAGS_ALL is nonsensical. Furthermore, the name
of the supposed remote-tracking branch was also not one which would have
been created had it had a name.
Give the remote a name and test that we only create the tags when we
pass TAGS_ALL and that we do create the remote-branch branch when given
TAGS_AUTO.
When we update FETCH_HEAD we check whether the remote is the current
branch's upstream remote. The code does not check whether the current
refspec is relevant for this reference but always tries to perform the
reverse transformation, which causes it to error out if the refspec
doesn't match the reference.
Thanks to Pierre-Olivier Latour for the reproduction recipe.
For example, if you have
[include]
path = foo
and foo didn't exist, git_config_open_ondisk() would just give up
on the rest of the file. Now it ignores the unresolved include
without error and continues reading the rest of the file.
Already cherry-picked commits should not be re-included. If all changes
included in a commit exist in the upstream, then we should error with
GIT_EAPPLIED.
`git_rebase_next` will apply the next patch (or cherry-pick)
operation, leaving the results checked out in the index / working
directory so that consumers can resolve any conflicts, as appropriate.
Remote objects are not meant to be changed from under the user. We did
this in rename, but only the name and left the refspecs, such that a
save would save the wrong refspecs (and a fetch and anything else would
use the wrong refspecs).
Instead, let's simply take a name and not change any loaded remote from
under the user.
This function does not in fact tell us anything, as almost anything with
a colon in it is a valid rsync-style SSH path; it can not tell us that
we do not support ftp or afp or similar as those are still valid SSH
paths and we do support that.
Git for Windows will handle UNC paths only when in forward-slash
format, eg "//server/path". When given a UNC path as a remote,
rewrite standard format ("\\server\path") into this ridiculous
format.
The entry_count field is the amount of index entries covered by a
particular cache entry, that is how many files are there (recursively)
under a particular directory.
The current code that attemps to do this is severely defincient and is
trying to count the amount of children, which always comes up to zero.
We don't even need to recount, since we have the information during the
cache creation. We can take that number and keep it, as we only ever
invalidate or replace.
Keeping the cache around after read-tree is only one part of the
optimisation opportunities. In order to share the cache between program
instances, we need to write the TREE extension to the index.
Do so, taking the opportunity to rename 'entries' to 'entry_count' to
match the name given in the format description. The included test is
rather trivial, but works as a sanity check.
We don't need the remote loaded, and the function extracted both of
these from the git_remote in order to do its work, so let's remote a
step and not ask for the loaded remote at all.
This fixes#2390.
A transaction allows you to lock multiple references and set up changes
for them before applying the changes all at once (or as close as the
backend supports).
This can be used for replication purposes, or for making sure some
operations run when the reference is locked and thus cannot be changed.
When a list of refspecs is passed to fetch (what git would consider
refspec passed on the command-line), we not only need to perform the
updates described in that refspec, but also update the remote-tracking
branch of the fetched remote heads according to the remote's configured
refspecs.
These "fetches" are not however to be written to FETCH_HEAD as they
would be duplicate data, and it's not what the user asked for.
With opportunistic ref updates, git has introduced the concept of having
base refspecs *and* refspecs that are active for a particular fetch.
Let's start by letting the user override the refspecs for download.
When we describe the workdir, we perform a describe on HEAD and then
check to see if the worktree is dirty. If it is and we have a suffix
string, we append that to the buffer.
Instead of printing out to the buffer inside the information-gathering
phase, write the data to a intermediate result structure.
This allows us to split the options into gathering options and
formatting options, simplifying the gathering code.
Instead of spreading the data in function arguments, some of which
aren't used for ssh and having a struct only for ssh, use a struct for
both, using a common parent to pass to the callback.
We should let the user decide whether to cancel the connection or not
regardless of whether our checks have decided that the certificate is
fine. We provide our own assessment to the callback to let the user fall
back to our checks if they so desire.
If the certificate validation fails (or always in the case of ssh),
let the user decide whether to allow the connection.
The data structure passed to the user is the native certificate
information from the underlying implementation, namely OpenSSL or
WinHTTP.
When using a bare repo with an index, libgit2 attempts to read
files from the index. It caches those files based on the path
to the file, specifically the path to the directory that contains
the file.
If there is no working directory, we use `git_path_dirname_r` to
get the path to the containing directory. However, for the
`.gitattributes` file in the root of the repository, this ends up
normalizing the containing path to `"."` instead of the empty
string and the lookup the `.gitattributes` data fails.
This adds a test of attribute lookups on bare repos and also
fixes the problem by simply rewriting `"."` to be `""`.
A signature is made up of a non-empty name and a non-empty email so
let's validate that. This also brings us more in line with git, which
also rejects ident with an empty email.
Teach git_repository_init_ext to use relative paths for the gitlink
to the work directory. This is used when creating a sub repository
where the sub repository resides in the parent repository's
.git directory.
When the fetch refspec does not include the remote's default branch, it
indicates an error in user expectations or programmer error. Error out
in that case.
This lets us get rid of the dummy refspec which can never work as its
zeroed out. In the cases where we did not find a default branch, we set
HEAD detached immediately, which lets us refactor the "normal" path,
removing `found_branch`.
Add tests for the case when there are no branches on the remote and when
HEAD is detached but has the id of a non-branch. In both of these cases,
we should return ENOTFOUND.
It does the same as git_remote_supported_url() but has a name which
implies we'd check the URL for correctness while we're simply looking at
the scheme and looking it up in our lists.
While here, fix up the tests so we check all the combination of what's
supported.
Our mkdir helper was failing is a parent directory was not
accessible even if the child directory could be created.
This changes the helper to keep trying child directories
even when the parent is unwritable.
The old `allocfmt` is of no use to callers, as they are not able to free
the returned buffer. Export a new API that returns a static string that
doesn't need to be freed.
The online::push::notes test pushes a note but leaves it hanging
around for other tests to stumble across when they're validating
that they're seeing the refs they expect to see. Clean it up on
exit.
* Move the transport registration mechanisms into a new header under
'sys/' because this is advanced stuff.
* Remove the 'priority' argument from the registration as it adds
unnecessary complexity. (Since transports cannot decline to operate,
only the highest priority transport is ever executed.) Users who
require per-priority transports can implement that in their custom
transport themselves.
* Simplify registration further by taking a scheme (eg "http") instead
of a prefix (eg "http://").
In the check for multiline, we traverse the backslashes from the end
backwards and int the end assert that we haven't gone past the beginning
of the line. We make sure of this in the loop condition, but we also
check in the return value.
However, for certain configurations, a line in a multiline variable
might be empty to aid formatting. In that case, 'end' == 'start', since
we ended up looking at the first char which made it a multiline.
There is no need for the (end > start) check in the return, since the
loop guarantees we won't go further back than the first char in the
line, and we do accept the first char to be the final backslash.
This fixes#2483.
`git help ignore` has this to say about trailing slashes:
> If the pattern ends with a slash, it is removed for the purpose of
> the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and
> paths underneath it, but will not match a regular file or a
> symbolic link foo (this is consistent with the way how pathspec
> works in general in Git).
Sure enough, having manually performed the same steps as this test,
`git status` tells us the following:
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached <file>..." to unstage)
#
# new file: force.txt
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# ../.gitignore
# child1/
# child2/
i.e. neither child1 nor child2 is ignored.
When writing 'bin/*' in the rules, this means we ignore very file inside
bin/ individually, but do not ignore the directory itself. Thus the
status listing should list both files under bin/, one untracked and one
ignored.
We always calculate multiple merge bases, but up to now we had only
exposed the "best" merge base.
Introduce git_oidarray which analogously to git_strarray lets us return
multiple ids.
git allows you to set which paths to use for the git server programs
when connecting over ssh; and we want to provide something similar.
We do this by providing a factory function which can be set as the
remote's transport callback which will set the given paths upon
creation.
We used to assume a refspec would only have an asterisk in the middle of
their respective pattern. This has not been a valid assumption for some
time now with git.
Instead of assuming where the asterisk is going to be, change the logic
to treat each pattern as having two halves with a replacement bit in the
middle, where the asterisk is.
Move the definition of git_thread_yield() to the test which needs it and
add the correct definition for it for FreeBSD and derivatives.
Original patch adding FreeBSD and derivatives by @jacquesg.
In order to connect to a remote server, we need to provide a path to the
repository we're interested in. Consider the lack of path in the url an
error.
As git_clone now has callbacks to configure the details of the
repository and remote, remove the lower-level functions from the public
API, as they lack some of the logic from git_clone proper.
git_checkout_index can now check out other git_index's (that are not
necessarily the repository index). This allows checkout_index to use
the repository's index for stat cache information instead of the index
data being checked out. git_merge and friends now check out their
indexes directly instead of trying to blend it into the running index.
To make sure that items returned from pool allocations are aligned
on nice boundaries, this rounds up all pool allocation sizes to a
multiple of 8. This adds a small amount of overhead to each item.
The rounding up could be made optional with an extra parameter to
the pool initialization that turned on rounding only for pools
where item alignment actually matters, but I think for the extra
code and complexity that would be involved, that it makes sense
just to burn a little bit of extra memory and enable this all the
time.
git_remote_set_transport now takes a transport factory rather than a transport
git_clone_options now allows the caller to specify a remote creation callback
For urls where we do not specify a username, we must handle the case
where the ssh transport asks us for the username.
Test also that switching username fails.
In order to know which authentication methods are supported/allowed by
the ssh server, we need to send a NONE auth request, which needs a
username associated with it.
Most ssh server implementations do not allow switching the username
between authentication attempts, which means we cannot use a dummy
username and then switch. There are two ways around this.
The first is to use a different connection, which an earlier commit
implements, but this increases how long it takes to get set up, and
without knowing the right username, we cannot guarantee that the
list we get in response is the right one.
The second is what's implemented here: if there is no username specified
in the url, ask for it first. We can then ask for the list of auth
methods and use the user's credentials in the same connection.
The checkout::conflict type conflict tests were failing because
they were overly assertive about the resultant mode, testing
group & other bits, which failed miserably for people who had a
umask less restrictive than 022. Only test the resultant owner bits.
Git for Windows 1.9.4 changed the behavior when the text=auto
attribute is specified and core.autocrlf=false. Previous observed
behavior would *not* filter files when going into the working
directory, the new behavior *does* filter. Update our behavior to match.
When checking out files, we're performing conversion into the user's
native line endings, but we only want to do it for files which have
consistent line endings. Refuse to perform the conversion for mixed-EOL
files.
The CRLF->LF filter is left as-is, as that conversion is considered to be
normalization by git and should force a conversion of the line endings.
Opening the same repository multiple times will currently open the same
file multiple times, as well as map the same region of the file multiple
times. This is not necessary, as the packfile data is immutable.
Instead of opening and closing packfiles directly, introduce an
indirection and allocate packfiles globally. This does mean locking on
each packfile open, but we already use this lock for the global mwindow
list so it doesn't introduce a new contention point.
Before calling the credentials callback, ask the sever which
authentication methods it supports and report that to the user, instead
of simply reporting everything that the transport supports.
In case of an error, we do fall back to listing all of them.
Finding a filename in a vector means we need to resort it every time we
want to read from it, which includes every time we want to write to it
as well, as we want to find duplicate keys.
A hash-map fits what we want to do much more accurately, as we do not
care about sorting, but just the particular filename.
We still keep removed entries around, as the interface let you assume
they were going to be around until the treebuilder is cleared or freed,
but in this case that involves an append to a vector in the filter case,
which can now fail.
The only time we care about sorting is when we write out the tree, so
let's make that the only time we do any sorting.
There is no reason why we need to use a callback here. A string array
fits better with the usage, as this is not an event and we don't need
anything from the user.
Inside `git_remote_load`, the calls to `get_optional_config` use
`giterr_clear` to unset any errors that are set due to missing config
keys. If neither a fetch nor a push url config was found for a remote,
we should set an error again.
This adds another assertion to ensure that the reference name inside
the git_reference struct returned by `git_branch_create` is returned as
precomposed if `core.precomposeunicode` is enabled.
If requested, git_clone_local_into() will try to link the object files
instead of copying them.
This only works on non-Windows (since it doesn't have this) when both
are on the same filesystem (which are unix semantics).