Instead of going through each entry we have and re-adding, which may not
even be correct for certain crlf options and has bad performance, use
the function which performs a diff against the worktree and try to add
and remove files from that list.
We currently iterate over all the entries and re-add them to the
index. While this provides correctness, it is wasteful as we try to
re-insert files which have not changed.
Instead, take a diff between the index and the worktree and only re-add
those which we already know have changed.
This is useful to send to the client while we're performing the work.
The reporting function has a force parameter which makes sure that we
do send out the message of 100% completed, even if this comes before the
next udpate window.
A couple of tests use the wrong remote to push to. We did not notice up
to now because the local push would copy individual objects, and those
already existed, so it became a no-op.
Once we made local push create the packfile, it became noticeable that
there was a new packfile where it didn't belong.
Instead of copying each object individually, as we'd been doing, use the
packbuilder which should be faster and give us some feedback.
While performing this change, we can hook up the packbuilder's writing
to the push progress so the caller knows how far along we are.
We currently first look in the loose object dir and then in the packs
for objects. When performing operations on recent history this has a
higher likelihood of hitting, but when we deal with operations which
look further back into the past, we start spending a large amount of
time getting ENOTENT from `access`.
Reversing the priorities means that long-running operations can get to
their objects faster, as we can look at the index data we have in memory
(or rather mapped) to figure out whether we have an object, which is
faster than going out to the filesystem.
The packed backend already implements an optimistic read algorithm by
first looking at the packs we know about and only going out to disk to
referesh if the object is not found which means that in the case where
we do have the object (which will be in the majority for anything that
traverses the graph) we can avoid going to to disk entirely to determine
whether an object exists.
Operations which look at recent history may take a slight impact, but
these would be operations which look a lot less at object and thus take
less time regardless.
The base refspecs changing can be a cause of confusion as to what is the
current base refspec set and complicate saving the remote's
configuration.
Change `git_remote_add_{fetch,push}()` to update the configuration
instead of an instance.
This finally makes `git_remote_save()` a no-op, it will be removed in a
later commit.
While this will rarely be different from the default, having it in the
remote adds yet another setting it has to keep around and can affect its
behaviour. Move it to the options.
Instead of having it set in a different place from every other callback,
put it the main structure. This removes some state from the remote and
makes it behave more like clone, where the constructors are passed via
the options.
As a first step in removing the repository-saving logic, don't allow
chaning the url or push url from a remote object, but change the
configuration on the configuration immediately.
Having the setting be different from calling its actions was not a great
idea and made for the sake of the wrong convenience.
Instead of that, accept either fetch options, push options or the
callbacks when dealing with the remote. The fetch options are currently
only the callbacks, but more options will be moved from setters and
getters on the remote to the options.
This does mean passing the same struct along the different functions but
the typical use-case will only call git_remote_fetch() or
git_remote_push() and so won't notice much difference.
The push object knows which remote it's associated with, and therefore
does not need to keep its own copy of the callbacks stored in the
remote.
Remove the copy and simply access the callbacks struct within the
remote.
Restricting files to size_t is a silly limitation. The loose backend
writes to a file directly, so there is no issue in using 63 bits for the
size.
We still assume that the header is going to fit in 64 bytes, which does
mean quite a bit smaller files due to the run-length encoding, but it's
still a much larger size than you would want Git to handle.
When handling attr matching, simply compare the directory path where the
attribute file resides to the path being matched. Skip over commonality
to allow us to compare the contents of the attribute file to the remainder
of the path.
This allows us to more easily compare the pattern directly to the path,
instead of trying to guess whether we want to compare the path's basename
or the full path based on whether the match was inside a containing
directory or not.
This also allows us to do fewer translations on the pattern (trying to
re-prefix it.)
When determining whether some file matches an attr pattern, do
not try to truncate the path to pass to fnmatch. When there is
no containing directory for an item (eg, from a .gitignore in the
root) this will cause us to truncate our path, which means that
we cannot do meaningful comparisons on it and we may have false
positives when trying to determine whether a given file is actually
a file or a folder (as we have lost the path's base information.)
This mangling was to allow fnmatch to compare a directory on disk to
the name of a directory, but it is unnecessary as our fnmatch accepts
FNM_LEADING_DIR.
Ensure that when examining a .gitignore in a subdirectory, we do not
erroneously apply the paths contained therein to the root of the
repository. (Fixed in c02a0e4).
While we are confident about the size of an int in architectures we're
likely to care about, the index format is defined by the exact size of
the fields. Use the definitions which show the exact width of the entry
fields.
As part of that, bring back 32-bit time and size fields, which currently
are 64 bits wide and can bring a false sense of security in how much
data they really store. Document that these fields are not to be taken
as authoritative.
It was added as a workaround while the project had code to use WinCNG
but had not made a release with it. There is now a release of libssh2
with WinCNG support, so this option is redundant. Let's get rid of it
before people start liking it too much.