`git_submodule_status` is very slow, bottlenecked on
`git_repository_head_tree`, which it uses through `submodule_update_head`. If
the user has requested submodule caching, assume that they want this status
cached too and skip it.
Signed-off-by: David Turner <dturner@twosigma.com>
Added `git_repository_submodule_cache_all` to initialze a cache of
submodules on the repository so that operations looking up N
submodules are O(N) and not O(N^2). Added a
`git_repository_submodule_cache_clear` function to remove the cache.
Also optimized the function that loads all submodules as it was itself
O(N^2) w.r.t the number of submodules, having to loop through the
`.gitmodules` file once per submodule. I changed it to process the
`.gitmodules` file once, into a map.
Signed-off-by: David Turner <dturner@twosigma.com>
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
Remove overriding the `checkout_strategy` for `update_options` when
performing an update on a submodule. Users should be specifying the
correct checkout strategy in
`update_options.checkout_opts.checkout_strategy`.
In C89 it is undefined behavior to pass `NULL` pointers to
`strncmp` and later on in C99 it has been explicitly stated that
functions with an argument declared as `size_t nmemb` specifying
the array length shall always have valid parameters, no matter if
`nmemb` is 0 or not (see ISO 9899 §7.21.1.2).
The function `str_equal_no_trailing_slash` always passes its
parameters to `strncmp` if their lengths match. This means if one
parameter is `NULL` and the other one either `NULL` or a string
with length 0 we will pass the pointers to `strncmp` and cause
undefined behavior.
Fix this by explicitly handling the case when both lengths are 0.
Reload the HEAD and index data for a submodule after reading the
configuration. The configuration may specify a `path`, so we must
update HEAD and index data with that path in mind.
When searching for information about a submdoule, let's be more explicit
in what we expect to find. We currently insert a submodule into the map
and change certain parameters when the config callback gets called.
Switch to asking for the configuration we're interested in, rather than
taking it in an arbitrary order.
If we get the path from the gitmodules file, look up the submodule we're
interested in by path, rather then by name. Otherwise we might get
duplicate results.
The regex we use to look at the gitmodules file does not correctly
delimit the name of submodule which we want to look up and puts '.*'
straight after the name, maching on any submodule which has the seeked
submodule as a prefix of its name.
Add the missing '\.' in the regex so we want a full stop to exist both
before and after the submodule name.
We currently do not handle those enum values which require us to set
"true" or unset variables in all cases. Use a common function which does
understand this by looking at our mapping directly.
Similarly to the other ones. In this test we copy over testing
`RECURSE_YES` which shows an error in our handling of the `YES` variant
which we may have to port to the rest.
During the cache deletion, the check for whether we consider a submodule
to exist got changed regarding submodules which are in the worktree but
not configured.
Instead of checking for the url field to be populated, check the
location where we've found it.
This lets us specify in the status call which ignore rules we want to
use (optionally falling back to whatever the submodule has in its
configuration).
This removes one of the reasons for having `_set_ignore()` set the value
in-memory. We re-use the `IGNORE_RESET` value for this as it is no
longer relevant but has a similar purpose to `IGNORE_FALLBACK`.
Similarly, we remove `IGNORE_DEFAULT` which does not have use outside of
initializers and move that to fall back to the configuration as well.
As submodules are becomes more like values, we should not let a status
check to update its properties. Instead of taking a submodule, have
status take a repo and submodule name.
Having this cache and giving them out goes against our multithreading
guarantees and it makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
This is used by the submodule in order to figure out if the index has
changed since it last read it. Using a timestamp is racy, so let's make
it use the checksum, just like we now do for reloading the index itself.
Having the setting be different from calling its actions was not a great
idea and made for the sake of the wrong convenience.
Instead of that, accept either fetch options, push options or the
callbacks when dealing with the remote. The fetch options are currently
only the callbacks, but more options will be moved from setters and
getters on the remote to the options.
This does mean passing the same struct along the different functions but
the typical use-case will only call git_remote_fetch() or
git_remote_push() and so won't notice much difference.
Currently git_submodule_sync writes the submodule's URL to the
key 'branch.<REMOTE_NAME>.remote' while the reference
implementation of `git submodule sync` writes to
'remote.<REMOTE_NAME>.url', which is the intended behavior
according to git-submodule(1).
We want to use the "checkout: moving from ..." message in order to let
git know when a change of branch has happened. Make the convenience
functions for this goal write this message.
The signature for the reflog is not something which changes
dynamically. Almost all uses will be NULL, since we want for the
repository's default identity to be used, making it noise.
In order to allow for changing the identity, we instead provide
git_repository_set_ident() and git_repository_ident() which allow a user
to override the choice of signature.