Initially, the setting has been solely used to enable the use of
`fsync()` when creating objects. Since then, the use has been extended
to also cover references and index files. As the option is not yet part
of any release, we can still correct this by renaming the option to
something more sensible, indicating not only correlation to objects.
This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also
move the variable from the object to repository source code.
While we already provide functions to get the current repository's HEAD,
it is quite involved to iterate over HEADs of both the repository and
all linked work trees. This commit implements a function
`git_repository_foreach_head`, which accepts a callback which is then
called for all HEAD files.
The `path_repository` variable is actually confusing to think
about, as it is not always clear what the repository actually is.
It may either be the path to the folder containing worktree and
.git directory, the path to .git itself, a worktree or something
entirely different. Actually, the intent of the variable is to
hold the path to the gitdir, which is either the .git directory
or the bare repository.
Rename the variable to `gitdir` to avoid confusion. While at it,
also rename `path_gitlink` to `gitlink` to improve consistency.
The commondir variable stores the path to the common directory.
The common directory is used to store objects and references
shared across multiple repositories. A current use case is the
newly introduced `git worktree` feature, which sets up a separate
working copy, where the backing git object store and references
are pointed to by the common directory.
Added `git_repository_submodule_cache_all` to initialze a cache of
submodules on the repository so that operations looking up N
submodules are O(N) and not O(N^2). Added a
`git_repository_submodule_cache_clear` function to remove the cache.
Also optimized the function that loads all submodules as it was itself
O(N^2) w.r.t the number of submodules, having to loop through the
`.gitmodules` file once per submodule. I changed it to process the
`.gitmodules` file once, into a map.
Signed-off-by: David Turner <dturner@twosigma.com>
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
Having this cache and giving them out goes against our multithreading
guarantees and it makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
The signature for the reflog is not something which changes
dynamically. Almost all uses will be NULL, since we want for the
repository's default identity to be used, making it noise.
In order to allow for changing the identity, we instead provide
git_repository_set_ident() and git_repository_ident() which allow a user
to override the choice of signature.
A repository can have multiple "reserved names" now, not just
a single "short name" for the repository folder itself. Refactor
to include a git_repository__reserved_names that returns all the
reserved names for a repository.
During checkout, assume that the .gitattributes files aren't
modified during the checkout. Instead, create an "attribute session"
during checkout. Assume that attribute data read in the same
checkout "session" hasn't been modified since the checkout started.
(But allow subsequent checkouts to invalidate the cache.)
Further, cache nonexistent git_attr_file data even when .gitattributes
files are not found to prevent re-scanning for nonexistent files.
Disallow:
1. paths with trailing dot
2. paths with trailing space
3. paths with trailing colon
4. paths that are 8.3 short names of .git folders ("GIT~1")
5. paths that are reserved path names (COM1, LPT1, etc).
6. paths with reserved DOS characters (colons, asterisks, etc)
These paths would (without \\?\ syntax) be elided to other paths - for
example, ".git." would be written as ".git". As a result, writing these
paths literally (using \\?\ syntax) makes them hard to operate with from
the shell, Windows Explorer or other tools. Disallow these.
This adds a basic test of doing simultaneous diffs on multiple
threads and adds basic locking for the attr file cache because
that was the immediate problem that arose from these tests.
This takes the old submodule cache which was just a git_strmap
and makes a real git_submodule_cache object that can contain other
things like a lock and timestamp-ish data to control refreshing of
submodule info.
This doesn't actual do string precompose but it puts the hooks in
place into the iterators and the git_path_dirload function so that
the actual precompose work is ready to go.
This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization. Hopefully this will make the diff code easier to
understand and to extend.
This adds a new `git_diff_driver` object that looks of diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided. The full driver
spec is not implemented in the commit - this is focused on the
reorganization of the code and putting the driver hooks in place.
This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.
This adds a bunch of additional config values to the repository
config value cache and makes it easier to add a simple boolean
config without creating enum values for each possible setting.
Also, this fixes a bug in git_config_refresh where the config
cache was not being cleared which could lead to potential
incorrect values.
The work to start using the new cached configs will come in the
next couple of commits...
The goal of this work is to expose the search logic for "global",
"system", and "xdg" files through the git_libgit2_opts() interface.
Behind the scenes, I changed the logic for finding files to have a
notion of a git_strarray that represents a search path and to store
a separate search path for each of the three tiers of config file.
For each tier, I implemented a function to initialize it to default
values (generally based on environment variables), and then general
interfaces to get it, set it, reset it, and prepend new directories
to it.
Next, I exposed these interfaces through the git_libgit2_opts
interface, reusing the GIT_CONFIG_LEVEL_SYSTEM, etc., constants
for the user to control which search path they were modifying.
There are alternative designs for the opts interface / argument
ordering, so I'm putting this phase out for discussion.
Additionally, I ended up doing a little bit of clean up regarding
attr.h and attr_file.h, adding a new attrcache.h so the other two
files wouldn't have to be included in so many places.
Often `git_odb_read_header` will "fail" and have to read the
entire object into memory instead of just the header. When this
happens, the object is loaded and then disposed of immediately,
which makes it difficult to efficiently use the header information
to decide if the object should be loaded (since attempting to do
so will often result in loading the object twice).
This commit takes the existing code and reorganizes it to have
two new functions:
- `git_odb__read_header_or_object` which acts just like the old
read header function except that it returns the object, too, if
it was forced to load the whole thing. It then becomes the
callers responsibility to free the `git_odb_object`.
- `git_object__from_odb_object` which was extracted from the old
`git_object_lookup` and creates a subclass of `git_object` from
an existing `git_odb_object` (separating the ODB lookup from the
`git_object` creation). This allows you to use the first header
reading function efficiently without instantiating the
`git_odb_object` twice.
There is no net change to the behavior of any of the existing
functions, but this allows internal code to tap into the ODB
lookup and object creation to be more efficient.
This extends git_repository_init_ext further with support for
initializing the repository from an external template directory
and with support for the "create shared" type flags that make a
set GID repository directory.
This also adds tests for much of the new functionality to the
existing `repo/init.c` test suite.
Also, this adds a bunch of new utility functions including a
very general purpose `git_futils_mkdir` (with the ability to
make paths and to chmod the paths post-creation) and a file
tree copying function `git_futils_cp_r`. Also, this includes
some new path functions that were useful to keep the code
simple.
The extended version of repository init adds support for many
of the things that you can do with `git init` and sets up
structures that will make it easier to extend further in the
future.
Depending on the operation, we need to consider gitattributes
in both the work dir and the index. This adds a parameter to
all of the gitattributes related functions that allows user
control of attribute reading behavior (i.e. prefer workdir,
prefer index, only use index).
This fix also covers allowing us to check attributes (and
hence do diff and status) on bare repositories.
This was a somewhat larger change that I hoped because it had
to change the cache key used for gitattributes files.
This renamed `git_khash_str` to `git_strmap`, `git_hash_oid` to
`git_oidmap`, and deletes `git_hashtable` from the tree, plus
adds unit tests for `git_strmap`.
This updates khash.h with some extra features (like error checking
on allocations, ability to use wrapped malloc, foreach calls, etc),
creates two high-level wrappers around khash: `git_khash_str` and
`git_khash_oid` for string-to-void-ptr and oid-to-void-ptr tables,
then converts all of the old usage of `git_hashtable` over to use
these new hashtables.
For `git_khash_str`, I've tried to create a set of macros that
yield an API not too unlike the old `git_hashtable` API. Since
the oid hashtable is only used in one file, I haven't bother to
set up all those macros and just use the khash APIs directly for
now.
This adds support for a bunch of core.* settings that affect
diff and status, plus fixes up some incorrect implementations
of those settings from before. Also, this cleans up the
handling of config settings in the new submodules code and
in the old attrs/ignore code.
When processing status for a newly checked out repo, it is
possible that there will be submodules that have not yet been
initialized. The only way to distinguish these from untracked
directories is to have some knowledge of submodules. This
commit adds a new submodule API which, given a name or path,
can determine if it appears to be a submodule and can give
information about the submodule.
Adds support for .gitignore files to git_status_foreach() and
git_status_file(). This includes refactoring the gitattributes
code to share logic where possible. The GIT_STATUS_IGNORED flag
will now be passed in for files that are ignored (provided they
are not already in the index or the head of repo).
Add support for git attribute macro definitions. Also, add
support for cache flush API to clear the attribute file content
cache when needed.
Additionally, improved the handling of global and system files,
making common utility functions in fileops and converting config
and attr to both use the common functions.
Adds a bunch more tests and fixed some memory leaks. Note that
adding macros required me to use refcounted attribute assignment
definitions, which complicated, but probably improved memory usage.
This adds APIs for querying git attributes. In addition to
the new API in include/git2/attr.h, most of the action is in
src/attr_file.[hc] which contains utilities for dealing with
a single attributes file, and src/attr.[hc] which contains
the implementation of the APIs that merge all applicable
attributes files.