Commit Graph

288 Commits

Author SHA1 Message Date
Vicent Marti
128e94bbbb index: Remove unneeded consts 2015-10-21 12:04:53 +02:00
Edward Thomson
21515f228b index: also try conflict mode when inserting
When we do not trust the on-disk mode, we use the mode of an existing
index entry.  This allows us to preserve executable bits on platforms
that do not honor them on the filesystem.

If there is no stage 0 index entry, also look at conflicts to attempt
to answer this question:  prefer the data from the 'ours' side, then
the 'theirs' side before falling back to the common ancestor.
2015-09-30 09:06:09 -04:00
Carlos Martín Nieto
6d6020defc Merge pull request #3353 from ethomson/wrongcase_add
index: canonicalize directory case when adding
2015-09-08 18:34:51 +02:00
Edward Thomson
2964cbeae1 Merge pull request #3381 from leoyanggit/index_directory_iterator
New feature: add the ablility to iterate through a directory in index
2015-09-08 11:50:08 -04:00
Edward Thomson
a32bc85e84 git_index_add: allow case changing renames
On case insensitive platforms, allow `git_index_add` to provide a new
path for an existing index entry.  Previously, we would maintain the
case in an index entry without the ability to change it (except by
removing an entry and re-adding it.)

Higher-level functions (like `git_index_add_bypath` and
`git_index_add_frombuffers`) continue to keep the old path for easier
usage.
2015-09-08 11:34:00 -04:00
Edward Thomson
280adb3f94 index: canonicalize directory case when adding
On case insensitive systems, when given a user-provided path in the
higher-level index addition functions (eg `git_index_add_bypath` /
`git_index_add_frombuffer`), examine the index to try to match the
given path to an existing directory.

Various mechanisms can cause the on-disk representation of a folder
to not match the representation in HEAD or the index - for example,
a case changing rename of some file `a/file.txt` to `A/file.txt`
will update the paths in the index, but not rename the folder on
disk.

If a user subsequently adds `a/other.txt`, then this should be stored
in the index as `A/other.txt`.
2015-09-08 11:32:40 -04:00
Edward Thomson
9fd4c9c867 Merge pull request #3366 from libgit2/cmn/index-hashmap
Use a hashmap for path-based lookups in the index
2015-09-06 10:50:22 -04:00
Leo Yang
c097f7173d New API: git_index_find_prefix
Find the first index entry matching a prefix.
2015-09-04 12:24:36 -04:00
Carlos Martín Nieto
81b7636757 index: put the icase insert choice in macros
This should let us see more clearly what we're doing and avoid the ugly
'if' we need every time we want to interact with the map.
2015-09-04 13:50:25 +02:00
Edward Thomson
810cabb9df racy-git: TODO to use improved diffing 2015-08-28 18:39:57 -04:00
Edward Thomson
ed1c64464a iterator: use an options struct instead of args 2015-08-28 18:39:47 -04:00
Carlos Martín Nieto
af1d5239a1 index: keep a hash table as well as a vector of entries
The hash table allows quick lookup of specific paths, while we use the
vector for enumeration.
2015-08-14 21:10:12 +02:00
Edward Thomson
ef4857c2b3 errors: tighten up git_error_state OOMs a bit more
When an error state is an OOM, make sure that we treat is specially
and do not try to free it.
2015-08-03 19:44:51 -04:00
Carlos Martín Nieto
ea961abf24 index: stage an unregistered submodule as well
We previously added logic to `_add_bypath()` to update a submodule. Go
further and stage the submodule even if it's not registered to behave
like git.
2015-08-01 20:01:11 +02:00
Carlos Martín Nieto
247d27c2c6 index: allow add_bypath to update submodules
Similarly to how git itself does it, allow the index update operation to
stage a change in a submodule's HEAD.
2015-07-12 12:11:22 +02:00
Carlos Martín Nieto
7497584651 index: check racily clean entries more thoroughly
When an entry has a racy timestamp, we need to check whether the file
itself has changed since we put its entry in the index. Only then do we
smudge the size field to force a check the next time around.
2015-06-22 12:47:30 +02:00
Carlos Martín Nieto
624c949f01 index: make relative comparison use the checksum as well
This is used by the submodule in order to figure out if the index has
changed since it last read it. Using a timestamp is racy, so let's make
it use the checksum, just like we now do for reloading the index itself.
2015-06-20 16:17:28 +02:00
Carlos Martín Nieto
5e947c91d4 index: use the checksum to check whether it's been modified
We currently use a timetamp to check whether an index file has been
modified since we last read it, but this is racy. If two updates happen
in the same second and we read after the first one, we won't detect the
second one.

Instead read the SHA-1 checksum of the file, which are its last 20 bytes which
gives us a sure-fire way to detect whether the file has changed since we
last read it.

As we're now keeping track of it, expose an accessor to this data.
2015-06-19 22:05:08 +02:00
Carlos Martín Nieto
316b820b6f index: zero the size of racily-clean entries
If a file entry has the same timestamp as the index itself, it is
considered racily-clean, as it may have been modified after the index
was written, but during the same second. We take extra steps to check
the contents, but this is just one part of avoiding races.

For files which do have changes but have not been updated in the index,
updating the on-disk index means updating its timestamp, which means we
would no longer recognise these entries as racy and we would trust the
timestamp to tell us whether they have changed.

In order to work around this, git zeroes out the file-size field in
entries with the same timestamp as the index in order to force the next
diff to check the contents. Do so in libgit2 as well.
2015-06-16 08:40:45 +02:00
Edward Thomson
8147b1aff5 diff: introduce binary diff callbacks
Introduce a new binary diff callback to provide the actual binary
delta contents to callers.  Create this data from the diff contents
(instead of directly from the ODB) to support binary diffs including
the workdir, not just things coming out of the ODB.
2015-06-12 09:39:20 -04:00
Edward Thomson
9b3e41f72b index_add_all: remove conflicts when no wd file
If there exists a conflict in the index, but no file in the working
directory, this implies that the user wants to accept the resolution
by removing the file.  Thus, remove the conflict entry from the
index, instead of trying to add a (nonexistent) file.
2015-05-28 09:47:51 -04:00
Edward Thomson
9f545b9d71 introduce git_index_entry_is_conflict
It's not always obvious the mapping between stage level and
conflict-ness.  More importantly, this can lead otherwise sane
people to write constructs like `if (!git_index_entry_stage(entry))`,
which (while technically correct) is unreadable.

Provide a nice method to help avoid such messy thinking.
2015-05-28 09:47:31 -04:00
Edward Thomson
d67f270e58 index: validate mode of new conflicts 2015-05-28 09:43:57 -04:00
Edward Thomson
3ab5a65967 index: remove error message in non-error remove
If `git_index_remove_bypath` does no work, and returns an OK error
code, it should not set an error message.
2015-05-28 09:43:53 -04:00
Edward Thomson
ecd60a56eb conflicts: when adding conflicts, remove staged
When adding a conflict for some path, remove the staged entry.
Otherwise, an illegal index (with both stage 0 and high-stage
entries) would result.
2015-05-28 09:43:49 -04:00
Edward Thomson
cbfeecf33f git_index_add_all: don't recurse ignored dirs
No need to get reports about individual ignored files, having a
single ignored directory delta is enough.
2015-05-20 20:14:31 -04:00
Edward Thomson
fa9a969d80 index_add_all: include untracked files in new subdirs 2015-05-20 20:05:55 -04:00
Carlos Martín Nieto
2b2dfe80f0 index: include TYPECHANGE in the diff
Without this option, we would not be able to catch exec bit changes.
2015-05-14 15:23:12 +02:00
Carlos Martín Nieto
0a78a52ed9 index: make add_all to act on a diff
Instead of going through each entry we have and re-adding, which may not
even be correct for certain crlf options and has bad performance, use
the function which performs a diff against the worktree and try to add
and remove files from that list.
2015-05-14 15:23:12 +02:00
Carlos Martín Nieto
197307f6b1 index: refactor diff-based update_all to match other applies
Refactor so we look like the code we're replacing, which should also
allow us to more easily inplement add-all.
2015-05-14 15:23:12 +02:00
Carlos Martín Nieto
713e11e0b1 index: use a diff to perform update_all
We currently iterate over all the entries and re-add them to the
index. While this provides correctness, it is wasteful as we try to
re-insert files which have not changed.

Instead, take a diff between the index and the worktree and only re-add
those which we already know have changed.
2015-05-14 15:23:12 +02:00
Edward Thomson
35d3976151 index: introduce git_index_read_index 2015-05-11 14:12:05 -04:00
John Fultz
d3282680ed Fix index-adding functions to know when to trust filemodes.
The idea...sometimes, a filemode is user-specified via an
explicit git_index_entry.  In this case, believe the user, always.

Sometimes, it is instead built up by statting the file system.  In
those cases, go with the existing logic we have to determine
whether the file system supports all filemodes and symlinks, and
make the best guess.

On file systems which have full filemode and symlink support, this
commit should make no difference.  On others (most notably Windows),
this will fix problems things like:
* git_index_add and git_index_add_frombuffer() should be believed.
* As a consequence, git_checkout_tree should make the filemodes in
the index match the ones in the tree.
* And diffs with GIT_DIFF_UPDATE_INDEX don't write the wrong filemodes.
* And merges, and probably other downstream stuff now fixed, too.

This makes my previous changes to checkout.c unnecessary,
so they are now reverted.

Also, added a test for index_entry permissions from git_index_add
and git_index_add_frombuffer, both of which failed before these changes.
2015-04-21 02:17:23 -05:00
Pierre-Olivier Latour
807566d554 Entry argument passed to git_index_add_frombuffer() should be const 2015-04-03 18:59:11 -07:00
Damien PROFETA
a275fbc0f7 Add API to add a memory buffer to an index
git_index_add_frombuffer enables now to store a memory buffer in the odb
and to store an entry in the index directly if the index is attached to a
repository.
2015-02-25 10:24:13 +01:00
Carlos Martín Nieto
a291790a8d Merge pull request #2831 from ethomson/merge_lock
merge: lock index during the merge (not just checkout)
2015-02-15 05:18:01 +01:00
Edward Thomson
41fae48df2 indexwriter: an indexwriter for repo operations
Provide git_indexwriter_init_for_operation for the common locking
pattern in merge, rebase, revert and cherry-pick.
2015-02-14 09:25:36 -05:00
Edward Thomson
55798fd153 git_indexwriter: lock then write the index
Introduce `git_indexwriter`, to allow us to lock the index while
performing additional operations, then complete the write (or abort,
unlocking the index).
2015-02-14 09:25:35 -05:00
Edward Thomson
f1453c59b2 Make our overflow check look more like gcc/clang's
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler instrinsics on platforms
that support it.  This means dropping the ability to pass `NULL` as
an out parameter.

As a result, the macros also get updated to reflect this as well.
2015-02-13 09:27:33 -05:00
Edward Thomson
2884cc42de overflow checking: don't make callers set oom
Have the ALLOC_OVERFLOW testing macros also simply set_oom in the
case where a computation would overflow, so that callers don't
need to.
2015-02-12 22:54:47 -05:00
Edward Thomson
392702ee2c allocations: test for overflow of requested size
Introduce some helper macros to test integer overflow from arithmetic
and set error message appropriately.
2015-02-12 22:54:46 -05:00
Jacques Germishuys
b63b3b0e4d Ensure git_index_entry is not NULL before trying to free it 2015-01-25 14:08:05 +02:00
Edward Thomson
2fe8157e1b index: reuc and name entrycounts should be size_t
For the REUC and NAME entries, we use size_t internally, and we take
size_t for the get_byindex() functions, but the entrycount() functions
strangely cast to an unsigned int instead.
2014-12-22 18:42:03 -06:00
Edward Thomson
a64119e396 checkout: disallow bad paths on win32
Disallow:
 1. paths with trailing dot
 2. paths with trailing space
 3. paths with trailing colon
 4. paths that are 8.3 short names of .git folders ("GIT~1")
 5. paths that are reserved path names (COM1, LPT1, etc).
 6. paths with reserved DOS characters (colons, asterisks, etc)

These paths would (without \\?\ syntax) be elided to other paths - for
example, ".git." would be written as ".git".  As a result, writing these
paths literally (using \\?\ syntax) makes them hard to operate with from
the shell, Windows Explorer or other tools.  Disallow these.
2014-12-16 10:08:53 -06:00
Vicent Marti
0d388adc86 index: Check for valid paths before creating an index entry 2014-12-16 10:08:49 -06:00
Carlos Martín Nieto
62a617dc68 iterator: submodules are determined by an index or tree
We cannot know from looking at .gitmodules whether a directory is a
submodule or not. We need the index or tree we are comparing against to
tell us. Otherwise we have to assume the entry in .gitmodules is stale
or otherwise invalid.

Thus we pass the index of the repository into the workdir iterator, even
if we do not want to compare against it. This follows what git does,
which even for `git diff <tree>`, it will consider staged submodules as
such.
2014-11-07 08:33:27 +01:00
Carlos Martín Nieto
c2f8b21593 index: write out the tree cache extension
Keeping the cache around after read-tree is only one part of the
optimisation opportunities. In order to share the cache between program
instances, we need to write the TREE extension to the index.

Do so, taking the opportunity to rename 'entries' to 'entry_count' to
match the name given in the format description. The included test is
rather trivial, but works as a sanity check.
2014-10-10 19:43:42 +02:00
Carlos Martín Nieto
6843cebe17 index: fill the tree cache when reading from a tree
When reading from a tree, we know what every tree is going to look like,
so we can fill in the tree cache completely, making use of the index for
modification of trees a lot quicker.
2014-10-10 19:35:19 +02:00
Carlos Martín Nieto
19c88310cb tree-cache: move to use a pool allocator
This simplifies freeing the entries quite a bit; though there aren't
that many failure paths right now, introducing filling the cache from a
tree will introduce more. This makes sure not to leak memory on errors.
2014-10-10 19:35:18 +02:00
Jacques Germishuys
ff97778a7a The raw index buffer content is not guaranteed to be aligned
* Ensure alignment by copying the content into a structure on the stack
2014-09-26 12:12:08 +02:00