Commit Graph

130 Commits

Author SHA1 Message Date
Edward Thomson
4c88198a85 iterator: test that we're at the end of iteration
Ensure that we have hit the end of iteration; previously we tested
that we saw all the values that we expected to see.  We did not
then ensure that we were at the end of the iteration (and that there
were subsequently values in the iteration that we did *not* expect.)
2016-03-23 17:16:37 -04:00
Edward Thomson
0e0589fcc3 iterator: combine fs+workdir iterators more completely
Drop some of the layers of indirection between the workdir and the
filesystem iterators.  This makes the code a little bit easier to
follow, and reduces the number of unnecessary allocations a bit as
well.  (Prior to this, when we filter entries, we would allocate them,
filter them and then free them; now we do the filtering before
allocation.)

Also, rename `git_iterator_advance_over_with_status` to just
`git_iterator_advance_over`.  Mostly because it's a fucking long-ass
function name otherwise.
2016-03-23 17:16:37 -04:00
Edward Thomson
be30387e8b iterators: refactored tree iterator
Refactored the tree iterator to never recurse; simply process the
next entry in order in `advance`.  Additionally, reduce the number of
allocations and sorting as much as possible to provide a ~30% speedup
on case-sensitive iteration.  (The gains for case-insensitive iteration
are less majestic.)
2016-03-23 17:08:37 -04:00
Edward Thomson
f0224772ee git_object_dup: introduce typesafe versions 2016-03-23 17:08:37 -04:00
Edward Thomson
684b35c41b iterator: disambiguate reset and reset_range
Disambiguate the reset and reset_range functions.  Now reset_range
with a NULL path will clear the start or end; reset will leave the
existing start and end unchanged.
2016-03-23 17:08:37 -04:00
Edward Thomson
ac05086c40 iterator: drop unused/unimplemented seek 2016-03-23 17:08:36 -04:00
Carlos Martín Nieto
60a194aa86 tree: re-use the id and filename in the odb object
Instead of copying over the data into the individual entries, point to
the originals, which are already in a format we can use.
2016-03-20 11:00:12 +01:00
Carlos Martín Nieto
594a5d12d4 Merge pull request #3619 from ethomson/win32_forbidden
win32: allow us to read indexes with forbidden paths on win32
2016-02-18 12:28:06 +01:00
Edward Thomson
4fea9cffbd iterator: assert tree_iterator has a frame
Although a `tree_iterator` that failed to be properly created
does not have a frame, all other `tree_iterator`s should.  Do not
call `pop` in the failure case, but assert that in all other
cases there is a frame.
2016-02-17 13:10:33 +00:00
Colin Xu
a218b2f625 Validate pointer before access the member.
When Git repository at network locations, sometimes git_iterator_for_tree
fails at iterator__update_ignore_case so it goes to git_iterator_free.
Null pointer will crash the process if not check.

Signed-off-by: Colin Xu <colin.xu@gmail.com>
2016-02-17 13:10:33 +00:00
Arthur Schreiber
3679ebaef5 Horrible fix for #3173. 2016-02-11 23:41:34 +01:00
Vicent Marti
1e5e02b4f4 pool: Simplify implementation 2015-10-28 10:13:13 +01:00
Edward Thomson
26d7cf6e57 iterator: loop fs_iterator advance (don't recurse) 2015-09-13 14:07:54 -04:00
Edward Thomson
a1859e21f3 iterator: advance the tree iterator smartly
While advancing the tree iterator, if we advance over things that
we aren't interested in, then call `current`.  Which may *itself*
call advance.

While advancing the tree iterator, if we advance over things that
we aren't interested in, then call `current`.  Which may *itself*
call advance.

While advancing the tree iterator, if we advance over things that
we aren't interested in, then call `current`.  Which may *itself*
call advance.

While advancing the tree iterator, if we advance over things that
we aren't interested in, then call `current`.  Which may *itself*
call advance.

While advancing the tree iterator, if we advance over things that
we aren't interested in, then call `current`.  Which may *itself*
call advance.

Error: stack overflow.
2015-09-11 17:38:28 -04:00
Edward Thomson
d53c888069 iterator: saner pathlist matching for idx iterator
Some nicer refactoring for index iteration walks.

The index iterator doesn't binary search through the pathlist space,
since it lacks directory entries, and would have to binary search
each index entry and all its parents (eg, when presented with an index
entry of `foo/bar/file.c`, you would have to look in the pathlist for
`foo/bar/file.c`, `foo/bar` and `foo`).  Since the index entries and the
pathlist are both nicely sorted, we walk the index entries in lockstep
with the pathlist like we do for other iteration/diff/merge walks.
2015-08-31 11:48:06 -04:00
Edward Thomson
1af84271dd tree_iterator: use a pathlist 2015-08-30 18:55:18 -04:00
Edward Thomson
4a0dbeb0d3 diff: use new iterator pathlist handling
When using literal pathspecs in diff with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`
turn on the faster iterator pathlist handling.

Updates iterator pathspecs to include directory prefixes (eg, `foo/`)
for compatibility with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`.
2015-08-30 17:06:26 -04:00
Edward Thomson
6c9352bf30 iterator: sort subdirs properly with pathlist
When given a pathlist, don't assume that directories sort before
files.  Walk through any list of entries sorting before us to make
sure that we've exhausted all entries that *aren't* directories.

Eg, if we're searching for 'foo/bar', and we have a 'foo.c', keep
advancing the pathlist to keep looking for an entry prefixed with
'foo/'.
2015-08-28 18:40:02 -04:00
Edward Thomson
ef206124de Move filelist into the iterator handling itself. 2015-08-28 18:39:52 -04:00
Edward Thomson
ed1c64464a iterator: use an options struct instead of args 2015-08-28 18:39:47 -04:00
Edward Thomson
ef4857c2b3 errors: tighten up git_error_state OOMs a bit more
When an error state is an OOM, make sure that we treat is specially
and do not try to free it.
2015-08-03 19:44:51 -04:00
Carlos Martín Nieto
12786e0f7c iterator: skip over errors in diriter init
An error here will typically mean that the directory was removed between
the time we iterated the parent and the time we wanted to visit it in
which case we should ignore it.

Other kinds of errors such as permissions (or transient errors) also
better dealt with by pretending we didn't see it.
2015-07-26 17:19:22 +02:00
Edward Thomson
dd6b24b19a iterator_walk: cast away constness for free 2015-07-02 10:36:15 -05:00
Edward Thomson
ded4ccab01 iterator_walk: drop unused variable 2015-06-29 21:23:09 +00:00
Carlos Martín Nieto
24fa21f38e index, iterator, fetchhead: plug leaks 2015-06-26 19:00:33 +02:00
Edward Thomson
8960dc1ec6 iterator: provide git_iterator_walk
Provide `git_iterator_walk` to walk each iterator in lockstep,
returning each iterator's idea of the contents of the next path.
2015-06-25 18:34:37 -04:00
Carlos Martín Nieto
ff47537557 diff: check files with the same or newer timestamps
When a file on the workdir has the same or a newer timestamp than the
index, we need to perform a full check of the contents, as the update of
the file may have happened just after we wrote the index.

The iterator changes are such that we can reach inside the workdir
iterator from the diff, though it may be better to have an accessor
instead of moving these structs into the header.
2015-06-22 12:47:30 +02:00
Carlos Martín Nieto
82a7a24cf4 Merge pull request #3165 from ethomson/downcase
Downcase
2015-06-08 15:22:01 +02:00
Edward Thomson
75a4636f50 git__tolower: a tolower() that isn't dumb
Some brain damaged tolower() implementations appear to want to
take the locale into account, and this may require taking some
insanely aggressive lock on the locale and slowing down what should
be the most trivial of trivial calls for people who just want to
downcase ASCII.
2015-05-29 18:16:46 -04:00
Edward Thomson
9f545b9d71 introduce git_index_entry_is_conflict
It's not always obvious the mapping between stage level and
conflict-ness.  More importantly, this can lead otherwise sane
people to write constructs like `if (!git_index_entry_stage(entry))`,
which (while technically correct) is unreadable.

Provide a nice method to help avoid such messy thinking.
2015-05-28 09:47:31 -04:00
Edward Thomson
aa3af01db0 index iterator: optionally include conflicts 2015-05-28 09:43:41 -04:00
Edward Thomson
f63a1b729b git_path_diriter: use FindFirstFile in win32
Using FindFirstFile and FindNextFile in win32 allows us to
use the directory information that is returned, instead of
us having to get the file attributes all over again, which
is a distinct cost savings on win32.
2015-05-01 12:31:40 -04:00
Edward Thomson
5c387b6c5a git_path_diriter: next shouldn't take path ptr
The _next method shouldn't take a path pointer (and a path_len
pointer) as 100% of current users use the full path and ignore
the filename.

Plus let's add some docs and a unit test.
2015-05-01 12:31:29 -04:00
Edward Thomson
7ef005f165 git_path_dirload_with_stat: moved to fs_iterator 2015-05-01 12:31:26 -04:00
Edward Thomson
35c1d20750 git_win32_path_dirload_with_stat: removed 2015-05-01 12:31:14 -04:00
J Wyman
1920ee4ef6 Improvements to status performance on Windows.
Changed win32/path_w32.c to utilize NTFS' FindFirst..FindNext data instead of doing an lstat per file. Avoiding unnecessary directory opens and file scans reduces IO, improving overall performance. Effect is magnified due to NTFS being a kernel mode file system (as opposed to user mode).
2015-04-28 14:25:02 -04:00
J Wyman
4c09e19a37 Improvements to ignore performance on Windows.
Minimizing the number directory and file opens, minimizes the amount of IO thus reducing the overall cost of performing ignore operations.
2015-04-28 14:24:58 -04:00
Edward Thomson
f1453c59b2 Make our overflow check look more like gcc/clang's
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler instrinsics on platforms
that support it.  This means dropping the ability to pass `NULL` as
an out parameter.

As a result, the macros also get updated to reflect this as well.
2015-02-13 09:27:33 -05:00
Edward Thomson
392702ee2c allocations: test for overflow of requested size
Introduce some helper macros to test integer overflow from arithmetic
and set error message appropriately.
2015-02-12 22:54:46 -05:00
Carlos Martín Nieto
f7fcb18f8a Plug leaks
Valgrind is now clean except for libssl and libgcrypt.
2014-11-23 15:51:31 +01:00
Carlos Martín Nieto
62a617dc68 iterator: submodules are determined by an index or tree
We cannot know from looking at .gitmodules whether a directory is a
submodule or not. We need the index or tree we are comparing against to
tell us. Otherwise we have to assume the entry in .gitmodules is stale
or otherwise invalid.

Thus we pass the index of the repository into the workdir iterator, even
if we do not want to compare against it. This follows what git does,
which even for `git diff <tree>`, it will consider staged submodules as
such.
2014-11-07 08:33:27 +01:00
Russell Belfer
f554611a27 Improve checks for ignore containment
The diff code was using an "ignored_prefix" directory to track if
a parent directory was ignored that contained untracked files
alongside tracked files. Unfortunately, when negative ignore rules
were used for directories inside ignored parents, the wrong rules
were applied to untracked files inside the negatively ignored
child directories.

This commit moves the logic for ignore containment into the workdir
iterator (which is a better place for it), so the ignored-ness of
a directory is contained in the frame stack during traversal.  This
allows a child directory to override with a negative ignore and yet
still restore the ignored state of the parent when we traverse out
of the child.

Along with this, there are some problems with "directory only"
ignore rules on container directories.  Given "a/*" and "!a/b/c/"
(where the second rule is a directory rule but the first rule is
just a generic prefix rule), then the directory only constraint
was having "a/b/c/d/file" match the first rule and not the second.
This was fixed by having ignore directory-only rules test a rule
against the prefix of a file with LEADINGDIR enabled.

Lastly, spot checks for ignores using `git_ignore_path_is_ignored`
were tested from the top directory down to the bottom to deal with
the containment problem, but this is wrong. We have to test bottom
to top so that negative subdirectory rules will be checked before
parent ignore rules.

This does change the behavior of some existing tests, but it seems
only to bring us more in line with core Git, so I think those
changes are acceptable.
2014-05-06 12:41:26 -07:00
Russell Belfer
9c8ed49997 Remove trace / add git_diff_perfdata struct + api 2014-05-02 09:21:33 -07:00
Russell Belfer
b23b112dfe Add payloads, bitmaps to trace API
This is a proposed adjustment to the trace APIs.  This makes the
trace levels into a bitmask so that they can be selectively enabled
and adds a callback-level payload, plus a message-level payload.

This makes it easier for me to a GIT_TRACE_PERF callbacks that
are simply bypassed if the PERF level is not set.
2014-05-02 09:21:33 -07:00
Russell Belfer
cd424ad551 Add GIT_STATUS_OPT_UPDATE_INDEX and use trace API
This adds an option to refresh the stat cache while generating
status.  It also rips out the GIT_PERF stuff I had an makes use
of the trace API to keep statistics about what happens during diff.
2014-05-02 09:21:33 -07:00
Russell Belfer
8ef4e11a76 Skip diff oid calc when size definitely changed
When we think the stat cache in the index seems valid and the size
or mode of a file has definitely changed, then don't bother trying
to recalculate the OID of the workdir bits to confirm that it is
modified - just accept that it is modified.

This can result in files that show as modified with no actual diff,
but the behavior actually appears to match Git on the command line.

This also includes a minor optimization to not perform a submodule
lookup on the ".git" directory itself.
2014-05-02 09:21:32 -07:00
Russell Belfer
240f4af321 Add build option for diff internal statistics 2014-05-02 09:21:32 -07:00
Russell Belfer
a409acefbb Handle explicitly ignored dir slightly differently
When considering status of untracked directories, if we find an
explicitly ignored item, even if it is a directory, treat the
parent as an IGNORED item.  It was accidentally being treated as
an EMPTY item because we were not looking into the ignored subdir.
2014-04-24 11:59:50 -07:00
Russell Belfer
219c89d19d Treat ignored, empty, and untracked dirs different
In the iterator, distinguish between ignores and empty directories
so that diff and status can ignore empty directories, but checkout
and stash can treat them as untracked items.
2014-04-23 16:28:45 -07:00
Russell Belfer
37da368545 Make checkout match diff for untracked/ignored dir
When diff finds an untracked directory, it emulates Git behavior
by looking inside the directory to see if there are any untracked
items inside it. If there are only ignored items inside the dir,
then diff considers it ignored, even if there is no direct ignore
rule for it.

Checkout was not copying this behavior - when it found an untracked
directory, it just treated it as untracked.  Unfortunately, when
combined with GIT_CHECKOUT_REMOVE_UNTRACKED, this made is seem that
checkout (and stash, which uses checkout) was removing ignored
items when you had only asked it to remove untracked ones.

This commit moves the logic for advancing past an untracked dir
while scanning for non-ignored items into an iterator helper fn,
and uses that for both diff and checkout.
2014-04-22 21:51:54 -07:00