When duplicating a `struct git_tree_entry` with
`git_tree_entry_dup`, the resulting structure is not allocated
inside a memory pool. As we do a 1:1 copy of the original struct,
though, we also copy the `pooled` field, which is set to `true`
for pooled entries. This results in a huge memory leak as we
never free tree entries that were duplicated from a pooled
tree entry.
Fix this by marking the newly duplicated entry as un-pooled.
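
The fix can be illustrated with a minimal sketch. The `struct entry` type and the `entry_dup` and `entry_free` helpers below are hypothetical stand-ins, not the library's actual definitions; only the handling of the `pooled` flag mirrors the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tree entry; only `pooled` matters here. */
struct entry {
    bool pooled;            /* true when the memory is owned by a pool */
    unsigned short mode;
    char name[32];
};

/* Duplicate an entry onto the heap. The copy is owned by the caller,
 * so it must not inherit the `pooled` flag of the source. */
struct entry *entry_dup(const struct entry *src)
{
    struct entry *copy;

    assert(src != NULL);

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;

    memcpy(copy, src, sizeof(*copy));   /* 1:1 copy, including `pooled` */
    copy->pooled = false;               /* the fix: mark the copy as un-pooled */
    return copy;
}

/* Free an entry unless its memory is owned by a pool.
 * Returns true if the entry was actually released. */
bool entry_free(struct entry *e)
{
    if (e == NULL || e->pooled)
        return false;
    free(e);
    return true;
}
```

Without the `copy->pooled = false` line, `entry_free` would skip every copy made from a pooled entry, which is the leak described above.
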
When formatting a patch as email, we do not include the commit's
message in the formatted patch output. Include it and add a
test that verifies the behavior.
It is already possible to get a commit's summary with the
`git_commit_summary` function. It is not possible, however, to get
the remaining part of the commit message, that is, the commit
message's body.
Fix this by introducing a new function `git_commit_body`.
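
As a rough illustration of the summary/body split, the sketch below locates the body of a raw message string. The helper `message_body` is hypothetical and is not the implementation of `git_commit_body`; it assumes the summary ends at the first blank line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the body of `message`,
 * or NULL if the message consists of a summary only.
 * Assumption: the summary ends at the first blank line. */
const char *message_body(const char *message)
{
    const char *p;

    assert(message != NULL);

    /* Find the blank line that terminates the summary. */
    p = strstr(message, "\n\n");
    if (p == NULL)
        return NULL;

    /* Skip the newlines separating summary and body. */
    while (*p == '\n')
        p++;

    return *p != '\0' ? p : NULL;
}
```
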
The `git_blame__entry` struct keeps track of line counts with
`int` fields. Since `int` is only guaranteed to be at least 16
bits wide, we may overflow on certain platforms when line counts exceed
2^15 - 1.
Fix this by instead storing line counts in `size_t`.
It is not unreasonable to have versioned files with a line count
exceeding 2^16. Upon blaming such files we fail to correctly keep
track of the lines as `git_blame_hunk` stores them in `uint16_t`
fields.
Fix this by converting the line fields of `git_blame_hunk` to
`size_t`. Add a test to verify the behavior.
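
The truncation is easy to reproduce in isolation. The two struct layouts and the `store_*` helpers below are hypothetical and reduced to a single field; they only show what happens when a large line count is stored in a `uint16_t` compared with a `size_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The comparison only makes sense if size_t is the wider type. */
static_assert(sizeof(size_t) > sizeof(uint16_t),
              "size_t must be wider than uint16_t");

/* Hypothetical hunk layouts; only the line-count field is modelled. */
struct hunk_narrow { uint16_t lines_in_hunk; };
struct hunk_wide   { size_t   lines_in_hunk; };

/* Store a line count in the narrow field and read it back.
 * Values above 65535 are reduced modulo 2^16. */
size_t store_narrow(size_t lines)
{
    struct hunk_narrow h;
    h.lines_in_hunk = (uint16_t)lines;
    return h.lines_in_hunk;
}

/* Store a line count in the wide field and read it back unchanged. */
size_t store_wide(size_t lines)
{
    struct hunk_wide h;
    h.lines_in_hunk = lines;
    return h.lines_in_hunk;
}
```

For a file with 70000 lines, the narrow field reads back 4464 (70000 - 65536), while the wide field preserves the count.
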
This reduces the size of the struct from 32 to 26 bytes, and leaves a
single padding byte at the end of the struct (which comes from the
zero-length array).
These are rather small allocations, so we end up spending a non-trivial
amount of time asking the OS for memory. Since these entries are tied to
the lifetime of their tree, we can give the tree a pool so we speed up
the allocations.
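
The idea can be sketched with a simple bump allocator. The `struct pool` type and the `pool_*` functions below are hypothetical and much simpler than a real pool implementation (a single fixed-size chunk, no growth); they only show why many small allocations become cheap when they share one lifetime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Round requests up to pointer size so later allocations stay
 * pointer-aligned (a simplification for this sketch). */
#define POOL_ALIGN sizeof(void *)

/* Hypothetical single-chunk pool. */
struct pool {
    char *base;
    size_t used;
    size_t capacity;
};

/* Ask the system allocator for memory once, up front. */
int pool_init(struct pool *p, size_t capacity)
{
    assert(p != NULL);
    p->base = malloc(capacity);
    p->used = 0;
    p->capacity = p->base != NULL ? capacity : 0;
    return p->base != NULL ? 0 : -1;
}

/* Hand out the next slice of the chunk; no system call, no bookkeeping. */
void *pool_alloc(struct pool *p, size_t size)
{
    size_t rounded = (size + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    void *ptr;

    assert(p != NULL);
    if (rounded < size || rounded > p->capacity - p->used)
        return NULL;

    ptr = p->base + p->used;
    p->used += rounded;
    return ptr;
}

/* Release every allocation at once, e.g. when the owning tree is freed. */
void pool_free(struct pool *p)
{
    assert(p != NULL);
    free(p->base);
    p->base = NULL;
    p->used = 0;
    p->capacity = 0;
}
```
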
We've already looked at the filename with `memchr()` and then used
`strlen()` to allocate the entry. We already know how much we have to
advance to get to the object id, so add the filename length instead of
looking at each byte again.
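
A sketch of the parsing step, assuming the usual raw tree entry layout of an octal mode, a space, a NUL-terminated filename and a 20-byte object id. The function `parse_entry` and the `OID_RAWSZ` constant are hypothetical names for this illustration, not the library's parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_RAWSZ 20    /* assumed raw object id size */

/* Parse one "<mode> <name>\0<oid>" entry from [buf, end).
 * Returns a pointer just past the entry, or NULL on malformed input. */
const char *parse_entry(const char *buf, const char *end,
                        const char **name, size_t *name_len,
                        const unsigned char **oid)
{
    const char *space, *nul;

    assert(buf != NULL && end != NULL && buf <= end);

    space = memchr(buf, ' ', (size_t)(end - buf));
    if (space == NULL)
        return NULL;

    *name = space + 1;
    nul = memchr(*name, '\0', (size_t)(end - *name));
    if (nul == NULL)
        return NULL;

    /* memchr() already found the end of the name; no strlen() needed. */
    *name_len = (size_t)(nul - *name);

    /* Advance by the known length (plus the NUL) to reach the object id. */
    buf = *name + *name_len + 1;
    if ((size_t)(end - buf) < OID_RAWSZ)
        return NULL;

    *oid = (const unsigned char *)buf;
    return buf + OID_RAWSZ;
}
```
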
When building a recursive merge base, allow conflicts to occur.
Use the file (with conflict markers) as the common ancestor.
The user has already seen and dealt with this conflict by virtue
of having a criss-cross merge. If they resolved this conflict
identically in both branches, then there will be no conflict in the
result. This is the best case scenario.
If they did not resolve the conflict identically in the two branches,
then we will generate a new conflict. If the user is simply using
standard conflict output then the results will be fairly sensible.
But if the user is using a mergetool or using diff3 output, then the
common ancestor will be a conflict file (itself with diff3 output,
haha!). This is quite terrible, but it matches git's behavior.
Use annotated commits to act as our virtual bases, instead of regular
commits, to avoid polluting the odb with virtual base commits and
trees. Instead, build an annotated commit with an index and pointers
to the commits that it was merged from.
When there are more than two common ancestors, continue merging the
virtual base with the additional common ancestors, effectively
octopus merging a new virtual base.
When examining the working directory and determining whether it's
up-to-date, only consider the nanoseconds in the index entry when
built with `GIT_USE_NSEC`. This prevents us from believing that
the working directory is always dirty when the index was originally
written with a git client that understands nsecs (like git 2.x).
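
The comparison can be sketched as follows. The `struct timestamp` type and `timestamp_matches` function are hypothetical; only the `GIT_USE_NSEC` build flag comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timestamp as stored in an index entry or read from disk. */
struct timestamp {
    int32_t seconds;
    uint32_t nanoseconds;
};

/* Decide whether the index entry and the working directory file carry
 * the same timestamp. Nanoseconds are compared only in builds that
 * define GIT_USE_NSEC; otherwise they are ignored. */
bool timestamp_matches(const struct timestamp *index,
                       const struct timestamp *workdir)
{
    assert(index != NULL && workdir != NULL);

    if (index->seconds != workdir->seconds)
        return false;

#ifdef GIT_USE_NSEC
    if (index->nanoseconds != workdir->nanoseconds)
        return false;
#endif

    return true;
}
```

In a build without `GIT_USE_NSEC`, an index entry written with non-zero nanoseconds still matches a file whose timestamp was read with second precision, so the file is not reported as modified.
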