Commit Graph

103 Commits

Author SHA1 Message Date
Russell Belfer
abfed59c27 Clean up one other mode_t assertion 2013-09-04 16:23:00 -07:00
Russell Belfer
cae5293854 Fix resolving relative windows network paths 2013-09-03 14:00:27 -07:00
Russell Belfer
0d1af399e9 don't use inline in tests for win32 2013-09-03 12:33:34 -07:00
Nikolai Vladimirov
6d9a6c5cec path: properly resolve relative paths 2013-09-03 20:45:53 +03:00
Vicent Martí
dbecec37a7 Merge pull request #1805 from libgit2/threading-packed-load
Thread safety for the refdb_fs
2013-08-28 09:38:14 -07:00
Vicent Martí
1ef05e3f0e Merge pull request #1803 from libgit2/ntk/topic/even_more_lenient_remote_parsing
Even more lenient remote parsing
2013-08-28 06:05:50 -07:00
Edward Thomson
1ff3a09415 Improve win32 version check, no ipv6 tests on XP 2013-08-27 19:44:35 -05:00
nulltoken
191adce875 vector: Teach git_vector_uniq() to free while deduplicating 2013-08-27 20:14:07 +02:00
Fraser Tweedale
9d85f00722 fix tests on FreeBSD
238b761 introduced a test for posix behaviour, but on FreeBSD some
of the structs and constants used aren't defined in <arpa/inet.h>.
Include the appropriate headers to get the tests working again on
FreeBSD.
2013-08-24 17:39:15 +10:00
Russell Belfer
972bb689c4 Add SRWLock implementation of rwlocks for Win32 2013-08-22 14:10:56 -07:00
Russell Belfer
8d9a85d43a Convert sortedcache to use rwlock
This is the first use we have of pthread_rwlock_t in libgit2.
Hopefully it won't cause any serious portability problems.
2013-08-22 11:40:53 -07:00
Russell Belfer
a4977169e1 Add sortedcache APIs to lookup index and remove
This adds two other APIs that I need to the sortedcache type.
2013-08-21 14:09:38 -07:00
Russell Belfer
0b7cdc0263 Add sorted cache data type
This adds a convenient new data type for caching the contents of a
file in memory when each item in that file corresponds to a name
and you need to both be able to lookup items by name and iterate
over them in some sorted order.  The new data type has locks in
place to manage usage in a threaded environment.
2013-08-20 16:14:24 -07:00
Edward Thomson
c0b01b7572 Skip UTF-8 BOM in binary detection
When a git_buf contains a UTF-8 BOM, the three bytes comprising
that BOM are treated as unprintable characters.  For a small git_buf,
the three BOM characters overwhelm the printable characters.  This
is problematic when trying to check out a small file as the CR/LF
filtering will not apply.
2013-08-19 18:46:26 -05:00
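The problem described in the commit above can be illustrated with a small sketch. It assumes a simplified heuristic in which bytes outside printable ASCII (other than tab, LF and CR) count as unprintable and a buffer is binary when unprintable bytes outnumber printable ones by more than a 1:128 ratio. The names `utf8_bom_len` and `looks_binary` and the exact byte classes are assumptions made for illustration, not the library's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns 3 if the buffer starts with the UTF-8 BOM (EF BB BF), else 0. */
static size_t utf8_bom_len(const unsigned char *data, size_t len)
{
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return 3;
    return 0;
}

/* Simplified binary check (an assumption, see the text above): any NUL
 * byte means binary; otherwise compare printable and unprintable counts.
 * The scan starts after a leading BOM so those three bytes are ignored. */
static bool looks_binary(const unsigned char *data, size_t len)
{
    size_t i, printable = 0, nonprintable = 0;

    for (i = utf8_bom_len(data, len); i < len; i++) {
        unsigned char c = data[i];

        if (c == 0)
            return true;
        if ((c >= 0x20 && c < 0x7F) || c == '\t' || c == '\n' || c == '\r')
            printable++;
        else
            nonprintable++;
    }
    return (printable >> 7) < nonprintable;
}
```

Under these assumptions, a six-byte buffer holding a BOM followed by `hi\n` would have three unprintable and three printable bytes if the scan started at offset 0, and would be classified as binary; skipping the BOM leaves only printable bytes.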
Edward Thomson
238b761491 Fix p_inet_pton on windows
p_inet_pton on Windows should set errno properly for callers.
Rewrite p_inet_pton to handle error cases correctly and add
test cases to exercise this function.
2013-08-19 17:21:35 -05:00
Russell Belfer
d730d3f4f0 Major rename detection changes
After doing further profiling, I found that a lot of time was
being spent attempting to insert hashes into the file hash
signature when using the rolling hash because the rolling hash
approach generates a hash per byte of the file instead of one
per run/line of data.

To optimize this, I decided to convert back to a run-based file
signature algorithm which would be more like core Git.

After changing this, a number of the existing tests started to
fail.  In some cases, this appears to have been because the test
was coded to be too specific to the particular results of the file
similarity metric and in some cases there appear to have been bugs
in the core rename detection code where only by the coincidence
of the file similarity scoring were the expected results being
generated.

This renames all the variables in the core rename detection code
to be more consistent and hopefully easier to follow which made it
a bit easier to reason about the behavior of that code and fix the
problems that I was seeing.  I think it's in better shape now.

There are a couple of tests now that attempt to stress test the
rename detection code and they are quite slow.  Most of the time
is spent setting up the test data on disk and in the index.  When
we roll out performance improvements for index insertion, it
should also speed up these tests I hope.
2013-07-31 16:40:42 -07:00
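To make the cost difference in the commit above concrete: a rolling hash yields one hash per byte, while a run-based signature yields one hash per line. A minimal sketch of the per-line variant follows. FNV-1a is used here purely as a stand-in hash, and `line_hashes` is a hypothetical name; neither is taken from the libgit2 sources.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET 2166136261u
#define FNV_PRIME  16777619u

/* Hypothetical sketch: writes one 32-bit hash per line of data into out
 * (at most cap entries) and returns how many hashes were written. A final
 * line without a trailing '\n' is hashed as well. */
static size_t line_hashes(const char *data, size_t len,
                          uint32_t *out, size_t cap)
{
    size_t i, n = 0;
    uint32_t h = FNV_OFFSET;
    int pending = 0;

    for (i = 0; i < len && n < cap; i++) {
        h = (h ^ (unsigned char)data[i]) * FNV_PRIME;
        pending = 1;
        if (data[i] == '\n') {
            out[n++] = h;       /* end of run: emit and restart the hash */
            h = FNV_OFFSET;
            pending = 0;
        }
    }
    if (pending && n < cap)
        out[n++] = h;
    return n;
}
```

For a file of N bytes and L lines this produces L hashes to insert into the signature rather than roughly N, which is the saving the commit message describes.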
Russell Belfer
6fc5a58197 Basic bit vector
This is a simple bit vector object that is not resizable after
the initial allocation but can be of arbitrary size.  It will
keep the bit vector entirely on the stack for vectors 64 bits
or less, and will allocate the vector on the heap for larger
sizes.  The API is uniform regardless of storage location.

This is very basic right now and all the APIs are inline functions,
but it is useful for storing an array of boolean values.
2013-07-10 20:50:33 +02:00
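The storage scheme described in the commit above (inline storage up to 64 bits, heap storage beyond that, one API for both) might look something like the sketch below. The type and function names (`bitvec`, `bitvec_init`, and so on) are hypothetical and are not the names used in libgit2; bit indexes are not bounds-checked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Fixed-size bit vector: sized once at init, never resized. */
typedef struct {
    size_t nbits;
    uint64_t inline_bits;  /* storage when nbits <= 64 */
    uint64_t *heap_words;  /* storage when nbits > 64, otherwise NULL */
} bitvec;

/* Returns 0 on success, -1 if the heap allocation fails. */
static int bitvec_init(bitvec *bv, size_t nbits)
{
    bv->nbits = nbits;
    bv->inline_bits = 0;
    bv->heap_words = NULL;

    if (nbits > 64) {
        bv->heap_words = calloc((nbits + 63) / 64, sizeof(uint64_t));
        if (!bv->heap_words)
            return -1;
    }
    return 0;
}

/* Picks the word holding the given bit, wherever the storage lives. */
static uint64_t *bitvec_word(bitvec *bv, size_t bit)
{
    return bv->heap_words ? &bv->heap_words[bit / 64] : &bv->inline_bits;
}

static void bitvec_set(bitvec *bv, size_t bit, bool on)
{
    uint64_t mask = (uint64_t)1 << (bit % 64);
    uint64_t *word = bitvec_word(bv, bit);

    if (on)
        *word |= mask;
    else
        *word &= ~mask;
}

static bool bitvec_get(bitvec *bv, size_t bit)
{
    return ((*bitvec_word(bv, bit) >> (bit % 64)) & 1) != 0;
}

/* Safe to call for inline vectors too, since free(NULL) is a no-op. */
static void bitvec_free(bitvec *bv)
{
    free(bv->heap_words);
    bv->heap_words = NULL;
}
```

Callers use `bitvec_set` and `bitvec_get` the same way for a 10-bit and a 200-bit vector; only `bitvec_init` decides where the bits live.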
Russell Belfer
290e147985 Add GIT_CAP_SSH if library was built with SSH
This also adds a test that actually calls git_libgit2_capabilities
and git_libgit2_version.
2013-07-09 16:17:41 -07:00
Edward Thomson
e3b4a47c1e git__strcasesort_cmp: strcasecmp sorting rules but requires strict equality 2013-06-17 10:03:15 -07:00
Edward Thomson
2d160ef782 allow (ignore) bare slash in gitignore 2013-05-29 16:26:25 -05:00
Russell Belfer
aa8f010120 Add git_oid_strcmp and use it for git_oid_streq
Add a new git_oid_strcmp that compares a string OID with a hex
oid for sort order, and then reimplement git_oid_streq using it.
This actually should speed up git_oid_streq because it only reads
as far into the string as it needs to, whereas previously it would
convert the whole string into an OID and then use git_oid_cmp.
2013-04-29 08:59:46 -07:00
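The early-exit comparison described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the library's `git_oid_strcmp`: the names `oid_strcmp` and `hex_value` are hypothetical, the OID is taken to be 20 raw bytes, and a malformed or too-short string is reported as -1 (the real function may treat short strings differently).

```c
#include <stddef.h>

#define OID_RAWSZ 20  /* raw size of a SHA-1 object id */

/* Value of a hex digit, or -1 for anything else (including '\0'). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Compares a raw OID with a hex string byte by byte and stops at the
 * first difference, so the rest of the string is never read or parsed.
 * Returns <0, 0 or >0 in memcmp order; -1 for malformed or short input.
 * Characters after the first 40 are ignored in this sketch. */
static int oid_strcmp(const unsigned char *oid, const char *str)
{
    size_t i;

    for (i = 0; i < OID_RAWSZ; i++) {
        int hi, lo, byte;

        hi = hex_value(str[2 * i]);
        if (hi < 0)
            return -1;
        lo = hex_value(str[2 * i + 1]);
        if (lo < 0)
            return -1;

        byte = (hi << 4) | lo;
        if (oid[i] != byte)
            return (int)oid[i] - byte;
    }
    return 0;
}
```

An equality test in the spirit of `git_oid_streq` then reduces to checking whether this comparison returns 0.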
Russell Belfer
8564a0224a Fix fragile git_oid_ncmp
git_oid_ncmp was making some assumptions about the length of
the data - this shifts the check to the top of the loop so it
will work more robustly, limits the max, and adds some tests
to verify the functionality.
2013-04-29 08:51:24 -07:00
Russell Belfer
917f60c50b Add tests for oidmap and new cache with threading
This adds some basic tests for the oidmap just to make sure that
collisions, etc. are dealt with correctly.

This also adds some tests for the new caching that check if items
are inserted (or not inserted) properly into the cache, and that
the cache can hold up in a multithreaded environment without error.
2013-04-22 16:50:51 +02:00
Vicent Marti
5df184241a lol this worked first try wtf 2013-04-22 16:50:50 +02:00
Vicent Martí
0b061b5bfa Merge pull request #1436 from schu/opts-cache-size
opts: allow configuration of odb cache size
2013-03-26 11:05:57 -07:00
Russell Belfer
3658e81e34 Move crlf conversion into buf_text
This adds crlf/lf conversion functions into buf_text with more
efficient implementations that bypass the high level buffer
functions.  They attempt to minimize the number of reallocations
done and they directly write the buffer data as needed if they
know that there is enough memory allocated to memcpy data.

Tests are added for these new functions.  The crlf.c code is
updated to use the new functions.

Removed the include of buf_text.h from filter.h and just include
it more narrowly in the places that need it.
2013-03-25 14:20:07 -07:00
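As a rough illustration of why the conversion in the commit above can avoid repeated reallocation: CRLF to LF output is never longer than its input, so a destination buffer of the input's size is always enough. The function name `crlf_to_lf` is hypothetical and the sketch is not the buf_text implementation; it leaves a lone CR untouched.

```c
#include <stddef.h>

/* Copies src to dst with every "\r\n" pair collapsed to "\n".
 * dst must have room for at least len bytes. Returns bytes written.
 * A '\r' that is not followed by '\n' is copied unchanged. */
static size_t crlf_to_lf(char *dst, const char *src, size_t len)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;  /* drop the CR; the LF is copied on the next pass */
        dst[out++] = src[i];
    }
    return out;
}
```

The reverse direction can grow the data by at most one byte per LF, so counting the LFs first would likewise allow a single allocation up front.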
Vicent Marti
13640d1bb8 oid: Do not parse OIDs longer than 40 2013-03-25 21:39:11 +01:00
Michael Schubert
f5e28202cb opts: allow configuration of odb cache size
Currently, the odb cache has a fixed size of 128 slots as defined by
GIT_DEFAULT_CACHE_SIZE. Allow users to set the size of the cache via
git_libgit2_opts().

Fixes #1035.
2013-03-25 15:45:56 +01:00
Xavier L.
b3c174835b Clarified string value 2013-03-21 14:50:28 -03:00
Xavier L
7e527ca700 Added test case for new function 2013-03-21 12:16:31 -04:00
Russell Belfer
324602514f Fixes and cleanups
Get rid of some dead code, tighten things up a bit, and fix a bug
with core::env test.
2013-03-18 15:54:35 -07:00
Russell Belfer
41954a49c1 Switch search paths to classic delimited strings
This switches the APIs for setting and getting the global/system
search paths from using git_strarray to using a simple string with
GIT_PATH_LIST_SEPARATOR delimited paths, just as the environment
PATH variable would contain.  This makes it simpler to get and set
the value.

I also added code to expand "$PATH" when setting a new value to
embed the old value of the path.  This means that I no longer
require separate actions to PREPEND to the value.
2013-03-18 14:19:35 -07:00
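The `$PATH` expansion mentioned in the commit above could be sketched like this. The helper name `expand_search_path` is hypothetical, only the first `$PATH` occurrence is replaced, and the example values assume `:` as the list separator; none of this is taken from the libgit2 sources.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of new_value in which the first "$PATH"
 * token is replaced by old_value. If there is no token, the result is a
 * plain copy. Returns NULL on allocation failure; the caller frees. */
static char *expand_search_path(const char *new_value, const char *old_value)
{
    static const char token[] = "$PATH";
    const char *hit = strstr(new_value, token);
    const char *suffix;
    size_t prefix_len, old_len, suffix_len;
    char *out;

    if (!hit) {
        size_t n = strlen(new_value) + 1;
        out = malloc(n);
        if (out)
            memcpy(out, new_value, n);
        return out;
    }

    prefix_len = (size_t)(hit - new_value);
    old_len = strlen(old_value);
    suffix = hit + sizeof(token) - 1;
    suffix_len = strlen(suffix);

    out = malloc(prefix_len + old_len + suffix_len + 1);
    if (!out)
        return NULL;

    memcpy(out, new_value, prefix_len);
    memcpy(out + prefix_len, old_value, old_len);
    memcpy(out + prefix_len + old_len, suffix, suffix_len + 1); /* with NUL */
    return out;
}
```

With this shape, prepending is just setting the value to something like `/opt/etc:$PATH`, and appending is `$PATH:/opt/etc`, so no separate prepend operation is needed.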
Russell Belfer
5540d9477e Implement global/system file search paths
The goal of this work is to expose the search logic for "global",
"system", and "xdg" files through the git_libgit2_opts() interface.

Behind the scenes, I changed the logic for finding files to have a
notion of a git_strarray that represents a search path and to store
a separate search path for each of the three tiers of config file.
For each tier, I implemented a function to initialize it to default
values (generally based on environment variables), and then general
interfaces to get it, set it, reset it, and prepend new directories
to it.

Next, I exposed these interfaces through the git_libgit2_opts
interface, reusing the GIT_CONFIG_LEVEL_SYSTEM, etc., constants
for the user to control which search path they were modifying.
There are alternative designs for the opts interface / argument
ordering, so I'm putting this phase out for discussion.

Additionally, I ended up doing a little bit of clean up regarding
attr.h and attr_file.h, adding a new attrcache.h so the other two
files wouldn't have to be included in so many places.
2013-03-15 16:39:00 -07:00
Russell Belfer
0c46863384 Improved tree iterator internals
This updates the tree iterator internals to be more efficient.

The tree_iterator_entry objects are now kept as pointers that are
allocated from a git_pool, so that we may use git__tsort_r for
sorting (which is better than qsort, given that the tree is
likely mostly ordered already).

Those tree_iterator_entry objects now keep direct pointers to the
data they refer to instead of keeping indirect index values.  This
simplifies a lot of the data structure traversal code.

This also adds bsearch to find the start item position for range-
limited tree iterators, and is more explicit about using
git_path_cmp instead of reimplementing it.  The git_path_cmp
changed a bit to make it easier for tree_iterators to use it (but
it was barely being used previously, so not a big deal).

This adds a git_pool_free_array function that efficiently frees a
list of pool allocated pointers (which the tree_iterator keeps).
Also, added new tests for the git_pool free list functionality
that was not previously being tested (or used).
2013-03-14 13:40:15 -07:00
Russell Belfer
9bc8be3d7e Refine pluggable similarity API
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API.  In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.

Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
2013-02-20 15:09:41 -08:00
Russell Belfer
aa6432604e More tests of file signatures with whitespace opts
Seems to be working pretty well...
2013-02-20 15:09:41 -08:00
Russell Belfer
5e5848eb15 Change similarity metric to sampled hashes
This moves the similarity metric code out of buf_text and into a
new file.  Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes.  In theory, that should be
sufficient samples to give a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
2013-02-20 15:09:40 -08:00
Russell Belfer
9c454b007b Initial implementation of similarity scoring algo
This adds a new `git_buf_text_hashsig` type and functions to
generate these hash signatures and compare them to give a
similarity score.  This can be plugged into diff similarity
scoring.
2013-02-20 15:09:40 -08:00
Russell Belfer
56543a609a Clear up warnings from cppcheck
The cppcheck static analyzer generates warnings for a bunch of
places in the libgit2 code base.  All the ones fixed in this
commit are actually false positives, but I've reorganized the
code to hopefully make it easier for static analysis tools to
correctly understand the structure.  I wouldn't do this if I
felt like it was making the code harder to read or worse for
humans, but in this case, these fixes don't seem too bad and will
hopefully make it easier for better analysis tools to get at any
real issues.
2013-02-15 16:02:45 -08:00
Philip Kelley
8c29dca6c3 Fix some incorrect MSVC #ifdef's. Fixes #1305 2013-02-11 09:25:57 -05:00
Russell Belfer
17c92beaca Test buf join with NULL behavior explicitly 2013-01-29 12:13:24 -08:00
Vicent Marti
0d52cb4aea opts: Some basic tests 2013-01-24 00:09:55 +01:00
Russell Belfer
f63d0ee9fc Move all non-ascii test data to raw hex
This takes all of the characters in core::env and makes them use
hex sequences instead of keeping tricky character data inline in
the test.
2013-01-17 15:47:10 -08:00
Russell Belfer
0d65acade8 Match binary file check of core git in diff
Core git just looks for NUL bytes in files when deciding about
is-binary inside diff (although it uses a better algorithm in
checkout, when deciding if CRLF conversion should be done).
Libgit2 was using the better algorithm in both places, but that
is causing some confusion.  For now, this makes diff just look
for NUL bytes to decide if a file is binary by content in diff.
2013-01-11 11:24:26 -08:00
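The simpler check described in the commit above amounts to scanning for a NUL byte. A minimal sketch, with the hypothetical name `contains_nul` and no limit on how much of the buffer is scanned (an assumption; a real implementation may only inspect a leading portion):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Treats content as binary if any byte in the buffer is NUL. */
static bool contains_nul(const char *data, size_t len)
{
    return len > 0 && memchr(data, '\0', len) != NULL;
}
```

Unlike a printable-ratio heuristic, this never classifies text with unusual characters as binary, which is the kind of disagreement between diff and checkout that the commit message refers to.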
Russell Belfer
b8a1ea7cf9 Fix core::env cleanup code
Mark fake home directories that failed to be created, so we won't
try to remove them and have cleanup just use p_rmdir.
2013-01-03 11:04:03 -08:00
Ben Straub
600d8dbf6d Move test cleanup into cleanup functions 2013-01-03 09:10:38 -08:00
Ben Straub
6fef1ab344 Tests should clean up after themselves 2013-01-03 07:47:51 -08:00
nulltoken
50a762a563 path: Teach UNC paths to git_path_dirname_r()
Fix libgit2/libgit2sharp#256
2012-12-26 23:07:25 +01:00
nulltoken
34b6f05f39 path: enhance git_path_dirname_r() test coverage 2012-12-26 11:59:07 +01:00
Russell Belfer
7bf87ab698 Consolidate text buffer functions
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc).  This groups all those functions together into a
new file and converts the code to use that.

This has two enhancements to existing functionality.  The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).
2012-11-28 09:58:48 -08:00