This begins the process of exposing git_filter objects to the
public API. This includes:
* new public type and API for `git_buffer` through which an
allocated buffer can be passed to the user
* new API `git_blob_filtered_content`
* make the git_filter type and GIT_FILTER_TO_... constants public
When a git_buf contains a UTF-8 BOM, the three bytes comprising
that BOM are treated as unprintable characters. For a small git_buf,
the three BOM characters overwhelm the printable characters. This
is problematic when trying to check out a small file as the CR/LF
filtering will not apply.
This adds crlf/lf conversion functions into buf_text with more
efficient implementations that bypass the high level buffer
functions. They attempt to minimize the number of reallocations
done and they directly write the buffer data as needed if they
know that there is enough memory allocated to memcpy data.
Tests are added for these new functions. The crlf.c code is
updated to use the new functions.
Removed the include of buf_text.h from filter.h and just include
it more narrowly in the places that need it.
Core git just looks for NUL bytes in files when deciding about
is-binary inside diff (although it uses a better algorithm in
checkout, when deciding if CRLF conversion should be done).
Libgit2 was using the better algorithm in both places, but that
is causing some confusion. For now, this makes diff just look
for NUL bytes to decide if a file is binary by content in diff.
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc). This groups all those functions together into a
new file and converts the code to use that.
This has two enhancements to existing functionality. The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).