The pair of `git_blob_create_frombuffer()` and
`git_blob_create_frombuffer_commit()` is meant to replace
`git_blob_create_fromchunks()` by providing a way for a user to write a
new blob when they want filtering or they do not know the size.
This approach allows the caller to retain control over when to add data
to this buffer and a more natural fit into higher-level language's own
stream abstractions instead of having to handle IO wait in the callback.
The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a
round multiple of usual page sizes and a value where most blobs seem
likely to be either going to be way below or way over that size. It's
also a round number of pages.
This implementation re-uses the helper we have from `_fromchunks()` so
we end up writing everything to disk, but hopefully more efficiently
than with a default filebuf. A later optimisation can be to avoid
writing the in-memory contents to disk, with some extra complexity.
We currently recommend using `git_buf_grow` in order to make a buffer
make an owned copy of the memory it points to. This is not behaviour we
should encourage, so remove this recommendation.
The function itself is not changed, as we need to remain compatible, but
it will be changed not to allow usage on borrowed buffers.
The callback to supply data chunks could return a negative value
to stop creation of the blob, but we were neither using GIT_EUSER
nor propagating the return value. This makes things use the new
behavior of returning the negative value back to the user.
This makes the git_buf struct that was used internally into an
externally available structure and eliminates the git_buffer.
As part of that, some of the special cases that arose with the
externally used git_buffer were blended into the git_buf, such as
being careful about git_buf objects that may have a NULL ptr and
allowing for bufs with a valid ptr and size but zero asize as a
way of referring to externally owned data.
This begins the process of exposing git_filter objects to the
public API. This includes:
* new public type and API for `git_buffer` through which an
allocated buffer can be passed to the user
* new API `git_blob_filtered_content`
* make the git_filter type and GIT_FILTER_TO_... constants public
This removes the GIT_INLINE versions of the simple git_object
accessors and standardizes them with a helper macro in src/object.h
to build the function bodies.
This moves the implementation of these two APIs into common code
that will be shared between the two. Also, this adds tests for
the `git_diff_blob_to_buffer` API. Lastly, this adds some extra
`const` to a few places that can use it.
1. The license header is technically not valid if it doesn't have a
copyright signature.
2. The COPYING file has been updated with the different licenses used in
the project.
3. The full GPLv2 header in each file annoys me.
In the same spirit that git_repository_lookup is no longer available,
add wrappers so the users don't have to cast when closing their
objects.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Hey. Apologies in advance -- I broke your bindings.
This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.
Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!
Major features include:
- Real caching and refcounting on parsed objects
- Real caching and refcounting on objects read from the ODB
- Streaming writes & reads from the ODB
- Single-method writes for all object types
- The external API is now partially thread-safe
The speed increases are significant in all aspects, specially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.
Here's a full changelog for the external API:
blob.h
------
- Remove `git_blob_new`
- Remove `git_blob_set_rawcontent`
- Remove `git_blob_set_rawcontent_fromfile`
- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
- Change `git_blob_create_fromfile`:
The `path` argument is now relative to the repository's working dir
- Add `git_blob_create_frombuffer`
commit.h
--------
- Remove `git_commit_new`
- Remove `git_commit_add_parent`
- Remove `git_commit_set_message`
- Remove `git_commit_set_committer`
- Remove `git_commit_set_author`
- Remove `git_commit_set_tree`
- Add `git_commit_create`
- Add `git_commit_create_v`
- Add `git_commit_create_o`
- Add `git_commit_create_ov`
tag.h
-----
- Remove `git_tag_new`
- Remove `git_tag_set_target`
- Remove `git_tag_set_name`
- Remove `git_tag_set_tagger`
- Remove `git_tag_set_message`
- Add `git_tag_create`
- Add `git_tag_create_o`
tree.h
------
- Change `git_tree_entry_2object`:
New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`
- Remove `git_tree_new`
- Remove `git_tree_add_entry`
- Remove `git_tree_remove_entry_byindex`
- Remove `git_tree_remove_entry_byname`
- Remove `git_tree_clearentries`
- Remove `git_tree_entry_set_id`
- Remove `git_tree_entry_set_name`
- Remove `git_tree_entry_set_attributes`
object.h
------------
- Remove `git_object_new
- Remove `git_object_write`
- Change `git_object_close`:
This method is now *mandatory*. Not closing an object causes a
memory leak.
odb.h
-----
- Remove type `git_rawobj`
- Remove `git_rawobj_close`
- Rename `git_rawobj_hash` -> `git_odb_hash`
- Change `git_odb_hash`:
New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`
- Add type `git_odb_object`
- Add `git_odb_object_close`
- Change `git_odb_read`:
New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
- Change `git_odb_read_header`:
New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
- Remove `git_odb_write`
- Add `git_odb_open_wstream`
- Add `git_odb_open_rstream`
odb_backend.h
-------------
- Change type `git_odb_backend`:
New internal signatures are as follows
int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *)
- Add type `git_odb_stream`
- Add enum `git_odb_streammode`
Signed-off-by: Vicent Marti <tanoku@gmail.com>