Commit Graph

25 Commits

Author SHA1 Message Date
Russell Belfer
786062639f Add callback to git_objects_table
This adds create and free callback to the git_objects_table so
that more of the creation and destruction of objects can be table
driven instead of using switch statements.  This also makes the
semantics of certain object creation functions consistent so that
we can make better use of function pointers.  This also fixes a
theoretical error case where an object allocation fails and we
end up storing NULL into the cache.
2013-04-22 16:51:40 +02:00
Vicent Marti
8842c75f17 What has science done. 2013-04-22 16:50:50 +02:00
Vicent Marti
5df184241a lol this worked first try wtf 2013-04-22 16:50:50 +02:00
Edward Thomson
359fc2d241 update copyrights 2013-01-08 17:31:27 -06:00
Carlos Martín Nieto
f56f8585c0 indexer: use the packfile streaming API
The new API allows us to read the object bit by bit from the packfile,
instead of needing it all at once in the packfile. This also allows us
to hash the object as it comes in from the network instead of having
to try to read it all and failing repeatedly for larger objects.

This is only the first step, but it already shows huge improvements
when dealing with objects over a few megabytes in size. It reduces the
memory needs in some cases, but delta objects still need to be
completely in memory and the old inefficent method is still used for
that.
2012-11-30 15:55:23 +01:00
Russell Belfer
c6ac28fdc5 Reorg internal odb read header and object lookup
Often `git_odb_read_header` will "fail" and have to read the
entire object into memory instead of just the header.  When this
happens, the object is loaded and then disposed of immediately,
which makes it difficult to efficiently use the header information
to decide if the object should be loaded (since attempting to do
so will often result in loading the object twice).

This commit takes the existing code and reorganizes it to have
two new functions:

- `git_odb__read_header_or_object` which acts just like the old
  read header function except that it returns the object, too, if
  it was forced to load the whole thing.  It then becomes the
  callers responsibility to free the `git_odb_object`.
- `git_object__from_odb_object` which was extracted from the old
  `git_object_lookup` and creates a subclass of `git_object` from
  an existing `git_odb_object` (separating the ODB lookup from the
  `git_object` creation).  This allows you to use the first header
  reading function efficiently without instantiating the
  `git_odb_object` twice.

There is no net change to the behavior of any of the existing
functions, but this allows internal code to tap into the ODB
lookup and object creation to be more efficient.
2012-09-10 12:24:05 -07:00
Russell Belfer
60b9d3fcef Implement filters for status/diff blobs
This adds support to diff and status for running filters (a la crlf)
on blobs in the workdir before computing SHAs and before generating
text diffs.  This ended up being a bit more code change than I had
thought since I had to reorganize some of the diff logic to minimize
peak memory use when filtering blobs in a diff.

This also adds a cap on the maximum size of data that will be loaded
to diff.  I set it at 512Mb which should match core git.  Right now
it is a #define in src/diff.h but it could be moved into the public
API if desired.
2012-09-06 15:34:02 -07:00
Russell Belfer
282283acc6 Fix valgrind issues
There are three changes here:
- correctly propogate error code from failed object lookups
- make zlib inflate use our allocators
- add OID to notfound error in ODB lookups
2012-05-04 16:46:46 -07:00
Russell Belfer
e1de726c15 Migrate ODB files to new error handling
This migrates odb.c, odb_loose.c, odb_pack.c and pack.c to
the new style of error handling.  Also got the unix and win32
versions of map.c.  There are some minor changes to other
files but no others were completely converted.

This also contains an update to filebuf so that a zeroed out
filebuf will not think that the fd (== 0) is actually open
(and inadvertently call close() on fd 0 if cleaned up).

Lastly, this was built and tested on win32 and contains a
bunch of fixes for the win32 build which was pretty broken.
2012-03-12 22:55:40 -07:00
schu
5e0de32818 Update Copyright header
Signed-off-by: schu <schu-github@schulog.org>
2012-02-13 17:11:09 +01:00
Vicent Martí
f19e3ca288 odb: Proper symlink hashing 2012-02-10 20:16:42 +01:00
Vicent Martí
18e5b8547d odb: Add internal git_odb__hashfd 2012-02-10 19:47:02 +01:00
Vicent Marti
9462c47143 repository: Change ownership semantics
The ownership semantics have been changed all over the library to be
consistent. There are no more "borrowed" or duplicated references.

Main changes:

	- `git_repository_open2` and `3` have been dropped.

	- Added setters and getters to hotswap all the repository owned
	objects:

		`git_repository_index`
		`git_repository_set_index`
		`git_repository_odb`
		`git_repository_set_odb`
		`git_repository_config`
		`git_repository_set_config`
		`git_repository_workdir`
		`git_repository_set_workdir`

	Now working directories/index files/ODBs and so on can be
	hot-swapped after creating a repository and between operations.

	- All these objects now have proper ownership semantics with
	refcounting: they all require freeing after they are no longer
	needed (the repository always keeps its internal reference).

	- Repository open and initialization has been updated to keep in
	mind the configuration files. Bare repositories are now always
	detected, and a default config file is created on init.

	- All the tests affected by these changes have been dropped from the
	old test suite and ported to the new one.
2011-11-26 08:37:08 +01:00
Brodie Rao
01ad7b3a9e *: correct and codify various file permissions
The following files now have 0444 permissions:

- loose objects
- pack indexes
- pack files
- packs downloaded by fetch
- packs downloaded by the HTTP transport

And the following files now have 0666 permissions:

- config files
- repository indexes
- reflogs
- refs

This brings libgit2 more in line with Git.

Note that git_filebuf_commit() and git_filebuf_commit_at() have both
gained a new mode parameter.

The latter change fixes an important issue where filebufs created with
GIT_FILEBUF_TEMPORARY received 0600 permissions (due to mkstemp(3)
usage). Now we chmod() the file before renaming it into place.

Tests have been added to confirm that new commit, tag, and tree
objects are created with the right permissions. I don't have access to
Windows, so for now I've guarded the tests with "#ifndef GIT_WIN32".
2011-10-14 16:07:47 -07:00
Brodie Rao
ce8cd006ce fileops/repository: create (most) directories with 0777 permissions
To further match how Git behaves, this change makes most of the
directories libgit2 creates in a git repo have a file mode of
0777. Specifically:

- Intermediate directories created with git_futils_mkpath2file() have
  0777 permissions. This affects odb_loose, reflog, and refs.

- The top level folder for bare repos is created with 0777
  permissions.

- The top level folder for non-bare repos is created with 0755
  permissions.

- /objects/info/, /objects/pack/, /refs/heads/, and /refs/tags/ are
  created with 0777 permissions.

Additionally, the following changes have been made:

- fileops functions that create intermediate directories have grown a
  new dirmode parameter. The only exception to this is filebuf's
  lock_file(), which unconditionally creates intermediate directories
  with 0777 permissions when GIT_FILEBUF_FORCE is set.

- The test runner now sets the umask to 0 before running any
  tests. This ensurses all file mode checks are consistent across
  systems.

- t09-tree.c now does a directory permissions check. I've avoided
  adding this check to other tests that might reuse existing
  directories from the prefabricated test repos. Because they're
  checked into the repo, they have 0755 permissions.

- Other assorted directories created by tests have 0777 permissions.
2011-10-14 16:04:34 -07:00
Vicent Marti
87d9869fc3 Tabify everything
There were quite a few places were spaces were being used instead of
tabs. Try to catch them all. This should hopefully not break anything.
Except for `git blame`. Oh well.
2011-09-19 03:34:49 +03:00
Vicent Marti
bb742ede3d Cleanup legal data
1. The license header is technically not valid if it doesn't have a
copyright signature.

2. The COPYING file has been updated with the different licenses used in
the project.

3. The full GPLv2 header in each file annoys me.
2011-09-19 01:54:32 +03:00
Vicent Marti
c85e08b1bd odb: Do not pass around a header when hashing 2011-08-18 02:34:10 +02:00
Vicent Marti
72a3fe42fb I broke your bindings
Hey. Apologies in advance -- I broke your bindings.

This is a major commit that includes a long-overdue redesign of the
whole object-database structure. This is expected to be the last major
external API redesign of the library until the first non-alpha release.

Please get your bindings up to date with these changes. They will be
included in the next minor release. Sorry again!

Major features include:

	- Real caching and refcounting on parsed objects
	- Real caching and refcounting on objects read from the ODB
	- Streaming writes & reads from the ODB
	- Single-method writes for all object types
	- The external API is now partially thread-safe

The speed increases are significant in all aspects, specially when
reading an object several times from the ODB (revwalking) and when
writing big objects to the ODB.

Here's a full changelog for the external API:

blob.h
------

	- Remove `git_blob_new`
	- Remove `git_blob_set_rawcontent`
	- Remove `git_blob_set_rawcontent_fromfile`
	- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
	- Change `git_blob_create_fromfile`:
		The `path` argument is now relative to the repository's working dir
	- Add `git_blob_create_frombuffer`

commit.h
--------

	- Remove `git_commit_new`
	- Remove `git_commit_add_parent`
	- Remove `git_commit_set_message`
	- Remove `git_commit_set_committer`
	- Remove `git_commit_set_author`
	- Remove `git_commit_set_tree`

	- Add `git_commit_create`
	- Add `git_commit_create_v`
	- Add `git_commit_create_o`
	- Add `git_commit_create_ov`

tag.h
-----

	- Remove `git_tag_new`
	- Remove `git_tag_set_target`
	- Remove `git_tag_set_name`
	- Remove `git_tag_set_tagger`
	- Remove `git_tag_set_message`

	- Add `git_tag_create`
	- Add `git_tag_create_o`

tree.h
------

	- Change `git_tree_entry_2object`:
		New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`

	- Remove `git_tree_new`
	- Remove `git_tree_add_entry`
	- Remove `git_tree_remove_entry_byindex`
	- Remove `git_tree_remove_entry_byname`
	- Remove `git_tree_clearentries`
	- Remove `git_tree_entry_set_id`
	- Remove `git_tree_entry_set_name`
	- Remove `git_tree_entry_set_attributes`

object.h
------------

	- Remove `git_object_new
	- Remove `git_object_write`

	- Change `git_object_close`:
		This method is now *mandatory*. Not closing an object causes a
		memory leak.

odb.h
-----

	- Remove type `git_rawobj`
	- Remove `git_rawobj_close`
	- Rename `git_rawobj_hash` -> `git_odb_hash`
	- Change `git_odb_hash`:
		New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`

	- Add type `git_odb_object`
	- Add `git_odb_object_close`

	- Change `git_odb_read`:
		New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
	- Change `git_odb_read_header`:
		New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
	- Remove `git_odb_write`
	- Add `git_odb_open_wstream`
	- Add `git_odb_open_rstream`

odb_backend.h
-------------

	- Change type `git_odb_backend`:
		New internal signatures are as follows

			int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
			int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
			int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *)

	- Add type `git_odb_stream`
	- Add enum `git_odb_streammode`

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-03-20 21:45:11 +02:00
Vicent Marti
becff0428e Fix compilation in MSVC
The git_odb_backend_* symbols were being redefined as external.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2011-02-07 08:09:11 +02:00
Vicent Marti
44908fe763 Change the library include file
Libgit2 is now officially include as

	#include "<git2.h>"

or indidividual files may be included as

	#include <git2/index.h>

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-06 23:03:16 +02:00
Vicent Marti
d12299fe22 Change include structure for the project
The maze with include dependencies has been fixed.
There is now a global include:

	#include <git.h>

The git_odb_backend API has been exposed.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-06 01:14:15 +02:00
Vicent Marti
7d7cd8857a Decouple storage from ODB logic
Comes with two default backends: loose object and packfiles.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
2010-12-06 01:14:15 +02:00
Ramsay Jones
89217d8f1a Add functions to open a '*.pack' file and perform some basic validation
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
2010-04-30 09:48:07 +02:00
Shawn O. Pearce
a7c60cfc6d Add basic support to read pack-*.idx v1 and v2 files
The index data is mapped into memory and then scanned using a
binary search algorithm to locate the matching entry for the
supplied git_oid.  The standard fanout hash trick is applied to
reduce the search space by 8 iterations.

Since the v1 and v2 file formats differ in their search function,
due to the different layouts used for the object records, we use
two different search implementations and a virtual function pointer
to jump to the correct version of code for the current pack index.
The single function jump per-pack should be faster then computing
a branch point inside the inner loop of a common binary search.

To improve concurrency during read operations the pack lock is only
held while verifying the index is actually open, or while opening
the index for the first time.  This permits multiple concurrent
readers to scan through the same index.

If an invalid index file is opened we close it and mark the
git_pack's invalid bit to true.  The git_pack structure is kept
around in its parent git_packlist, but the invalid bit will cause
all future readers to skip over the pack entirely.  Pruning the
invalid entries is relatively unimportant because they shouldn't
be very common, a $GIT_DIRECTORY/objects/pack directory tends to
only have valid pack files.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2009-01-03 02:56:16 -08:00