mirror_ubuntu-kernels/fs/xfs/libxfs
Christoph Hellwig 6bdcf26ade xfs: use a b+tree for the in-core extent list
Replace the current linear list and the indirection array for the in-core
extent list with a b+tree to avoid the need for larger memory allocations
for the indirection array when lots of extents are present.  The current
extent list implementation leads to heavy pressure on the memory
allocator when modifying files with a high extent count, and can lead
to high latencies because of that.

The replacement is a b+tree with a few quirks.  The leaf nodes directly
store the extent record in two u64 values.  The encoding is a little bit
different from the existing in-core extent records so that the start
offset and length which are required for lookups can be retrieved with
simple mask operations.  The inner nodes store a 64-bit key containing
the start offset in the first half of the node, and the pointers to the
next lower level in the second half.  In either case we walk the node
from the beginning to the end and do a linear search, as that is more
efficient for the low number of cache lines touched during a search
(2 for the inner nodes, 4 for the leaf nodes) than a binary search.
We store termination markers (zero length for the leaf nodes, an
otherwise impossible high bit for the inner nodes) to terminate the key
list / records instead of storing a count to use the available cache
lines as efficiently as possible.
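
The inner-node lookup described above can be sketched as follows.  This
is only an illustration, not the code from xfs_iext_tree.c: the struct,
the function and the constants are hypothetical names, the fan-out of 16
keys is an assumption chosen so that the keys fill two 64-byte cache
lines, and bit 63 is assumed as the "otherwise impossible high bit".

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KEYS_PER_NODE	16		/* assumed fan-out */
#define SKETCH_KEY_INVALID	(1ULL << 63)	/* assumed end marker */

/* Hypothetical inner node: keys in the first half, pointers in the second. */
struct sketch_inner_node {
	uint64_t	keys[SKETCH_KEYS_PER_NODE];
	void		*ptrs[SKETCH_KEYS_PER_NODE];
};

/*
 * Linear search: return the index of the last key that is <= offset.
 * The walk stops at the first unused slot, which holds the marker
 * instead of a valid start offset, so no entry count is needed.
 */
static int
sketch_find_child(const struct sketch_inner_node *node, uint64_t offset)
{
	int	i;

	for (i = 1; i < SKETCH_KEYS_PER_NODE; i++) {
		if (node->keys[i] == SKETCH_KEY_INVALID ||
		    node->keys[i] > offset)
			break;
	}
	return i - 1;
}
```

Because the marker is larger than any valid start offset in this sketch,
the explicit marker check is redundant with the comparison; it is kept
only to make the termination rule visible.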

One quirk of the algorithm is that while we normally split a node half and
half like usual btree implementations, we just spill over entries added at
the very end of the list to a new node of their own.  This means we get a
100% fill grade for the common cases of bulk insertion when reading an
inode into memory, and when only sequentially appending to a file.  The
downside is a slightly higher chance of splits on the first random
insertions.
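
A minimal sketch of that split decision, assuming a node that is already
full and an insert position counted in entries.  The struct and function
names are hypothetical and only illustrate the policy; they are not taken
from the patch.

```c
#include <stdbool.h>

/* Hypothetical description of how a full node is split. */
struct sketch_split {
	int	nr_keep;	/* entries that stay in the old node */
	int	nr_move;	/* entries moved to the new node */
	bool	new_in_new;	/* inserted entry lands in the new node */
};

static struct sketch_split
sketch_plan_split(int max_entries, int insert_pos)
{
	struct sketch_split	s;

	if (insert_pos == max_entries) {
		/* Append at the very end: the old node stays 100% full. */
		s.nr_keep = max_entries;
		s.nr_move = 0;
		s.new_in_new = true;
	} else {
		/* Any other position: ordinary half and half split. */
		s.nr_keep = max_entries / 2;
		s.nr_move = max_entries - s.nr_keep;
		s.new_in_new = insert_pos >= s.nr_keep;
	}
	return s;
}
```

With a node of 15 entries, appending at position 15 moves nothing and
starts a new node holding only the new entry, while inserting at
position 3 keeps 7 entries and moves 8.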

Both insert and removal manually recurse into the lower levels, but
the bulk deletion of the whole tree is still implemented as a recursive
function call, although one limited by the overall depth and with very
little stack usage in every iteration.

For the first few extents we dynamically grow the list from a single
extent to the next powers of two until we have a first full leaf block,
and then build the actual tree.
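
The growth of that initial list could look roughly like the sketch
below.  The 16-byte record size follows from the two u64 values per
record; the 256-byte leaf size is an assumption derived from four
64-byte cache lines, and the function name is hypothetical.

```c
#include <stddef.h>

#define SKETCH_REC_SIZE		16	/* two u64 values per record */
#define SKETCH_NODE_SIZE	256	/* assumed size of a full leaf */

/*
 * Return the next allocation size for the initial extent list:
 * one record first, then doubling, capped at one full leaf block.
 */
static size_t
sketch_next_alloc_size(size_t cur_bytes)
{
	size_t	next = cur_bytes ? cur_bytes * 2 : SKETCH_REC_SIZE;

	return next > SKETCH_NODE_SIZE ? SKETCH_NODE_SIZE : next;
}
```

Under these assumptions the allocation grows 16, 32, 64, 128 and then
256 bytes, after which further extents go into the tree proper.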

The code started out based on the generic lib/btree.c code from Joern
Engel based on earlier work from Peter Zijlstra, but has since been
rewritten beyond recognition.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-06 11:53:41 -08:00
..
xfs_ag_resv.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_ag_resv.h xfs: set up per-AG free space reservations 2016-09-19 10:30:52 +10:00
xfs_alloc_btree.c xfs: always compile the btree inorder check functions 2017-06-19 14:11:33 -07:00
xfs_alloc_btree.h
xfs_alloc.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_alloc.h xfs: create block pointer check functions 2017-10-26 15:38:23 -07:00
xfs_attr_leaf.c xfs: remove the never fully implemented UUID fork format 2017-10-26 15:38:27 -07:00
xfs_attr_leaf.h Merge branch 'xfs-4.10-misc-fixes-3' into for-next 2016-12-07 17:42:30 +11:00
xfs_attr_remote.c xfs: remove the ip argument to xfs_defer_finish 2017-09-01 10:55:30 -07:00
xfs_attr_remote.h
xfs_attr_sf.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_attr.c xfs: remove the ip argument to xfs_defer_finish 2017-09-01 10:55:30 -07:00
xfs_bit.c libxfs: Optimize the loop for xfs_bitmap_empty 2016-01-04 16:10:19 +11:00
xfs_bit.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_bmap_btree.c xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_bmap_btree.h xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_bmap.c xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_bmap.h xfs: simplify xfs_reflink_convert_cow 2017-11-06 11:53:40 -08:00
xfs_btree.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_btree.h xfs: compare btree block keys to parent block's keys during scrub 2017-10-27 09:20:31 -07:00
xfs_cksum.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_da_btree.c xfs: abort dir/attr btree operation if btree is obviously weird 2017-10-27 09:20:31 -07:00
xfs_da_btree.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_da_format.c xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_da_format.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_defer.c xfs: remove the ip argument to xfs_defer_finish 2017-09-01 10:55:30 -07:00
xfs_defer.h xfs: remove the ip argument to xfs_defer_finish 2017-09-01 10:55:30 -07:00
xfs_dir2_block.c xfs: don't crash on unexpected holes in dir/attr btrees 2017-07-07 18:55:17 -07:00
xfs_dir2_data.c xfs: check that dir block entries don't off the end of the buffer 2017-07-25 08:36:35 -07:00
xfs_dir2_leaf.c xfs: don't crash on unexpected holes in dir/attr btrees 2017-07-07 18:55:17 -07:00
xfs_dir2_node.c xfs: return the hash value of a leaf1 directory block 2017-06-20 10:45:21 -07:00
xfs_dir2_priv.h xfs: pass along transaction context when reading directory block buffers 2017-06-20 10:45:22 -07:00
xfs_dir2_sf.c xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_dir2.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_dir2.h xfs: scrub directory metadata 2017-10-26 15:38:25 -07:00
xfs_dquot_buf.c xfs: simplify xfs_calc_dquots_per_chunk 2017-04-12 08:42:51 -07:00
xfs_errortag.h xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_format.h xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_fs.h xfs: scrub quota information 2017-10-26 15:38:26 -07:00
xfs_ialloc_btree.c xfs: plumb in needed functions for range querying of various btrees 2017-06-19 14:11:34 -07:00
xfs_ialloc_btree.h xfs: use per-AG reservations for the finobt 2017-01-25 07:49:35 -08:00
xfs_ialloc.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_ialloc.h xfs: create inode pointer verifiers 2017-10-26 15:38:23 -07:00
xfs_iext_tree.c xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_inode_buf.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_inode_buf.h xfs: export various function for the online scrubber 2017-06-19 14:11:34 -07:00
xfs_inode_fork.c xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_inode_fork.h xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00
xfs_log_format.h xfs: remove inode log format typedef 2017-11-01 15:03:16 -07:00
xfs_log_recover.h xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_log_rlimit.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_quota_defs.h Revert "xfs: grab dquots without taking the ilock" 2017-07-13 14:55:05 -07:00
xfs_refcount_btree.c xfs: always compile the btree inorder check functions 2017-06-19 14:11:33 -07:00
xfs_refcount_btree.h xfs: use the actual AG length when reserving blocks 2017-01-03 18:39:33 -08:00
xfs_refcount.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_refcount.h xfs: try to avoid blowing out the transaction reservation when bunmaping a shared extent 2017-06-19 08:59:10 -07:00
xfs_rmap_btree.c xfs: always compile the btree inorder check functions 2017-06-19 14:11:33 -07:00
xfs_rmap_btree.h xfs: use the actual AG length when reserving blocks 2017-01-03 18:39:33 -08:00
xfs_rmap.c xfs: move error injection tags into their own file 2017-11-01 15:03:16 -07:00
xfs_rmap.h xfs: export various function for the online scrubber 2017-06-19 14:11:34 -07:00
xfs_rtbitmap.c xfs: remove redundant assignment to variable bit 2017-10-31 12:03:35 -07:00
xfs_sb.c xfs: remove double-underscore integer types 2017-06-19 14:11:33 -07:00
xfs_sb.h xfs: remove unused function definitions 2016-02-08 14:58:07 +11:00
xfs_shared.h xfs: define the on-disk refcount btree format 2016-10-03 09:11:18 -07:00
xfs_symlink_remote.c xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLEN 2017-07-07 08:37:26 -07:00
xfs_trans_resv.c xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLEN 2017-07-07 08:37:26 -07:00
xfs_trans_resv.h xfs: increase log reservations for reflink 2016-10-05 16:26:29 -07:00
xfs_trans_space.h xfs: reserve enough blocks to handle btree splits when remapping 2017-05-03 13:21:40 -07:00
xfs_types.h xfs: use a b+tree for the in-core extent list 2017-11-06 11:53:41 -08:00