linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-02 16:44:59 +00:00

Author	SHA1	Message	Date
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible component, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Brett Holman	8dd6ed9451	bcachefs: add progress stats to sysfs This adds progress stats to sysfs for copygc, rebalance, recovery, and the cmd_job ioctls. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:10 -04:00
Kent Overstreet	9f1833cadd	bcachefs: Update btree ptrs after every write This closes a significant hole (and last known hole) in our ability to verify metadata. Previously, since btree nodes are log structured, we couldn't detect lost btree writes that weren't the first write to a given node. Additionally, this seems to have led to some significant metadata corruption on multi device filesystems with metadata replication: since a write may have made it to one device and not another, if we read that btree node back from the replica that did have that write and started appending after that point, the other replica would have a gap in the bset entries and reading from that replica wouldn't find the rest of the bsets. But, since updates to interior btree nodes are now journalled, we can close this hole by updating pointers to btree nodes after every write with the currently written number of sectors, without negatively affecting performance. This means we will always detect lost or corrupt metadata - it also means that our btree is now a curious hybrid of COW and non COW btrees, with all the benefits of both (excluding complexity). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	d976a84e3b	bcachefs: Don't loop into topology repair Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:07 -04:00
Kent Overstreet	890b74f03d	bcachefs: Fsck for reflink refcounts Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	9f2772c454	bcachefs: Split out btree_error_wq We can't use btree_update_wq because btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	ddc7dd62f0	bcachefs: Don't use uuid in tracepoints %pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	731bdd2eff	bcachefs: Add a workqueue for btree io completions Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	1ce0cf5fe9	bcachefs: Add a debug mode that always reads from every btree replica There's a new module parameter, verify_all_btree_replicas, that enables reading from every btree replica when reading in btree nodes and comparing them against each other. We've been seeing some strange btree corruption - this will hopefully aid in tracking it down and catching it more often. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	ef1b20924b	bcachefs: Ratelimiting for writeback IOs Writeback throttling is a kernel config option and not always enabled. When it's not enabled we need a fallback, to avoid unbounded memory pinning and work item backlogs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	595c1e9bab	bcachefs: Fix time handling There were some overflows in the time conversion functions - fix this by converting tv_sec and tv_nsec separately. Also, set sb->time_min and sb->time_max. Fixes xfstest generic/258. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	aae15aafcd	bcachefs: New and improved topology repair code This splits out btree topology repair into a separate pass, and makes some improvements: - When we have to pick which of two overlapping nodes to drop keys from, we use the btree node header sequence number to preserve the newer node - the gc code has been changed so that it doesn't bail out if we're continuing/ignoring on fsck error - this way the dump tool can skip running the repair pass but still walk all reachable metadata - add a new superblock flag indicating when a filesystem is known to have btree topology issues, and the topology repair pass should be run - changing the start/end of a node might mean keys in that node have to be deleted: this patch handles that better by splitting it out into a separate function and running it explicitly in the topology repair code, previously those keys were only being dropped when the btree node was read in. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	4932e07ea0	bcachefs: Fix key cache assertion Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	6adaac0b95	bcachefs: Update bch2_btree_verify() bch2_btree_verify() verifies that the btree node on disk matches what we have in memory. This patch changes it to verify every replica, and also fixes it for interior btree nodes - there's a mem_ptr field which is used as a scratch space and needs to be zeroed out for comparing with what's on disk. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:01 -04:00
Kent Overstreet	dac1525d9c	bcachefs: gc shouldn't care about owned_by_allocator The owned_by_allocator field is a purely in memory thing, even if/when we bring back GC at runtime there's no need for it to be recalculating this field. This is prep work for pulling it out of struct bucket, and eventually getting rid of the bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:01 -04:00
Kent Overstreet	ac516d0e7d	bcachefs: Add the status of bucket gen gc to sysfs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:00 -04:00
Kent Overstreet	ba5f03d362	bcachefs: Add a sysfs var for average btree write size Useful number for performance tuning. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:58 -04:00
Kent Overstreet	84cc758d6b	bcachefs: Validate bset version field against sb version fields The superblock version fields need to be accurate to know whether a filesystem is supported, thus we should be verifying them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	41f8b09edc	bcachefs: Rename BTREE_ID enums for consistency with other enums Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	9620c3ec2f	bcachefs: Add a mempool for the replicas delta list Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	e131b6aa0a	bcachefs: Add a mempool for btree_trans bump allocator This allocation is required for filesystem operations to make forward progress, thus needs a mempool. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	bae895a5a3	bcachefs: Add allocator thread state to sysfs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	51c66fedc0	bcachefs: Rip out copygc pd controller We have a separate mechanism for ratelimiting copygc now - the pd controller has only been causing problems. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	5bbe4bf95b	bcachefs: Add copygc wait to sysfs Currently debugging an issue with copygc not running when it's supposed to, and this is an obvious first step. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	cb66fc5fe4	bcachefs: Fix copygc threshold A while back the meaning of is_available_bucket() and thus also bch_dev_usage->buckets_unavailable changed to include buckets that are owned by the allocator - this was so that the stat could be persisted like other allocation information, and wouldn't have to be regenerated by walking each bucket at mount time. This broke copygc, which needs to consider buckets that are reclaimable and haven't yet been grabbed by the allocator thread and moved onto a freelist. This patch fixes that by adding dev_buckets_reclaimable() for copygc and the allocator thread, and cleans up some of the callers a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	4b8f89afd4	bcachefs: Fixes/improvements for journal entry reservations This fixes some arithmetic bugs in "bcachefs: Journal updates to dev usage" - additionally, it cleans things up by switching everything that goes in every journal entry to the journal_entry_res mechanism. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	180fb49dea	bcachefs: Journal updates to dev usage This eliminates the need to scan every bucket to regenerate dev_usage at mount time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	2abe542087	bcachefs: Persist 64 bit io clocks Originally, bcachefs - going back to bcache - stored, for each bucket, a 16 bit counter corresponding to how long it had been since the bucket was read from. But, this required periodically rescaling counters on every bucket to avoid wraparound. That wasn't an issue in bcache, where we'd periodically rewrite the per bucket metadata all at once, but in bcachefs we're trying to avoid having to walk every single bucket. This patch switches to persisting 64 bit io clocks, corresponding to the 64 bit bucket timestamps introduced in the previous patch with KEY_TYPE_alloc_v2. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	a28bd48a7f	bcachefs: Add an assertion to check for journal writes to same location Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	a0b73c1c53	bcachefs: Add (partial) support for fixing btree topology When we walk the btrees during recovery, part of that is checking that btree topology is correct: for every interior btree node, its child nodes should exactly span the range the parent node covers. Previously, we had checks for this, but not repair code. Now that we have the ability to do btree updates during initial GC, this patch adds that repair code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	5b593ee172	bcachefs: Add support for doing btree updates prior to journal replay Some errors may need to be fixed in order for GC to successfully run - walk and mark all metadata. But we can't start the allocators and do normal btree updates until after GC has completed, and allocation information is known to be consistent, so we need a different method of doing btree updates. Fortunately, we already have code for walking the btree while overlaying keys from the journal to be replayed. This patch adds an update path that adds keys to the list of keys to be replayed by journal replay, and also fixes up iterators. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:52 -04:00
Kent Overstreet	4291a3317f	bcachefs: bch2_alloc_write() should be writing for all devices Alloc info isn't stored on a particular device, it makes no sense to only be writing it out for rw members - this was causing fsck to not fix alloc info errors, oops. Also, make sure we write out alloc info in other repair paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:50 -04:00
Kent Overstreet	0fefe8d8ef	bcachefs: Improve some IO error messages it's useful to know whether an error was for a read or a write - this also standardizes error messages a bit more. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	f299d57350	bcachefs: Refactor filesystem usage accounting Various filesystem usage counters are kept in percpu counters, with one set per in flight journal buffer. Right now all the code that deals with it assumes that there's only two buffers/sets of counters, but the number of journal bufs is getting increased to 4 in the next patch - so refactor that code to not assume a constant. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	b7a9bbfc1b	bcachefs: Move journal reclaim to a kthread This is to make tracing easier. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:48 -04:00
Kent Overstreet	876c7af3a6	bcachefs: Take a SRCU lock in btree transactions Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:47 -04:00
Kent Overstreet	1a21bf9866	bcachefs: Add a single slot percpu buf for btree iters Allocating our array of btree iters is a big enough allocation that it hits the buddy allocator, and we're seeing lots of lock contention. Sticking a single element buffer in front of it should help. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:46 -04:00
Kent Overstreet	b5e8a6992f	bcachefs: Improved inode create optimization This shards new inodes into different btree nodes by using the processor ID for the high bits of the new inode number. Much faster than the previous inode create optimization - this also helps with sharding in the other btrees that index by inode number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:46 -04:00
Kent Overstreet	692d4031a4	bcachefs: Split out debug_check_btree_accounting This check is very expensive Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:45 -04:00
Kent Overstreet	29364f3453	bcachefs: Drop sysfs interface to debug parameters It's not used much anymore, the module parameter interface is better. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:45 -04:00
Kent Overstreet	45e4dcba79	bcachefs: Inode create optimization On workloads that do a lot of multithreaded creates all at once, lock contention on the inodes btree turns out to still be an issue. This patch adds a small buffer of inode numbers that are known to be free, so that we can avoid touching the btree on every create. Also, this changes inode creates to update via the btree key cache for the initial create. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:45 -04:00
Kent Overstreet	d5e4dcc29c	bcachefs: Fix unmount path There was a long standing race in the mount/unmount code - the VFS intends for mount/unmount synchronization to be handled by the list of superblocks, but we were still holding devices open after tearing down our superblock in the unmount path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:44 -04:00
Kent Overstreet	e6d1161530	bcachefs: Make copygc thread global Per device copygc threads don't move data to different devices and they make fragmentation worse - they don't make much sense anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:42 -04:00
Kent Overstreet	703e2a43bf	bcachefs: Move stripe creation to workqueue This is mainly to solve a lock ordering issue, and also simplifies the code a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:42 -04:00
Kent Overstreet	ba6dd1dd49	bcachefs: Improve stripe triggers/heap code Soon we'll be able to modify existing stripes - replacing empty blocks with new blocks and new p/q blocks. This patch updates the trigger code to handle pointers changing in an existing stripe; also, it significantly improves how the stripes heap works, which means we can get rid of the stripe creation/deletion lock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:42 -04:00
Kent Overstreet	7dd1ebfa1e	bcachefs: Increase size of btree node reserve Also tweak the allocator to be more aggressive about keeping it full. The recent changes to make updates to interior nodes transactional (and thus generate updates to the alloc btree) all put more stress on the btree node reserves. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:41 -04:00
Kent Overstreet	2ca88e5ad9	bcachefs: Btree key cache This introduces a new kind of btree iterator, cached iterators, which point to keys cached in a hash table. The cache also acts as a write cache - in the update path, we journal the update but defer updating the btree until the cached entry is flushed by journal reclaim. Cache coherency is for now up to the users to handle, which isn't ideal but should be good enough for now. These new iterators will be used for updating inodes and alloc info (the alloc and stripes btrees). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:41 -04:00
Kent Overstreet	1ada160618	bcachefs: Turn c->state_lock into an rwsem Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:41 -04:00
Kent Overstreet	374153c2a9	bcachefs: More open buckets We need a larger open bucket reserve now that the btree interior update path holds onto open bucket references; filesystems with many high throughput devices may need more open buckets now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:41 -04:00
Kent Overstreet	a27443bc76	bcachefs: Kill old allocator startup code It's not needed anymore since we can now write to buckets before updating the alloc btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:40 -04:00
Kent Overstreet	495aabede3	bcachefs: Add debug code to print btree transactions Intended to help debug deadlocks, since we can't use lockdep to check btree node lock ordering. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:40 -04:00
Kent Overstreet	039fc4c522	bcachefs: Fixes for going RO Now that interior btree updates are fully transactional, we don't need to write out alloc info in a loop. However, interior btree updates do put more things in the journal, so we still need a loop in the RO sequence. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:40 -04:00
Kent Overstreet	00b8ccf707	bcachefs: Interior btree updates are now fully transactional We now update the alloc info (bucket sector counts) atomically with journalling the update to the interior btree nodes, and we also set new btree roots atomically with the journalled part of the btree update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:40 -04:00
Kent Overstreet	2340fd9d27	bcachefs: Be more rigorous about marking the filesystem clean Previously, there was at least one error path where we could mark the filesystem clean when we hadn't successfully written out alloc info. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:39 -04:00
Kent Overstreet	f1d786a0db	bcachefs: Add an option for keeping journal entries after startup This will be used by the userspace debug tools. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:37 -04:00
Kent Overstreet	ac7c51b218	bcachefs: Serialize btree_update operations at btree_update_nodes_written() Prep work for journalling updates to interior nodes - enforcing ordering will greatly simplify those changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:35 -04:00
Kent Overstreet	1c3ff72c0f	bcachefs: Convert some enums to x-macros Helps for preventing things from getting out of sync. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:33 -04:00
Kent Overstreet	bd7e82ee2a	bcachefs: kill ca->freelist_lock All uses were supposed to be switched over to c->freelist_lock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:32 -04:00
Kent Overstreet	35189e09ab	bcachefs: bkey_on_stack This implements code for storing small bkeys on the stack and allocating out of a mempool if they're too big. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:32 -04:00
Kent Overstreet	ff929515cc	bcachefs: Trust btree alloc info at runtime This lets us avoid a cache miss in the write path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:30 -04:00
Kent Overstreet	2a9101a989	bcachefs: Refactor bch2_trans_commit() path Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:30 -04:00
Kent Overstreet	ad7e137ebc	bcachefs: Switch reconstruct_alloc to a mount option Right now this is the only way of repairing bucket gens in the future Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:26 -04:00
Kent Overstreet	6671a7089f	bcachefs: Refactor bch2_alloc_write() Major simplification - gets rid of the need for marking buckets as dirty, instead we write buckets if the in memory mark is different from what's in the btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:26 -04:00
Kent Overstreet	4e1510c3e9	bcachefs: Add a hint for allocating new stripes This way we aren't doing a full linear scan every time we create a new stripe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:26 -04:00
Kent Overstreet	76426098e4	bcachefs: Reflink Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:25 -04:00
Kent Overstreet	5e82a9a1f4	bcachefs: Write out fs usage consistently Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:21 -04:00
Kent Overstreet	fca1223ccf	bcachefs: Avoid write lock on mark_lock mark_lock is a frequently taken lock, and there's also potential for deadlocks since currently bch2_clear_page_bits which is called from memory reclaim has to take it to drop disk reservations. The disk reservation get path takes it when it recalculates the number of sectors known to be available, but it's not really needed for consistency. We just want to make sure we only have one thread updating the sectors_available count, which we can do with a dedicated mutex. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:21 -04:00
Kent Overstreet	3811aa6d4d	bcachefs: bch2_bkey_ptrs_invalid() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:21 -04:00
Kent Overstreet	ea41602344	bcachefs: use same timesource as current_time() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	f80b4e64a4	bcachefs: Fix hang while shutting down If the allocator thread exited before bch2_dev_allocator_stop() was called (because of an error), bch2_dev_allocator_quiesce() could hang. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	1dd7f9d98d	bcachefs: Rewrite journal_seq_blacklist machinery Now, we store blacklisted journal sequence numbers in the superblock, not the journal: this helps to greatly simplify the code, and more importantly it's now implemented in a way that doesn't require all btree nodes to be visited before starting the journal - instead, we unconditionally blacklist the next 4 journal sequence numbers after an unclean shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	ac7f0d77c2	bcachefs: ratelimit copygc warning Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	0bc166ff56	bcachefs: Track whether filesystem has errors in superblock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:19 -04:00
Kent Overstreet	f13f5a8c83	bcachefs: move some checks to expensive_debug_checks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:19 -04:00
Kent Overstreet	03e183cb5d	bcachefs: Verify fs hasn't been modified before going rw Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	134915f3d3	bcachefs: Go rw lazily Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	6122ab639c	bcachefs: More debug params for testing of recovery paths Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	dc3b63dc33	bcachefs: Add time stats for btree updates Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	49a67206e4	bcachefs: Add more time stats for being blocked on allocator Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	4d8100daa9	bcachefs: Allocate fs_usage in do_btree_insert_at() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	76f4c7b0c3	bcachefs: Fix oldest_gen handling Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:15 -04:00
Kent Overstreet	1df42b5715	bcachefs: don't do initial gc if have alloc info feature Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:15 -04:00
Kent Overstreet	2c5af169f7	bcachefs: reserve space in journal for fs usage entries Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:15 -04:00
Kent Overstreet	b935a8a67a	bcachefs: Fix a bug when shutting down before allocator started Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:15 -04:00
Kent Overstreet	430735cd1a	bcachefs: Persist alloc info on clean shutdown - Does not persist alloc info for stripes yet - Also does not yet include filesystem block/sector counts from struct fs_usage - Not made use of just yet Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:14 -04:00
Kent Overstreet	7ef2a73a58	bcachefs: Fix check for if extent update is allocating Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:14 -04:00
Kent Overstreet	b030f691da	bcachefs: Fix some reserve calculations Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:14 -04:00
Kent Overstreet	0519b72dd2	bcachefs: Add a workqueue for journal reclaim journal reclaim writes btree nodes, which can end up waiting for in flight btree writes to complete, and btree write completions run out of workqueues - so we can't run out of the same workqueue or we risk deadlock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:14 -04:00
Kent Overstreet	0b847a19d9	bcachefs: Lots of option handling improvements Add helptext to option definitions - so we can unify the option handling with the format command Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:14 -04:00
Kent Overstreet	5663a41521	bcachefs: refactor bch_fs_usage Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:13 -04:00
Kent Overstreet	73e6ab9564	bcachefs: Switch replicas to mark_lock Prep work for upcoming disk accounting changes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:13 -04:00
Kent Overstreet	9166b41db1	bcachefs: s/usage_lock/mark_lock better describes what it's for, and we're going to call a new lock usage_lock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:13 -04:00
Kent Overstreet	8eb7f3ee46	bcachefs: move dirty into bucket_mark Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	90541a741d	bcachefs: Add new alloc fields prep work for persistent alloc info Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	f0cfb963ec	bcachefs: Track nr_inodes with the key marking machinery Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	26609b619f	bcachefs: Make bkey types globally unique this lets us get rid of a lot of extra switch statements - in a lot of places we dispatch on the btree node type, and then the key type, so this is a nice cleanup across a lot of code. Also improve the on disk format versioning stuff. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	dfe9bfb32e	bcachefs: Stripes now properly subject to gc gc now verifies the contents of the stripes radix tree, important for persistent alloc info Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	9ca53b55f7	bcachefs: gc now operates on second set of bucket marks This means we can now use gc to verify the allocation information - important for testing persistent alloc info Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	61274e9d45	bcachefs: Allocator startup improvements Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:12 -04:00
Kent Overstreet	cd575ddf57	bcachefs: Erasure coding Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:11 -04:00
Kent Overstreet	8b335baef2	bcachefs: Assorted fixes for running on very small devices It's now possible to create and use a filesystem on a 512k device with 4k buckets (though at that size we still waste almost half to internal reserves) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:11 -04:00
Kent Overstreet	b092dadd55	bcachefs: Scale down number of writepoints when low on space this means we don't have to reserve space for them when calculating filesystem capacity Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:11 -04:00
Kent Overstreet	7a920560d7	bcachefs: kill struct bch_replicas_cpu_entry Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:10 -04:00
Kent Overstreet	581edb6341	bcachefs: mempoolify btree_trans Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:09 -04:00
Kent Overstreet	a9bec5208b	bcachefs: Better calculation of copygc threshold Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:08 -04:00
Kent Overstreet	b29e197aaf	bcachefs: Invalidate buckets when writing to alloc btree Prep work for persistent alloc information. Refactoring also lets us make free_inc much smaller, which means a lot fewer buckets stranded on freelists. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:08 -04:00
Kent Overstreet	b2be7c8b73	bcachefs: kill bucket mark sector count saturation Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:08 -04:00
Kent Overstreet	c692399529	bcachefs: don't call bch2_bucket_seq_cleanup from journal_buf_switch journal_buf_switch is called from the foreground when getting a journal reservation and thus is somewhat latency sensitive; bch2_bucket_seq_cleanup has to run infrequently but is a bit expensive when it does run. Call it from the journal write path instead, and punt the journal write to workqueue context. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:08 -04:00
Kent Overstreet	88c07f7397	bcachefs: Only check inode i_nlink during full fsck Now that all filesystem operations that manipulate the filesystem hierarchy and i_nlink are fully atomic, we can add a feature bit to indicate i_nlink doesn't need to be checked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:07 -04:00
Kent Overstreet	1c6fdbd8f2	bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:07 -04:00


360 Commits