mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-08-26 06:10:54 +00:00
master
288 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
3127f1010c |
f2fs: Prevent s_writer rw_sem count mismatch in f2fs_evict_inode
If f2fs_evict_inode is called between freeze_super and thaw_super, the s_writer rwsem count may become negative, resulting in hang. CPU1 CPU2 f2fs_resize_fs() f2fs_evict_inode() f2fs_freeze set SBI_IS_FREEZING skip sb_start_intwrite f2fs_unfreeze clear SBI_IS_FREEZING sb_end_intwrite To solve this problem, the call to sb_end_write is determined by whether sb_start_intwrite is called, rather than the current freezing status. Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com> Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
70d201a408 |
f2fs update for 6.8-rc1
In this series, we've some progress to support Zoned block device regarding to the power-cut recovery flow and enabling checkpoint=disable feature which is essential for Android OTA. Other than that, some patches touched sysfs entries and tracepoints which are minor, while several bug fixes on error handlers and compression flows are good to improve the overall stability. Enhancement: - enable checkpoint=disable for zoned block device - sysfs entries such as discard status, discard_io_aware, dir_level - tracepoints such as f2fs_vm_page_mkwrite(), f2fs_rename(), f2fs_new_inode() - use shared inode lock during f2fs_fiemap() and f2fs_seek_block() Bug fix: - address some power-cut recovery issues on zoned block device - handle errors and logics on do_garbage_collect(), f2fs_reserve_new_block(), f2fs_move_file_range(), f2fs_recover_xattr_data() - don't set FI_PREALLOCATED_ALL for partial write - fix to update iostat correctly in f2fs_filemap_fault() - fix to wait on block writeback for post_read case - fix to tag gcing flag on page during block migration - restrict max filesize for 16K f2fs - fix to avoid dirent corruption - explicitly null-terminate the xattr list There are also several clean-up patches to remove dead codes and better readability. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmWgMYcACgkQQBSofoJI UNJShxAAiYOXP7LPOAbPS1251BBgl8AIfs6u96hGTZkxOYsLHrBBbPbkWf3+nVbC JsBsVOe9K50rssK9kPg6XHPbmFGC8ERlyYcZTpONLfjtHOaQicbRnc//2qOvnCx8 JOKcMVkZyLU/HbOCoUW6mzNCQlOl0aAV8tRcb7jwAxT0HgpjHTHxej/62gRcPKzC 1E5w4iNTY//R97YGB36jPeGlKhbBZ7Ox1NM6AWadgE7B0j9rcYiBnPQllyeyaVVo XMCWRdl42tNMks2zgvU+vC41OrZ55bwLTQmVj3P1wnyKXig5/ZLQsrEcIGE+b2tP Mx+imCIRNYZqLwv5KYl6FU+KuLQGuZT1AjpP70Cb95WLyiYvVE6+xeiZg0fVTCEF 3Hg7lEqMtAEAh1NEmJyYmbiAm9KQ3vHyse9ix++tfm+Xvgqj8b2flmzAtIFKpCBV J+yFI+A55IYuYZt7gzPoZLkQL0tULPf80TKQrzwlnHNtZ6T6FK2Nunu+Urwf1/Th s5IulqHJZxHU/Bgd6yQZUVfDILcXTkqNCpO3+qLZMPZizlH1hXiJFTeVzS6mnGvZ sK2LL4rEJ8EhDHU1F0SJzCWJcuR8cQ/t2zKYUygo9LvHbtEM1bZwC1Bqfolt7NrU +pgiM2wnE9yjkPdfZN1JgYZDq0/lGvxPQ5NAc/5ERX71QonRyn8= =MQl3 -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs update from Jaegeuk Kim: "In this series, we've some progress to support Zoned block device regarding to the power-cut recovery flow and enabling checkpoint=disable feature which is essential for Android OTA. Other than that, some patches touched sysfs entries and tracepoints which are minor, while several bug fixes on error handlers and compression flows are good to improve the overall stability. Enhancements: - enable checkpoint=disable for zoned block device - sysfs entries such as discard status, discard_io_aware, dir_level - tracepoints such as f2fs_vm_page_mkwrite(), f2fs_rename(), f2fs_new_inode() - use shared inode lock during f2fs_fiemap() and f2fs_seek_block() Bug fixes: - address some power-cut recovery issues on zoned block device - handle errors and logics on do_garbage_collect(), f2fs_reserve_new_block(), f2fs_move_file_range(), f2fs_recover_xattr_data() - don't set FI_PREALLOCATED_ALL for partial write - fix to update iostat correctly in f2fs_filemap_fault() - fix to wait on block writeback for post_read case - fix to tag gcing flag on page during block migration - restrict max filesize for 16K f2fs - fix to avoid dirent corruption - explicitly null-terminate the xattr list There are also several clean-up patches to remove dead codes and better readability" * tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits) f2fs: show more discard status by sysfs f2fs: Add error handling for negative returns from do_garbage_collect f2fs: Constrain the modification range of dir_level in the sysfs f2fs: Use wait_event_freezable_timeout() for freezable kthread f2fs: fix to check return value of f2fs_recover_xattr_data f2fs: don't set FI_PREALLOCATED_ALL for partial write f2fs: fix to update iostat correctly in f2fs_filemap_fault() f2fs: fix to check compress file in f2fs_move_file_range() f2fs: fix to wait on block writeback for post_read case f2fs: fix to tag gcing flag on page during block migration f2fs: add tracepoint for f2fs_vm_page_mkwrite() f2fs: introduce f2fs_invalidate_internal_cache() for cleanup f2fs: update blkaddr in __set_data_blkaddr() for cleanup f2fs: introduce get_dnode_addr() to clean up codes f2fs: delete obsolete FI_DROP_CACHE f2fs: delete obsolete FI_FIRST_BLOCK_WRITTEN f2fs: Restrict max filesize for 16K f2fs f2fs: let's finish or reset zones all the time f2fs: check write pointers when checkpoint=disable f2fs: fix write pointers on zoned device after roll forward ... |
||
![]() |
2020cd48e4 |
f2fs: introduce get_dnode_addr() to clean up codes
Just cleanup, no logic changes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
a539363613 |
f2fs: delete obsolete FI_FIRST_BLOCK_WRITTEN
Commit |
||
![]() |
af7628d6ec |
fs: convert error_remove_page to error_remove_folio
There were already assertions that we were not passing a tail page to error_remove_page(), so make the compiler enforce that by converting everything to pass and use a folio. Link: https://lkml.kernel.org/r/20231117161447.2461643-7-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
aea6bf908d |
f2fs update for 6.7-rc1
In this cycle, we introduce a bigger page size support by changing the internal f2fs's block size aligned to the page size. We also continue to improve zoned block device support regarding the power off recovery. As usual, there are some bug fixes regarding the error handling routines in compression and ioctl. Enhancement: - Support Block Size == Page Size - let f2fs_precache_extents() traverses in file range - stop iterating f2fs_map_block if hole exists - preload extent_cache for POSIX_FADV_WILLNEED - compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file() Bug fix: - do not return EFSCORRUPTED, but try to run online repair - finish previous checkpoints before returning from remount - fix error handling of __get_node_page and __f2fs_build_free_nids - clean up zones when not successfully unmounted - fix to initialize map.m_pblk in f2fs_precache_extents() - fix to drop meta_inode's page cache in f2fs_put_super() - set the default compress_level on ioctl - fix to avoid use-after-free on dic - fix to avoid redundant compress extension - do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on - fix deadloop in f2fs_write_cache_pages() -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmVGlP4ACgkQQBSofoJI UNKahg/7BuiBi2+NZ93WeUoRURBNa4H1cK93GtyE4rPMdEjVPujxonnqzC/N2nx5 1ZjBLEjV0mCXfqlm+D+ORkZDIeCO+PcPI+vLjjHG+qhphWsf9wUJnWIUf+LhmXDP p0ertHlZrSONG+HsbXl6BOEq/LWYTe6WDOwY0MMO4U7IxHVyKUuFJSL6DQRRL2r2 Op4/NOes/cvj8dXvUZtyF3tQHyhkTbdPl8pfFtmagLlujC2WOuySKlfHUF8pxdv4 RT3TER66Rs8IStrFtyuv6yHHtvpfl8jTuJ1DNaNphBCi2RlERvx7+zY3iyLIZgIZ zKxDkGUIb/UnKmoipiu/ZkUvJ6wi2IfP4C5hffAHBsxwyVlrSldCXioE4j5Ysbcm pOWjgB7ASyGVU6yxyUpoQUlW5oztPwqkjnhfeXu6cgHOLRWdw224hJ6bA4XU3E61 R2rdaNrpGICwl3juFPxH7uHxNjnPTJB1G38LJApAypkoHoa/et5foZtK0pwSxMD+ j8ZkczJvdAgkk6fbFzPA1Urg+6Qhx8SpDeNxCXIS/F7izPZhkvyZ45SBuWboJXld 8rzel9JzswdB2dpeXm0VVofwBg9l8CG5b7mwzmgHpmfD2MjdstLKp9aO5G2SUSJt BKArSUfipzQhljv0xhous0OQitKLhg83o6+KMAXw3OpUlmWGLV4= =U4Ke -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this cycle, we introduce a bigger page size support by changing the internal f2fs's block size aligned to the page size. We also continue to improve zoned block device support regarding the power off recovery. As usual, there are some bug fixes regarding the error handling routines in compression and ioctl. Enhancements: - Support Block Size == Page Size - let f2fs_precache_extents() traverses in file range - stop iterating f2fs_map_block if hole exists - preload extent_cache for POSIX_FADV_WILLNEED - compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file() Bug fixes: - do not return EFSCORRUPTED, but try to run online repair - finish previous checkpoints before returning from remount - fix error handling of __get_node_page and __f2fs_build_free_nids - clean up zones when not successfully unmounted - fix to initialize map.m_pblk in f2fs_precache_extents() - fix to drop meta_inode's page cache in f2fs_put_super() - set the default compress_level on ioctl - fix to avoid use-after-free on dic - fix to avoid redundant compress extension - do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on - fix deadloop in f2fs_write_cache_pages()" * tag 'f2fs-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: finish previous checkpoints before returning from remount f2fs: fix error handling of __get_node_page f2fs: do not return EFSCORRUPTED, but try to run online repair f2fs: fix error path of __f2fs_build_free_nids f2fs: Clean up errors in segment.h f2fs: clean up zones when not successfully unmounted f2fs: let f2fs_precache_extents() traverses in file range f2fs: avoid format-overflow warning f2fs: fix to initialize map.m_pblk in f2fs_precache_extents() f2fs: Support Block Size == Page Size f2fs: stop iterating f2fs_map_block if hole exists f2fs: preload extent_cache for POSIX_FADV_WILLNEED f2fs: set the default compress_level on ioctl f2fs: compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file() f2fs: fix to drop meta_inode's page cache in f2fs_put_super() f2fs: split initial and dynamic conditions for extent_cache f2fs: compress: fix to avoid redundant compress extension f2fs: compress: do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on f2fs: compress: fix to avoid use-after-free on dic f2fs: compress: fix deadloop in f2fs_write_cache_pages() |
||
![]() |
11cc6426ad
|
f2fs: convert to new timestamp accessors
Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-34-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
![]() |
d7e9a9037d |
f2fs: Support Block Size == Page Size
This allows f2fs to support cases where the block size = page size for both 4K and 16K block sizes. Other sizes should work as well, should the need arise. This does not currently support 4K Block size filesystems if the page size is 16K. Signed-off-by: Daniel Rosenberg <drosen@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
92901222f8 |
f2fs update for 6.6-rc1
In this cycle, we don't have a highlighted feature enhancement, but mostly have fixed issues mainly in two parts: 1) zoned block device, 2) compression support. For zoned block device, we've tried to improve the power-off recovery flow as much as possible. For compression, we found some corner cases caused by wrong compression policy and logics. Other than them, there were some reverts and stat corrections. Bug fix: - use finish zone command when closing a zone - check zone type before sending async reset zone command - fix to assign compress_level for lz4 correctly - fix error path of f2fs_submit_page_read() - don't {,de}compress non-full cluster - send small discard commands during checkpoint back - flush inode if atomic file is aborted - correct to account gc/cp stats And, there are minor bug fixes, avoiding false lockdep warning, and clean-ups. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmTyemoACgkQQBSofoJI UNLLKQ//bupYGPOqAgKbd/s7FhtULMiiRmFVy7W2eoMIc/oeeXOGrzDAF/1NifLC WLV4uBNVTS4PS8D1vRzxZNEZt9aqPS0vQ8hxW/3nTI9Z425NX3nz7gLSxxmwIkIe xj++V6tvKPcCH0BfKvfFCtcxj09PsflgdEuT8w/sIkH6p4o+VEMFs1Lc9PQsjUmh epznK7JGBwpAxmHqI74n1eAw2CI6W+oKx23YDTNMBD6hmXTU0fkTeBURrOlSsUHZ nhafPecsrCEI+OpAj03G/7e/zt+iTUKdmHx9O5ir/P00vF/c+SU2vSwB97FiHqBi B4UmocTM0MAsU80PQcmE6aU3zgQFI0Yun5yZ24VeWjKTu76ssZSmT2HA/4RL+LLf AeAW4FSyfh76pls8X5IWfilxGLWq6kTzSZA0MF7dH2q7qlj5apL5wKpm/XH6POqn qELY/Y9+P1QuCcNL8BiYrgA5xBqVJ7Uw/6/6U3Y77PElc+Pwl3vI8UZ7uCOBrsXL e0TLXy23AJA6AS2DyLLziy669nXAZRb95B8TWMfEeVZIMFvCeeqYc74N8jOFa0T8 q6uQFZs+0cETLZA8MSZdlNhzvhJmbW6wgSIz++CEdikWSLBZMKWxBVjCPkkCY9uc DMh8zruSVbYPZWBTcxkMFEBJKKrU43++e7pb8ZoqTj4Pq1317b0= =Qa8+ -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this cycle, we don't have a highlighted feature enhancement, but mostly have fixed issues mainly in two parts: 1) zoned block device, and 2) compression support. For zoned block device, we've tried to improve the power-off recovery flow as much as possible. For compression, we found some corner cases caused by wrong compression policy and logics. Other than them, there were some reverts and stat corrections. Bug fixes: - use finish zone command when closing a zone - check zone type before sending async reset zone command - fix to assign compress_level for lz4 correctly - fix error path of f2fs_submit_page_read() - don't {,de}compress non-full cluster - send small discard commands during checkpoint back - flush inode if atomic file is aborted - correct to account gc/cp stats And, there are minor bug fixes, avoiding false lockdep warning, and clean-ups" * tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (25 commits) f2fs: use finish zone command when closing a zone f2fs: compress: fix to assign compress_level for lz4 correctly f2fs: fix error path of f2fs_submit_page_read() f2fs: clean up error handling in sanity_check_{compress_,}inode() f2fs: avoid false alarm of circular locking Revert "f2fs: do not issue small discard commands during checkpoint" f2fs: doc: fix description of max_small_discards f2fs: should update REQ_TIME for direct write f2fs: fix to account cp stats correctly f2fs: fix to account gc stats correctly f2fs: remove unneeded check condition in __f2fs_setxattr() f2fs: fix to update i_ctime in __f2fs_setxattr() Revert "f2fs: fix to do sanity check on extent cache correctly" f2fs: increase usage of folio_next_index() helper f2fs: Only lfs mode is allowed with zoned block device feature f2fs: check zone type before sending async reset zone command f2fs: compress: don't {,de}compress non-full cluster f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted f2fs: don't reopen the main block device in f2fs_scan_devices f2fs: fix to avoid mmap vs set_compress_option case ... |
||
![]() |
c988794984 |
f2fs: clean up error handling in sanity_check_{compress_,}inode()
In sanity_check_{compress_,}inode(), it doesn't need to set SBI_NEED_FSCK in each error case, instead, we can set the flag in do_read_inode() only once when sanity_check_inode() fails. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
958ccbbf1c |
Revert "f2fs: fix to do sanity check on extent cache correctly"
syzbot reports a f2fs bug as below: UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19 index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]') Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348 inline_data_addr fs/f2fs/f2fs.h:3275 [inline] __recover_inline_status fs/f2fs/inode.c:113 [inline] do_read_inode fs/f2fs/inode.c:480 [inline] f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604 f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601 mount_bdev+0x276/0x3b0 fs/super.c:1391 legacy_get_tree+0xef/0x190 fs/fs_context.c:611 vfs_get_tree+0x8c/0x270 fs/super.c:1519 do_new_mount+0x28f/0xae0 fs/namespace.c:3335 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The issue was bisected to: commit |
||
![]() |
c62ebd3501 |
f2fs: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230705190309.579783-41-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
![]() |
f240d3aaf5 |
f2fs: do more sanity check on inode
There are several issues in sanity_check_inode(): - The code looks not clean, it checks extra_attr related condition dispersively. - It missed to check i_extra_isize w/ lower boundary - It missed to check feature dependency: prjquota, inode_chksum, inode_crtime, compression features rely on extra_attr feature. - It's not necessary to check i_extra_isize due to it will only be assigned to non-zero value if f2fs_has_extra_attr() is true in do_read_inode(). Fix them all in this patch. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
64ee9163fe |
f2fs: compress: fix to check validity of i_compress_flag field
The last valid compress related field is i_compress_flag, check its validity instead of i_log_cluster_size. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
698a5c8c8e |
f2fs: add sanity compress level check for compressed file
Commit
|
||
![]() |
c277991d7c |
f2fs: remove unneeded in-memory i_crtime copy
i_crtime will never change after inode creation, so we don't need to copy it into f2fs_inode_info.i_disk_time[3], and monitor its change to decide whether updating inode page, remove related stuff. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
447286ebad |
f2fs: convert to use bitmap API
Let's use BIT() and GENMASK() instead of open it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
a46bebd502 |
f2fs: synchronize atomic write aborts
To fix a race condition between atomic write aborts, I use the inode
lock and make COW inode to be re-usable thoroughout the whole
atomic file inode lifetime.
Reported-by: syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com
Fixes:
|
||
![]() |
269d119481 |
f2fs: fix to do sanity check on extent cache correctly
In do_read_inode(), sanity check for extent cache should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes:
|
||
![]() |
3aa51c61cb |
f2fs: retry to update the inode page given data corruption
If the storage gives a corrupted node block due to short power failure and reset, f2fs stops the entire operations by setting the checkpoint failure flag. Let's give more chances to live by re-issuing IOs for a while in such critical path. Cc: stable@vger.kernel.org Suggested-by: Randall Huang <huangrandall@google.com> Suggested-by: Chao Yu <chao@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
b90e5086df |
f2fs: clean up i_compress_flag and i_compress_level usage
.i_compress_level was introduced by commit
|
||
![]() |
d48a7b3a72 |
f2fs: fix to do sanity check on extent cache correctly
In do_read_inode(), sanity_check_inode() should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes:
|
||
![]() |
c40e15a9a5 |
f2fs: merge f2fs_show_injection_info() into time_to_inject()
There is no need to additionally use f2fs_show_injection_info() to output information. Concatenate time_to_inject() and __time_to_inject() via a macro. In the new __time_to_inject() function, pass in the caller function name and parent function. In this way, we no longer need the f2fs_show_injection_info() function, and let's remove it. Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
71644dff48 |
f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method for f2fs, in order to improve the accuracy for data temperature classification, reduce the garbage collection overhead after long-term data updates. Enhanced hot/cold data separation can record data block update frequency as "age" of the extent per inode, and take use of the age info to indicate better temperature type for data block allocation: - It records total data blocks allocated since mount; - When file extent has been updated, it calculate the count of data blocks allocated since last update as the age of the extent; - Before the data block allocated, it searches for the age info and chooses the suitable segment for allocation. Test and result: - Prepare: create about 30000 files * 3% for cold files (with cold file extension like .apk, from 3M to 10M) * 50% for warm files (with random file extension like .FcDxq, from 1K to 4M) * 47% for hot files (with hot file extension like .db, from 1K to 256K) - create(5%)/random update(90%)/delete(5%) the files * total write amount is about 70G * fsync will be called for .db files, and buffered write will be used for other files The storage of test device is large enough(128G) so that it will not switch to SSR mode during the test. Benefit: dirty segment count increment reduce about 14% - before: Dirty +21110 - after: Dirty +18286 Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Signed-off-by: xiongping1 <xiongping1@xiaomi.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
72840cccc0 |
f2fs: allocate the extent_cache by default
Let's allocate it to remove the runtime complexity. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
e7547daccd |
f2fs: refactor extent_cache to support for read and more
This patch prepares extent_cache to be ready for addition. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
12607c1ba7 |
f2fs: specify extent cache for read explicitly
Let's descrbie it's read extent cache. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
4d8d45df22 |
f2fs: correct i_size change for atomic writes
We need to make sure i_size doesn't change until atomic write commit is successful and restore it when commit is failed. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
95fa90c9e5 |
f2fs: support recording errors into superblock
This patch supports to record detail reason of FSCORRUPTED error into f2fs_super_block.s_errors[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
a9cfee0ef9 |
f2fs: support recording stop_checkpoint reason into super_block
This patch supports to record stop_checkpoint error into f2fs_super_block.s_stop_reason[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
fcc2d8cc96 |
f2fs: fix to detect corrupted meta ino
It is possible that ino of dirent or orphan inode is corrupted in a fuzzed image, occasionally, if corrupted ino is equal to meta ino: meta_ino, node_ino or compress_ino, caller of f2fs_iget() from below call paths will get meta inode directly, it's not allowed, let's add sanity check to detect such cases. case #1 - recover_dentry - __f2fs_find_entry - f2fs_iget_retry case #2 - recover_orphan_inode - f2fs_iget_retry Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
049ea86cb5 |
f2fs: add static init_idisk_time function to reduce the code
We can use a inner function to init the disk time of f2fs_inode_info for cleaning redundant code. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
e53f864347 |
f2fs: clean up f2fs_abort_atomic_write()
f2fs_abort_atomic_write() has checked whether current inode is atomic_write one or not, it's redundant to check in its caller, remove it for cleanup. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
054cb2891b |
f2fs: replace F2FS_I(inode) and sbi by the local variable
We have define 'fi' at the begin of the functions, just use it, rather than use F2FS_I(inode) again. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> [Jaegeuk Kim: replace sbi] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
677a82b44e |
f2fs: fix to do sanity check for inline inode
Yanming reported a kernel bug in Bugzilla kernel [1], which can be reproduced. The bug message is: The kernel message is shown below: kernel BUG at fs/inode.c:611! Call Trace: evict+0x282/0x4e0 __dentry_kill+0x2b2/0x4d0 dput+0x2dd/0x720 do_renameat2+0x596/0x970 __x64_sys_rename+0x78/0x90 do_syscall_64+0x3b/0x90 [1] https://bugzilla.kernel.org/show_bug.cgi?id=215895 The bug is due to fuzzed inode has both inline_data and encrypted flags. During f2fs_evict_inode(), as the inode was deleted by rename(), it will cause inline data conversion due to conflicting flags. The page cache will be polluted and the panic will be triggered in clear_inode(). Try fixing the bug by doing more sanity checks for inline data inode in sanity_check_inode(). Cc: stable@vger.kernel.org Reported-by: Ming Yan <yanming@tju.edu.cn> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
3db1de0e58 |
f2fs: change the current atomic write way
Current atomic write has three major issues like below. - keeps the updates in non-reclaimable memory space and they are even hard to be migrated, which is not good for contiguous memory allocation. - disk spaces used for atomic files cannot be garbage collected, so this makes it difficult for the filesystem to be defragmented. - If atomic write operations hit the threshold of either memory usage or garbage collection failure count, All the atomic write operations will fail immediately. To resolve the issues, I will keep a COW inode internally for all the updates to be flushed from memory, when we need to flush them out in a situation like high memory pressure. These COW inodes will be tagged as orphan inodes to be reclaimed in case of sudden power-cut or system failure during atomic writes. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
f2db71053d |
f2fs: fix to clear dirty inode in f2fs_evict_inode()
As Yanming reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215904 The kernel message is shown below: kernel BUG at fs/f2fs/inode.c:825! Call Trace: evict+0x282/0x4e0 __dentry_kill+0x2b2/0x4d0 shrink_dentry_list+0x17c/0x4f0 shrink_dcache_parent+0x143/0x1e0 do_one_tree+0x9/0x30 shrink_dcache_for_umount+0x51/0x120 generic_shutdown_super+0x5c/0x3a0 kill_block_super+0x90/0xd0 kill_f2fs_super+0x225/0x310 deactivate_locked_super+0x78/0xc0 cleanup_mnt+0x2b7/0x480 task_work_run+0xc8/0x150 exit_to_user_mode_prepare+0x14a/0x150 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0x90 The root cause is: inode node and dnode node share the same nid, so during f2fs_evict_inode(), dnode node truncation will invalidate its NAT entry, so when truncating inode node, it fails due to invalid NAT entry, result in inode is still marked as dirty, fix this issue by clearing dirty for inode and setting SBI_NEED_FSCK flag in filesystem. output from dump.f2fs: [print_node_info: 354] Node ID [0xf:15] is inode i_nid[0] [0x f : 15] Cc: stable@vger.kernel.org Reported-by: Ming Yan <yanming@tju.edu.cn> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
4d8ec91208 |
f2fs: should not truncate blocks during roll-forward recovery
If the file preallocated blocks and fsync'ed, we should not truncate them during
roll-forward recovery which will recover i_size correctly back.
Fixes:
|
||
![]() |
ba900534f8 |
f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
Let's purge inode cache in order to avoid the below deadlock. [freeze test] shrinkder freeze_super - pwercpu_down_write(SB_FREEZE_FS) - super_cache_scan - down_read(&sb->s_umount) - prune_icache_sb - dispose_list - evict - f2fs_evict_inode thaw_super - down_write(&sb->s_umount); - __percpu_down_read(SB_FREEZE_FS) Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
2fef99b837 |
f2fs: fix missing free nid in f2fs_handle_failed_inode
This patch fixes xfstests/generic/475 failure.
[ 293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
[ 293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[ 293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[ 293.691764] sh (7615): drop_caches: 3
[ 293.691819] sh (7616): drop_caches: 3
[ 293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
[ 293.695659] sh (7618): drop_caches: 3
[ 293.696979] sh (7617): drop_caches: 3
[ 293.700290] sh (7623): drop_caches: 3
[ 293.708621] sh (7626): drop_caches: 3
[ 293.711386] sh (7628): drop_caches: 3
[ 293.711825] sh (7627): drop_caches: 3
[ 293.716738] sh (7630): drop_caches: 3
[ 293.719613] sh (7632): drop_caches: 3
[ 293.720971] sh (7633): drop_caches: 3
[ 293.727741] sh (7634): drop_caches: 3
[ 293.730783] sh (7636): drop_caches: 3
[ 293.732681] sh (7635): drop_caches: 3
[ 293.732988] sh (7637): drop_caches: 3
[ 293.738836] sh (7639): drop_caches: 3
[ 293.740568] sh (7641): drop_caches: 3
[ 293.743053] sh (7640): drop_caches: 3
[ 293.821889] ------------[ cut here ]------------
[ 293.824654] kernel BUG at fs/f2fs/node.c:3334!
[ 293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G OE 5.17.0-rc1-custom #1
[ 293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
[ 293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
[ 293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
[ 293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
[ 293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
[ 293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
[ 293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
[ 293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
[ 293.847855] FS: 00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
[ 293.850647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
[ 293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 293.858380] Call Trace:
[ 293.859302] <TASK>
[ 293.860311] ? ttwu_do_wakeup+0x1c/0x170
[ 293.861800] ? ttwu_do_activate+0x6d/0xb0
[ 293.863057] ? _raw_spin_unlock_irqrestore+0x29/0x40
[ 293.864411] ? try_to_wake_up+0x9d/0x5e0
[ 293.865618] ? debug_smp_processor_id+0x17/0x20
[ 293.866934] ? debug_smp_processor_id+0x17/0x20
[ 293.868223] ? free_unref_page+0xbf/0x120
[ 293.869470] ? __free_slab+0xcb/0x1c0
[ 293.870614] ? preempt_count_add+0x7a/0xc0
[ 293.871811] ? __slab_free+0xa0/0x2d0
[ 293.872918] ? __wake_up_common_lock+0x8a/0xc0
[ 293.874186] ? __slab_free+0xa0/0x2d0
[ 293.875305] ? free_inode_nonrcu+0x20/0x20
[ 293.876466] ? free_inode_nonrcu+0x20/0x20
[ 293.877650] ? debug_smp_processor_id+0x17/0x20
[ 293.878949] ? call_rcu+0x11a/0x240
[ 293.880060] ? f2fs_destroy_stats+0x59/0x60 [f2fs]
[ 293.881437] ? kfree+0x1fe/0x230
[ 293.882674] f2fs_put_super+0x160/0x390 [f2fs]
[ 293.883978] generic_shutdown_super+0x7a/0x120
[ 293.885274] kill_block_super+0x27/0x50
[ 293.886496] kill_f2fs_super+0x7f/0x100 [f2fs]
[ 293.887806] deactivate_locked_super+0x35/0xa0
[ 293.889271] deactivate_super+0x40/0x50
[ 293.890513] cleanup_mnt+0x139/0x190
[ 293.891689] __cleanup_mnt+0x12/0x20
[ 293.892850] task_work_run+0x64/0xa0
[ 293.894035] exit_to_user_mode_prepare+0x1b7/0x1c0
[ 293.895409] syscall_exit_to_user_mode+0x27/0x50
[ 293.896872] do_syscall_64+0x48/0xc0
[ 293.898090] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 293.899517] RIP: 0033:0x7f5a975cd25b
Fixes:
|
||
![]() |
1d1df41c5a |
f2fs-for-5.17-rc1
In this round, we've tried to address some performance issues in f2fs_checkpoint and direct IO flows. Also, there was a work to enhance the page cache management used for compression. Other than them, we've done typical work including sysfs, code clean-ups, tracepoint, sanity check, in addition to bug fixes on corner cases. Enhancement: - use iomap for direct IO - try to avoid lock contention to improve f2fs_ckpt speed - avoid unnecessary memory allocation in compression flow - POSIX_FADV_DONTNEED drops the page cache containing compression pages - add some sysfs entries (gc_urgent_high_remaining, pending_discard) Bug fix: - try not to expose unwritten blocks to user by DIO : this was added to avoid merge conflict; another patch is coming to address other missing case. - relax minor error condition for file pinning feature used in Android OTA - fix potential deadlock case in compression flow - should not truncate any block on pinned file In addition, we've done some code clean-ups and tracepoint/sanity check improvement. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmHnY0sACgkQQBSofoJI UNIOkg//UmjCSSG63/YZM/lQQQe4kK/tT6QTT8W/VQtzWL9vXcL7bcaxzwX3LQbR Gb47Zmsw9bzVJt6GQ2VRbODE1py/KPNMl5SDXJXHo6fOZ/dOnHve32gLwcLEzhPd casB0TbwQJ6bpEsJiZ5ho741mURxUrSCHAAX6QIQVXh8ofm9qAqlWu74OLI6UHiV MM84XmXcHtGUZG5SCTWfSCJhJM6Az/3A83ws9KVeu86dlE7IrigphU2nI2vdCKiO trR3CiLC/364fiM+9ssLS3X2wKFPD/unEU7ljBv5UaG36jsVfW+tisjTKldzpiKK 44cNgDv1FEDxC0g3FKUhEGezAhxT8AJZB0in0zn8+5scarKGJtFCy9XhCGMVaeP+ usxvHVy8Ga1I7sMV6oHEBcGiPJWkmurzq1XXobtj6oL/JxN4gqUJeHTcod89hQHA lx9kZs7MLKm2au+T3gZf5xyx35YCie8sY/N1qoPy8tU9Q7FJ54NdqqAc9JEZ6mSk k9ybMaa/srHG/EI/XYPw0DrobHg6P5+bYtmsRvw2vP/nsNsD3ZI/EwBBEll2ITxC V5Dn7MljYWI/5kB41Hl5xz6X65WeIN7koRyTXw5mp9tkNrLugqII5hzhwhSlcqJ1 3k9TAN3RbVpWHBcyryDyLbm/+dcbwIJ4v/eJEMIDk8F2SrBGOZs= =LCJH -----END PGP SIGNATURE----- Merge tag 'f2fs-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've tried to address some performance issues in f2fs_checkpoint and direct IO flows. Also, there was a work to enhance the page cache management used for compression. Other than them, we've done typical work including sysfs, code clean-ups, tracepoint, sanity check, in addition to bug fixes on corner cases. Enhancements: - use iomap for direct IO - try to avoid lock contention to improve f2fs_ckpt speed - avoid unnecessary memory allocation in compression flow - POSIX_FADV_DONTNEED drops the page cache containing compression pages - add some sysfs entries (gc_urgent_high_remaining, pending_discard) Bug fixes: - try not to expose unwritten blocks to user by DIO (this was added to avoid merge conflict; another patch is coming to address other missing case) - relax minor error condition for file pinning feature used in Android OTA - fix potential deadlock case in compression flow - should not truncate any block on pinned file In addition, we've done some code clean-ups and tracepoint/sanity check improvement" * tag 'f2fs-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits) f2fs: do not allow partial truncation on pinned file f2fs: remove redunant invalidate compress pages f2fs: Simplify bool conversion f2fs: don't drop compressed page cache in .{invalidate,release}page f2fs: fix to reserve space for IO align feature f2fs: fix to check available space of CP area correctly in update_ckpt_flags() f2fs: support fault injection to f2fs_trylock_op() f2fs: clean up __find_inline_xattr() with __find_xattr() f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr() f2fs: do not bother checkpoint by f2fs_get_node_info f2fs: avoid down_write on nat_tree_lock during checkpoint f2fs: compress: fix potential deadlock of compress file f2fs: avoid EINVAL by SBI_NEED_FSCK when pinning a file f2fs: add gc_urgent_high_remaining sysfs node f2fs: fix to do sanity check in is_alive() f2fs: fix to avoid panic in is_alive() if metadata is inconsistent f2fs: fix to do sanity check on inode type during garbage collection f2fs: avoid duplicate call of mark_inode_dirty f2fs: show number of pending discard commands f2fs: support POSIX_FADV_DONTNEED drop compressed page cache ... |
||
![]() |
4034247a0d |
mm: introduce memalloc_retry_wait()
Various places in the kernel - largely in filesystems - respond to a memory allocation failure by looping around and re-trying. Some of these cannot conveniently use __GFP_NOFAIL, for reasons such as: - a GFP_ATOMIC allocation, which __GFP_NOFAIL doesn't work on - a need to check for the process being signalled between failures - the possibility that other recovery actions could be performed - the allocation is quite deep in support code, and passing down an extra flag to say if __GFP_NOFAIL is wanted would be clumsy. Many of these currently use congestion_wait() which (in almost all cases) simply waits the given timeout - congestion isn't tracked for most devices. It isn't clear what the best delay is for loops, but it is clear that the various filesystems shouldn't be responsible for choosing a timeout. This patch introduces memalloc_retry_wait() with takes on that responsibility. Code that wants to retry a memory allocation can call this function passing the GFP flags that were used. It will wait however is appropriate. For now, it only considers __GFP_NORETRY and whatever gfpflags_allow_blocking() tests. If blocking is allowed without __GFP_NORETRY, then alloc_page either made some reclaim progress, or waited for a while, before failing. So there is no need for much further waiting. memalloc_retry_wait() will wait until the current jiffie ends. If this condition is not met, then alloc_page() won't have waited much if at all. In that case memalloc_retry_wait() waits about 200ms. This is the delay that most current loops uses. linux/sched/mm.h needs to be included in some files now, but linux/backing-dev.h does not. Link: https://lkml.kernel.org/r/163754371968.13692.1277530886009912421@noble.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Michal Hocko <mhocko@suse.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
2b642898e5 |
f2fs: remove redunant invalidate compress pages
Compress page will invalidate in truncate block process too, so remove redunant invalidate compress pages in f2fs_evict_inode. Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
a9419b63bf |
f2fs: do not bother checkpoint by f2fs_get_node_info
This patch tries to mitigate lock contention between f2fs_write_checkpoint and f2fs_get_node_info along with nat_tree_lock. The idea is, if checkpoint is currently running, other threads that try to grab nat_tree_lock would be better to wait for checkpoint. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
d1917865a7 |
f2fs: fix remove page failed in invalidate compress pages
Since compress inode not a regular file, generic_error_remove_page in
f2fs_invalidate_compress_pages will always be failed, set compress
inode as a regular file to fix it.
Fixes:
|
||
![]() |
d4dd19ec1e |
f2fs: do not expose unwritten blocks to user by DIO
DIO preallocates physical blocks before writing data, but if an error occurrs or power-cut happens, we can see block contents from the disk. This patch tries to fix it by 1) turning to buffered writes for DIO into holes, 2) truncating unwritten blocks from error or power-cut. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
10a2687856 |
f2fs: support fault injection for dquot_initialize()
This patch adds a new function f2fs_dquot_initialize() to wrap dquot_initialize(), and it supports to inject fault into f2fs_dquot_initialize() to simulate inner failure occurs in dquot_initialize(). Usage: a) echo 65536 > /sys/fs/f2fs/<dev>/inject_type or b) mount -o fault_type=65536 <dev> <mountpoint> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
92d602bc71 |
f2fs: should use GFP_NOFS for directory inodes
We use inline_dentry which requires to allocate dentry page when adding a link.
If we allow to reclaim memory from filesystem, we do down_read(&sbi->cp_rwsem)
twice by f2fs_lock_op(). I think this should be okay, but how about stopping
the lockdep complaint [1]?
f2fs_create()
- f2fs_lock_op()
- f2fs_do_add_link()
- __f2fs_find_entry
- f2fs_get_read_data_page()
-> kswapd
- shrink_node
- f2fs_evict_inode
- f2fs_lock_op()
[1]
fs_reclaim
){+.+.}-{0:0}
:
kswapd0: lock_acquire+0x114/0x394
kswapd0: __fs_reclaim_acquire+0x40/0x50
kswapd0: prepare_alloc_pages+0x94/0x1ec
kswapd0: __alloc_pages_nodemask+0x78/0x1b0
kswapd0: pagecache_get_page+0x2e0/0x57c
kswapd0: f2fs_get_read_data_page+0xc0/0x394
kswapd0: f2fs_find_data_page+0xa4/0x23c
kswapd0: find_in_level+0x1a8/0x36c
kswapd0: __f2fs_find_entry+0x70/0x100
kswapd0: f2fs_do_add_link+0x84/0x1ec
kswapd0: f2fs_mkdir+0xe4/0x1e4
kswapd0: vfs_mkdir+0x110/0x1c0
kswapd0: do_mkdirat+0xa4/0x160
kswapd0: __arm64_sys_mkdirat+0x24/0x34
kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
kswapd0: do_el0_svc+0x28/0xa0
kswapd0: el0_svc+0x24/0x38
kswapd0: el0_sync_handler+0x88/0xec
kswapd0: el0_sync+0x1c0/0x200
kswapd0:
-> #1
(
&sbi->cp_rwsem
){++++}-{3:3}
:
kswapd0: lock_acquire+0x114/0x394
kswapd0: down_read+0x7c/0x98
kswapd0: f2fs_do_truncate_blocks+0x78/0x3dc
kswapd0: f2fs_truncate+0xc8/0x128
kswapd0: f2fs_evict_inode+0x2b8/0x8b8
kswapd0: evict+0xd4/0x2f8
kswapd0: iput+0x1c0/0x258
kswapd0: do_unlinkat+0x170/0x2a0
kswapd0: __arm64_sys_unlinkat+0x4c/0x68
kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
kswapd0: do_el0_svc+0x28/0xa0
kswapd0: el0_svc+0x24/0x38
kswapd0: el0_sync_handler+0x88/0xec
kswapd0: el0_sync+0x1c0/0x200
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
6ce19aff0b |
f2fs: compress: add compress_inode to cache compressed blocks
Support to use address space of inner inode to cache compressed block, in order to improve cache hit ratio of random read. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
b763f3bedc |
f2fs: restructure f2fs page.private layout
Restruct f2fs page private layout for below reasons: There are some cases that f2fs wants to set a flag in a page to indicate a specified status of page: a) page is in transaction list for atomic write b) page contains dummy data for aligned write c) page is migrating for GC d) page contains inline data for inline inode flush e) page belongs to merkle tree, and is verified for fsverity f) page is dirty and has filesystem/inode reference count for writeback g) page is temporary and has decompress io context reference for compression There are existed places in page structure we can use to store f2fs private status/data: - page.flags: PG_checked, PG_private - page.private However it was a mess when we using them, which may cause potential confliction: page.private PG_private PG_checked page._refcount (+1 at most) a) -1 set +1 b) -2 set c), d), e) set f) 0 set +1 g) pointer set The other problem is page.flags has no free slot, if we can avoid set zero to page.private and set PG_private flag, then we use non-zero value to indicate PG_private status, so that we may have chance to reclaim PG_private slot for other usage. [1] The other concern is f2fs has bad scalability in aspect of indicating more page status. So in this patch, let's restructure f2fs' page.private as below to solve above issues: Layout A: lowest bit should be 1 | bit0 = 1 | bit1 | bit2 | ... | bit MAX | private data .... | bit 0 PAGE_PRIVATE_NOT_POINTER bit 1 PAGE_PRIVATE_ATOMIC_WRITE bit 2 PAGE_PRIVATE_DUMMY_WRITE bit 3 PAGE_PRIVATE_ONGOING_MIGRATION bit 4 PAGE_PRIVATE_INLINE_INODE bit 5 PAGE_PRIVATE_REF_RESOURCE bit 6- f2fs private data Layout B: lowest bit should be 0 page.private is a wrapped pointer. After the change: page.private PG_private PG_checked page._refcount (+1 at most) a) 11 set +1 b) 101 set +1 c) 1001 set +1 d) 10001 set +1 e) set f) 100001 set +1 g) pointer set +1 [1] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org/T/#u Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
5f029c045c |
f2fs: clean up build warnings
This patch combined the below three clean-up patches. - modify open brace '{' following function definitions - ERROR: spaces required around that ':' - ERROR: spaces required before the open parenthesis '(' - ERROR: spaces prohibited before that ',' - Made suggested modifications from checkpatch in reference to WARNING: Missing a blank line after declarations Signed-off-by: Yi Zhuang <zhuangyi1@huawei.com> Signed-off-by: Jia Yang <jiayang5@huawei.com> Signed-off-by: Jack Qiu <jack.qiu@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
2c718feead |
f2fs: fix a typo in inode.c
Do a trivial typo fix. s/runing/running Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
b28f047b28 |
f2fs: compress: support chksum
This patch supports to store chksum value with compressed data, and verify the integrality of compressed data while reading the data. The feature can be enabled through specifying mount option 'compress_chksum'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
d662fad143 |
f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
If compressed inode has inconsistent fields on i_compress_algorithm, i_compr_blocks and i_log_cluster_size, we missed to set SBI_NEED_FSCK to notice fsck to repair the inode, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
f6322f3f12 |
f2fs: reject CASEFOLD inode flag without casefold feature
syzbot reported:
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 6860 Comm: syz-executor835 Not tainted 5.9.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:utf8_casefold+0x43/0x1b0 fs/unicode/utf8-core.c:107
[...]
Call Trace:
f2fs_init_casefolded_name fs/f2fs/dir.c:85 [inline]
__f2fs_setup_filename fs/f2fs/dir.c:118 [inline]
f2fs_prepare_lookup+0x3bf/0x640 fs/f2fs/dir.c:163
f2fs_lookup+0x10d/0x920 fs/f2fs/namei.c:494
__lookup_hash+0x115/0x240 fs/namei.c:1445
filename_create+0x14b/0x630 fs/namei.c:3467
user_path_create fs/namei.c:3524 [inline]
do_mkdirat+0x56/0x310 fs/namei.c:3664
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]
The problem is that an inode has F2FS_CASEFOLD_FL set, but the
filesystem doesn't have the casefold feature flag set, and therefore
super_block::s_encoding is NULL.
Fix this by making sanity_check_inode() reject inodes that have
F2FS_CASEFOLD_FL when the filesystem doesn't have the casefold feature.
Reported-by: syzbot+05139c4039d0679e19ff@syzkaller.appspotmail.com
Fixes:
|
||
![]() |
c2759ebaf7 |
f2fs: change i_compr_blocks of inode to atomic value
writepages() can be concurrently invoked for the same file by different threads such as a thread fsyncing the file and a kworker kernel thread. So, changing i_compr_blocks without protection is racy and we need to protect it by changing it with atomic type value. Plus, we don't need a 64bit value for i_compr_blocks, so just we will use a atomic value, not atomic64. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
a6d601f30d |
f2fs: fix to wait page writeback before update
Filesystem including f2fs should support stable page for special device like software raid, however there is one missing path that page could be updated while it is writeback state as below, fix this. - gc_node_segment - f2fs_move_node_page - __write_node_page - set_page_writeback - do_read_inode - f2fs_init_extent_tree - __f2fs_init_extent_tree i_ext->len = 0; Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
db5ae36329 |
f2fs: fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry
Under some condition, the __write_node_page will submit a page which is not f2fs_in_warm_node_list and will not call f2fs_add_fsync_node_entry. f2fs_gc continue to run to invoke f2fs_iget -> do_read_inode to read the same node page and set code node, which make f2fs_in_warm_node_list become true, that will cause f2fs_bug_on in f2fs_del_fsync_node_entry when f2fs_write_end_io called. - f2fs_write_end_io - f2fs_iget - do_read_inode - set_cold_node recover cold node flag - f2fs_in_warm_node_list - is_cold_node if node is cold, assume we have added node to fsync_node_list during writepages() - f2fs_del_fsync_node_entry - f2fs_bug_on() due to node page is not in fsync_node_list [ 34.966133] Call trace: [ 34.969902] f2fs_del_fsync_node_entry+0x100/0x108 [ 34.976071] f2fs_write_end_io+0x1e0/0x288 [ 34.981539] bio_endio+0x248/0x270 [ 34.986289] blk_update_request+0x2b0/0x4d8 [ 34.991841] scsi_end_request+0x40/0x440 [ 34.997126] scsi_io_completion+0xa4/0x748 [ 35.002593] scsi_finish_command+0xdc/0x110 [ 35.008143] scsi_softirq_done+0x118/0x150 [ 35.013610] blk_done_softirq+0x8c/0xe8 [ 35.018811] __do_softirq+0x2e8/0x578 [ 35.023828] irq_exit+0xfc/0x120 [ 35.028398] handle_IPI+0x1d8/0x330 [ 35.033233] gic_handle_irq+0x110/0x1d4 [ 35.038433] el1_irq+0xb4/0x130 [ 35.042917] kmem_cache_alloc+0x3f0/0x418 [ 35.048288] radix_tree_node_alloc+0x50/0xf8 [ 35.053933] __radix_tree_create+0xf8/0x188 [ 35.059484] __radix_tree_insert+0x3c/0x128 [ 35.065035] add_gc_inode+0x90/0x118 [ 35.069967] f2fs_gc+0x1b80/0x2d70 [ 35.074718] f2fs_disable_checkpoint+0x94/0x1d0 [ 35.080621] f2fs_fill_super+0x10c4/0x1b88 [ 35.086088] mount_bdev+0x194/0x1e0 [ 35.090923] f2fs_mount+0x40/0x50 [ 35.095589] mount_fs+0xb4/0x190 [ 35.100159] vfs_kern_mount+0x80/0x1d8 [ 35.105260] do_mount+0x478/0xf18 [ 35.109926] ksys_mount+0x90/0xd0 [ 35.114592] __arm64_sys_mount+0x24/0x38 Signed-off-by: Wuyun Zhao <zhaowuyun@wingtech.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7653b9d875 |
f2fs: fix potential .flags overflow on 32bit architecture
f2fs_inode_info.flags is unsigned long variable, it has 32 bits
in 32bit architecture, since we introduced FI_MMAP_FILE flag
when we support data compression, we may access memory cross
the border of .flags field, corrupting .i_sem field, result in
below deadlock.
To fix this issue, let's expand .flags as an array to grab enough
space to store new flags.
Call Trace:
__schedule+0x8d0/0x13fc
? mark_held_locks+0xac/0x100
schedule+0xcc/0x260
rwsem_down_write_slowpath+0x3ab/0x65d
down_write+0xc7/0xe0
f2fs_drop_nlink+0x3d/0x600 [f2fs]
f2fs_delete_inline_entry+0x300/0x440 [f2fs]
f2fs_delete_entry+0x3a1/0x7f0 [f2fs]
f2fs_unlink+0x500/0x790 [f2fs]
vfs_unlink+0x211/0x490
do_unlinkat+0x483/0x520
sys_unlink+0x4a/0x70
do_fast_syscall_32+0x12b/0x683
entry_SYSENTER_32+0xaa/0x102
Fixes:
|
||
![]() |
5df7731f60 |
f2fs: introduce DEFAULT_IO_TIMEOUT
As Geert Uytterhoeven reported: for parameter HZ/50 in congestion_wait(BLK_RW_ASYNC, HZ/50); On some platforms, HZ can be less than 50, then unexpected 0 timeout jiffies will be set in congestion_wait(). This patch introduces a macro DEFAULT_IO_TIMEOUT to wrap a determinate value with msecs_to_jiffies(20) to instead HZ/50 to avoid such issue. Quoted from Geert Uytterhoeven: "A timeout of HZ means 1 second. HZ/50 means 20 ms, but has the risk of being zero, if HZ < 50. If you want to use a timeout of 20 ms, you best use msecs_to_jiffies(20), as that takes care of the special cases, and never returns 0." Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7a88ddb560 |
f2fs: fix inconsistent comments
Lack of maintenance on comments may mislead developers, fix them. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
d940aa07ed |
f2fs: fix to check i_compr_blocks correctly
inode.i_blocks counts based on 512byte sector, we need to convert to 4kb sized block count before comparing to i_compr_blocks. In addition, add to print message when sanity check on inode compression configs failed. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
4c8ff7095b |
f2fs: support data compression
This patch tries to support compression in f2fs. - New term named cluster is defined as basic unit of compression, file can be divided into multiple clusters logically. One cluster includes 4 << n (n >= 0) logical pages, compression size is also cluster size, each of cluster can be compressed or not. - In cluster metadata layout, one special flag is used to indicate cluster is compressed one or normal one, for compressed cluster, following metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs stores data including compress header and compressed data. - In order to eliminate write amplification during overwrite, F2FS only support compression on write-once file, data can be compressed only when all logical blocks in file are valid and cluster compress ratio is lower than specified threshold. - To enable compression on regular inode, there are three ways: * chattr +c file * chattr +c dir; touch dir/file * mount w/ -o compress_extension=ext; touch file.ext Compress metadata layout: [Dnode Structure] +-----------------------------------------------+ | cluster 1 | cluster 2 | ......... | cluster N | +-----------------------------------------------+ . . . . . . . . . Compressed Cluster . . Normal Cluster . +----------+---------+---------+---------+ +---------+---------+---------+---------+ |compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 | +----------+---------+---------+---------+ +---------+---------+---------+---------+ . . . . . . +-------------+-------------+----------+----------------------------+ | data length | data chksum | reserved | compressed data | +-------------+-------------+----------+----------------------------+ Changelog: 20190326: - fix error handling of read_end_io(). - remove unneeded comments in f2fs_encrypt_one_page(). 20190327: - fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages(). - don't jump into loop directly to avoid uninitialized variables. - add TODO tag in error path of f2fs_write_cache_pages(). 20190328: - fix wrong merge condition in f2fs_read_multi_pages(). - check compressed file in f2fs_post_read_required(). 20190401 - allow overwrite on non-compressed cluster. - check cluster meta before writing compressed data. 20190402 - don't preallocate blocks for compressed file. - add lz4 compress algorithm - process multiple post read works in one workqueue Now f2fs supports processing post read work in multiple workqueue, it shows low performance due to schedule overhead of multiple workqueue executing orderly. 20190921 - compress: support buffered overwrite C: compress cluster flag V: valid block address N: NEW_ADDR One cluster contain 4 blocks before overwrite after overwrite - VVVV -> CVNN - CVNN -> VVVV - CVNN -> CVNN - CVNN -> CVVV - CVVV -> CVNN - CVVV -> CVVV 20191029 - add kconfig F2FS_FS_COMPRESSION to isolate compression related codes, add kconfig F2FS_FS_{LZO,LZ4} to cover backend algorithm. note that: will remove lzo backend if Jaegeuk agreed that too. - update codes according to Eric's comments. 20191101 - apply fixes from Jaegeuk 20191113 - apply fixes from Jaegeuk - split workqueue for fsverity 20191216 - apply fixes from Jaegeuk 20200117 - fix to avoid NULL pointer dereference [Jaegeuk Kim] - add tracepoint for f2fs_{,de}compress_pages() - fix many bugs and add some compression stats - fix overwrite/mmap bugs - address 32bit build error, reported by Geert. - bug fixes when handling errors and i_compressed_blocks Reported-by: <noreply@ellerman.id.au> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
c45d6002ff |
f2fs: show f2fs instance in printk_ratelimited
As Eric mentioned, bare printk{,_ratelimited} won't show which filesystem instance these message is coming from, this patch tries to show fs instance with sb->s_id field in all places we missed before. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
fe1897eaa6 |
f2fs: fix to update time in lazytime mode
generic/018 reports an inconsistent status of atime, the testcase is as below: - open file with O_SYNC - write file to construct fraged space - calc md5 of file - record {a,c,m}time - defrag file --- do nothing - umount & mount - check {a,c,m}time The root cause is, as f2fs enables lazytime by default, atime update will dirty vfs inode, rather than dirtying f2fs inode (by set with FI_DIRTY_INODE), so later f2fs_write_inode() called from VFS will fail to update inode page due to our skip: f2fs_write_inode() if (is_inode_flag_set(inode, FI_DIRTY_INODE)) return 0; So eventually, after evict(), we lose last atime for ever. To fix this issue, we need to check whether {a,c,m,cr}time is consistent in between inode cache and inode page, and only skip f2fs_update_inode() if f2fs inode is not dirty and time is consistent as well. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
fbc246a12a |
f2fs-for-5.4-rc1
In this round, we introduced casefolding support in f2fs, and fixed various bugs in individual features such as IO alignment, checkpoint=disable, quota, and swapfile. Enhancement: - support casefolding w/ enhancement in ext4 - support fiemap for directory - support FS_IO_GET|SET_FSLABEL Bug fix: - fix IO stuck during checkpoint=disable - avoid infinite GC loop - fix panic/overflow related to IO alignment feature - fix livelock in swap file - fix discard command leak - disallow dio for atomic_write -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl2FL1wACgkQQBSofoJI UNIRdg/+M6QNiAazIbqzPXyUUkyZQYR5YKhu2abd49o/g38xjTq1afH0PZpQyDrA 9RncR4xsW8F241vPLVoCSanfaa+MxN6xHi3TrD8zYtZWxOcPF6v1ETHeUXGTHuJ2 gqlk+mm+CnY02M6rxW7XwixuXwttT3bF9+cf1YBWRpNoVrR+SjNqgeJS7FmJwXKd nGKb+94OxuygL1NUop+LDUo3qRQjc0Sxv/7qj/K4lhqgTjhAxMYT2KvUP/1MZ7U0 Kh9WIayDXnpoioxMPnt4VEb+JgXfLLFELvQzNjwulk15GIweuJzwVYCBXcRoX0cK eRBRmRy/kRp/e0R1gvl3kYrXQC2AC5QTlBVH/0ESwnaukFiUBKB509vH4aqE/vpB Krldjfg+uMHkc7XiNBf1boDp713vJ76iRKUDWoVb6H/sPbdJ+jtrnUNeBP8CVpWh u31SY1MppnmKhhsoCHQRbhbXO/Z29imBQgF9Tm3IFWImyLY3IU40vFj2fR15gJkL X3x/HWxQynSqyqEOwAZrvhCRTvBAIGIVy5292Di1RkqIoh8saxcqiaywgLz1+eVE 0DCOoh8R6sSbfN/EEh+yZqTxmjo0VGVTw30XVI6QEo4cY5Vfc9u6dN6SRWVRvbjb kPb3dKcMrttgbn3fcXU8Jbw1AOor9N6afHaqs0swQJyci2RwJyc= =oonf -----END PGP SIGNATURE----- Merge tag 'f2fs-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we introduced casefolding support in f2fs, and fixed various bugs in individual features such as IO alignment, checkpoint=disable, quota, and swapfile. Enhancement: - support casefolding w/ enhancement in ext4 - support fiemap for directory - support FS_IO_GET|SET_FSLABEL Bug fix: - fix IO stuck during checkpoint=disable - avoid infinite GC loop - fix panic/overflow related to IO alignment feature - fix livelock in swap file - fix discard command leak - disallow dio for atomic_write" * tag 'f2fs-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits) f2fs: add a condition to detect overflow in f2fs_ioc_gc_range() f2fs: fix to add missing F2FS_IO_ALIGNED() condition f2fs: fix to fallback to buffered IO in IO aligned mode f2fs: fix to handle error path correctly in f2fs_map_blocks f2fs: fix extent corrupotion during directIO in LFS mode f2fs: check all the data segments against all node ones f2fs: Add a small clarification to CONFIG_FS_F2FS_FS_SECURITY f2fs: fix inode rwsem regression f2fs: fix to avoid accessing uninitialized field of inode page in is_alive() f2fs: avoid infinite GC loop due to stale atomic files f2fs: Fix indefinite loop in f2fs_gc() f2fs: convert inline_data in prior to i_size_write f2fs: fix error path of f2fs_convert_inline_page() f2fs: add missing documents of reserve_root/resuid/resgid f2fs: fix flushing node pages when checkpoint is disabled f2fs: enhance f2fs_is_checkpoint_ready()'s readability f2fs: clean up __bio_alloc()'s parameter f2fs: fix wrong error injection path in inc_valid_block_count() f2fs: fix to writeout dirty inode during node flush f2fs: optimize case-insensitive lookups ... |
||
![]() |
00e09c0bcc |
f2fs: enhance f2fs_is_checkpoint_ready()'s readability
This patch changes sematics of f2fs_is_checkpoint_ready()'s return value as: return true when checkpoint is ready, other return false, it can improve readability of below conditions. f2fs_submit_page_write() ... if (is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN) || !f2fs_is_checkpoint_ready(sbi)) __submit_merged_bio(io); f2fs_balance_fs() ... if (!f2fs_is_checkpoint_ready(sbi)) return; Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
33ac18a15c |
f2fs: use wrapped f2fs_cp_error()
Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
2c2eb7a300 |
f2fs: Support case-insensitive file name lookups
Modeled after commit
|
||
![]() |
0f1898f93c |
f2fs: fix to avoid tagging SBI_QUOTA_NEED_REPAIR incorrectly
On a quota disabled image, with fault injection, SBI_QUOTA_NEED_REPAIR will be set incorrectly in error path of f2fs_evict_inode(), fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
95ae251fe8 |
f2fs: add fs-verity support
Add fs-verity support to f2fs. fs-verity is a filesystem feature that enables transparent integrity protection and authentication of read-only files. It uses a dm-verity like mechanism at the file level: a Merkle tree is used to verify any block in the file in log(filesize) time. It is implemented mainly by helper functions in fs/verity/. See Documentation/filesystems/fsverity.rst for the full documentation. The f2fs support for fs-verity consists of: - Adding a filesystem feature flag and an inode flag for fs-verity. - Implementing the fsverity_operations to support enabling verity on an inode and reading/writing the verity metadata. - Updating ->readpages() to verify data as it's read from verity files and to support reading verity metadata pages. - Updating ->write_begin(), ->write_end(), and ->writepages() to support writing verity metadata pages. - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl(). Like ext4, f2fs stores the verity metadata (Merkle tree and fsverity_descriptor) past the end of the file, starting at the first 64K boundary beyond i_size. This approach works because (a) verity files are readonly, and (b) pages fully beyond i_size aren't visible to userspace but can be read/written internally by f2fs with only some relatively small changes to f2fs. Extended attributes cannot be used because (a) f2fs limits the total size of an inode's xattr entries to 4096 bytes, which wouldn't be enough for even a single Merkle tree block, and (b) f2fs encryption doesn't encrypt xattrs, yet the verity metadata *must* be encrypted when the file is because it contains hashes of the plaintext data. Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com> |
||
![]() |
10f966bbf5 |
f2fs: use generic EFSBADCRC/EFSCORRUPTED
f2fs uses EFAULT as error number to indicate filesystem is corrupted all the time, but generic filesystems use EUCLEAN for such condition, we need to change to follow others. This patch adds two new macros as below to wrap more generic error code macros, and spread them in code. EFSBADCRC EBADMSG /* Bad CRC detected */ EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */ Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Chao Yu <yuchao0@huawei.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
dcbb4c10e6 |
f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
- Add and use f2fs_<level> macros - Convert f2fs_msg to f2fs_printk - Remove level from f2fs_printk and embed the level in the format - Coalesce formats and align multi-line arguments - Remove unnecessary duplicate extern f2fs_msg f2fs.h Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
5043a9643f |
f2fs: only set project inherit bit for directory
It doesn't make any sense to have project inherit bits for regular files, even though this won't cause any problem, but it is better fix this. Cc: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
93770ab7a6 |
f2fs: introduce DATA_GENERIC_ENHANCE
Previously, f2fs_is_valid_blkaddr(, blkaddr, DATA_GENERIC) will check whether @blkaddr locates in main area or not. That check is weak, since the block address in range of main area can point to the address which is not valid in segment info table, and we can not detect such condition, we may suffer worse corruption as system continues running. So this patch introduce DATA_GENERIC_ENHANCE to enhance the sanity check which trigger SIT bitmap check rather than only range check. This patch did below changes as wel: - set SBI_NEED_FSCK in f2fs_is_valid_blkaddr(). - get rid of is_valid_data_blkaddr() to avoid panic if blkaddr is invalid. - introduce verify_fio_blkaddr() to wrap fio {new,old}_blkaddr validation check. - spread blkaddr check in: * f2fs_get_node_info() * __read_out_blkaddrs() * f2fs_submit_page_read() * ra_data_block() * do_recover_data() This patch can fix bug reported from bugzilla below: https://bugzilla.kernel.org/show_bug.cgi?id=203215 https://bugzilla.kernel.org/show_bug.cgi?id=203223 https://bugzilla.kernel.org/show_bug.cgi?id=203231 https://bugzilla.kernel.org/show_bug.cgi?id=203235 https://bugzilla.kernel.org/show_bug.cgi?id=203241 = Update by Jaegeuk Kim = DATA_GENERIC_ENHANCE enhanced to validate block addresses on read/write paths. But, xfstest/generic/446 compalins some generated kernel messages saying invalid bitmap was detected when reading a block. The reaons is, when we get the block addresses from extent_cache, there is no lock to synchronize it from truncating the blocks in parallel. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
b42b179bda |
f2fs: fix to do checksum even if inode page is uptodate
As Jungyeon reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203221 - Overview When mounting the attached crafted image and running program, this error is reported. The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on. - Reproduces cc poc_07.c mkdir test mount -t f2fs tmp.img test cp a.out test cd test sudo ./a.out - Messages kernel BUG at fs/f2fs/node.c:1279! RIP: 0010:read_node_page+0xcf/0xf0 Call Trace: __get_node_page+0x6b/0x2f0 f2fs_iget+0x8f/0xdf0 f2fs_lookup+0x136/0x320 __lookup_slow+0x92/0x140 lookup_slow+0x30/0x50 walk_component+0x1c1/0x350 path_lookupat+0x62/0x200 filename_lookup+0xb3/0x1a0 do_fchmodat+0x3e/0xa0 __x64_sys_chmod+0x12/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 On below paths, we can have opportunity to readahead inode page - gc_node_segment -> f2fs_ra_node_page - gc_data_segment -> f2fs_ra_node_page - f2fs_fill_dentries -> f2fs_ra_node_page Unlike synchronized read, on readahead path, we can set page uptodate before verifying page's checksum, then read_node_page() will trigger kernel panic once it encounters a uptodated page w/ incorrect checksum. So considering readahead scenario, we have to do checksum each time when loading inode page even if it is uptodated. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
546d22f070 |
f2fs: fix to clear dirty inode in error path of f2fs_iget()
As Jungyeon reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203217 - Overview When mounting the attached crafted image and running program, I got this error. Additionally, it hangs on sync after running the program. The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on. - Reproduces cc poc_test_05.c mkdir test mount -t f2fs tmp.img test sudo ./a.out sync - Messages kernel BUG at fs/f2fs/inode.c:707! RIP: 0010:f2fs_evict_inode+0x33f/0x3a0 Call Trace: evict+0xba/0x180 f2fs_iget+0x598/0xdf0 f2fs_lookup+0x136/0x320 __lookup_slow+0x92/0x140 lookup_slow+0x30/0x50 walk_component+0x1c1/0x350 path_lookupat+0x62/0x200 filename_lookup+0xb3/0x1a0 do_readlinkat+0x56/0x110 __x64_sys_readlink+0x16/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 During inode loading, __recover_inline_status() can recovery inode status and set inode dirty, once we failed in following process, it will fail the check in f2fs_evict_inode, result in trigger BUG_ON(). Let's clear dirty inode in error path of f2fs_iget() to avoid panic. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
5160bcce5c |
f2fs-for-5.1-rc1
We've continued mainly to fix bugs in this round, as f2fs has been shipped in more devices. Especially, we've focused on stabilizing checkpoint=disable feature, and provided some interfaces for QA. Enhancement: - expose FS_NOCOW_FL for pin_file - run discard jobs at unmount time with timeout - tune discarding thread to avoid idling which consumes power - some checking codes to address vulnerabilities - give random value to i_generation - shutdown with more flags for QA Bug fix: - clean up stale objects when mount is failed along with checkpoint=disable - fix system being stuck due to wrong count by atomic writes - handle some corrupted disk cases - fix a deadlock in f2fs_read_inline_dir We've also added some minor build errors and clean-up patches. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAlyKk4YACgkQQBSofoJI UNIMVw//Rb3nmbQkMW/86DxtHDxuS8GEJmle0DiHeFMHgwy0ET0uZs9/AEfmuejC 95cXnF44QfVaFwkOXCK6aKXJXwN0+ZS0YvV/gPE8lgU6sdQhJBox5DC+rx+OwFq5 rZiF8qvE8iyM9Xt+RfMBGufzUb+LKBz0ozQFZpKJiNTBBf5vpeqMYASEEfxiEmZz GvvUNSBRw39OB5zTl5l2hnoNqkoFu6XHnf4f9+DnraVi8SuQzj6hdqsx0nYTHfLi Rax8kA4HUwoVgjhaLLXFbbhWIQ83bcZ0cj6wq7Lr7NbbIi7bKYP6sxtKjbe2Fuql m9Chm2LIvD1BfJnjdTk2krqY7Z4bX/4gmXukno/8X/cjWkpBV6HFWS73iTgrJjU2 d8kBFXwlIn+JlATSjsTtdfvKkTwxUhaGw1bBA96Am4c5tLQyOqyYWcfQA/tam/v4 dM9EQX5ZeRb6NXDeIxkXNfTSpDRnqlhJsTV5aK8qporyF1RkKVbyCpSt1P4q3KO5 UwsGZLFAVMzFaUVfyIS7dR5QVczQUTCH4g0yFNpBMvF8epOA4+jbYxQeGZfqFK3H mTC/Ba+VWWdYW2pZRNc9TnBsHg/xadMJq7EQb/ykGBe6JZJfB0wREj4LSr1lGK9a cU8JFGyqg1Rt/uRP0bb5IIec1YVton3Lq8ND9VZPNcV/mS5Gehg= =9BoH -----END PGP SIGNATURE----- Merge tag 'f2fs-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "We've continued mainly to fix bugs in this round, as f2fs has been shipped in more devices. Especially, we've focused on stabilizing checkpoint=disable feature, and provided some interfaces for QA. Enhancements: - expose FS_NOCOW_FL for pin_file - run discard jobs at unmount time with timeout - tune discarding thread to avoid idling which consumes power - some checking codes to address vulnerabilities - give random value to i_generation - shutdown with more flags for QA Bug fixes: - clean up stale objects when mount is failed along with checkpoint=disable - fix system being stuck due to wrong count by atomic writes - handle some corrupted disk cases - fix a deadlock in f2fs_read_inline_dir We've also added some minor build error fixes and clean-up patches" * tag 'f2fs-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (53 commits) f2fs: set pin_file under CAP_SYS_ADMIN f2fs: fix to avoid deadlock in f2fs_read_inline_dir() f2fs: fix to adapt small inline xattr space in __find_inline_xattr() f2fs: fix to do sanity check with inode.i_inline_xattr_size f2fs: give some messages for inline_xattr_size f2fs: don't trigger read IO for beyond EOF page f2fs: fix to add refcount once page is tagged PG_private f2fs: remove wrong comment in f2fs_invalidate_page() f2fs: fix to use kvfree instead of kzfree f2fs: print more parameters in trace_f2fs_map_blocks f2fs: trace f2fs_ioc_shutdown f2fs: fix to avoid deadlock of atomic file operations f2fs: fix to dirty inode for i_mode recovery f2fs: give random value to i_generation f2fs: no need to take page lock in readdir f2fs: fix to update iostat correctly in IPU path f2fs: fix encrypted page memory leak f2fs: make fault injection covering __submit_flush_wait() f2fs: fix to retry fill_super only if recovery failed f2fs: silence VM_WARN_ON_ONCE in mempool_alloc ... |
||
![]() |
dd6c89b5f2 |
f2fs: fix to do sanity check with inode.i_inline_xattr_size
As Paul Bandha reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202709 When I run the poc on the mounted f2fs img I get a buffer overflow in read_inline_xattr due to there being no sanity check on the value of i_inline_xattr_size. I created the img by just modifying the value of i_inline_xattr_size in the inode: i_name [test1.txt] i_ext: fofs:0 blkaddr:0 len:0 i_extra_isize [0x 18 : 24] i_inline_xattr_size [0x ffff : 65535] i_addr[ofs] [0x 0 : 0] mkdir /mnt/f2fs mount ./f2fs1.img /mnt/f2fs gcc poc.c -o poc ./poc int main() { int y = syscall(SYS_listxattr, "/mnt/f2fs/test1.txt", NULL, 0); printf("ret %d", y); printf("errno: %d\n", errno); } BUG: KASAN: slab-out-of-bounds in read_inline_xattr+0x18f/0x260 Read of size 262140 at addr ffff88011035efd8 by task f2fs1poc/3263 CPU: 0 PID: 3263 Comm: f2fs1poc Not tainted 4.18.0-custom #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x71/0xab print_address_description+0x83/0x250 kasan_report+0x213/0x350 memcpy+0x1f/0x50 read_inline_xattr+0x18f/0x260 read_all_xattrs+0xba/0x190 f2fs_listxattr+0x9d/0x3f0 listxattr+0xb2/0xd0 path_listxattr+0x93/0xe0 do_syscall_64+0x9d/0x220 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Let's add sanity check for inode.i_inline_xattr_size during f2fs_iget() to avoid this issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
62230e0d70 |
f2fs: use IS_ENCRYPTED() to check encryption status
This commit removes the f2fs specific f2fs_encrypted_inode() and makes use of the generic IS_ENCRYPTED() macro to check for the encryption status of an inode. Acked-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> |
||
![]() |
bae0ee7a76 |
f2fs: check PageWriteback flag for ordered case
For all ordered cases in f2fs_wait_on_page_writeback(), we need to check PageWriteback status, so let's clean up to relocate the check into f2fs_wait_on_page_writeback(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7beb01f744 |
f2fs: clean up f2fs_sb_has_##feature_name
In F2FS_HAS_FEATURE(), we will use F2FS_SB(sb) to get sbi pointer to access .raw_super field, to avoid unneeded pointer conversion, this patch changes to F2FS_HAS_FEATURE() accept sbi parameter directly. Just do cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
af033b2aa8 |
f2fs: guarantee journalled quota data by checkpoint
For journalled quota mode, let checkpoint to flush dquot dirty data and quota file data to guarntee persistence of all quota sysfile in last checkpoint, by this way, we can avoid corrupting quota sysfile when encountering SPO. The implementation is as below: 1. add a global state SBI_QUOTA_NEED_FLUSH to indicate that there is cached dquot metadata changes in quota subsystem, and later checkpoint should: a) flush dquot metadata into quota file. b) flush quota file to storage to keep file usage be consistent. 2. add a global state SBI_QUOTA_NEED_REPAIR to indicate that quota operation failed due to -EIO or -ENOSPC, so later, a) checkpoint will skip syncing dquot metadata. b) CP_QUOTA_NEED_FSCK_FLAG will be set in last cp pack to give a hint for fsck repairing. 3. add a global state SBI_QUOTA_SKIP_FLUSH, in checkpoint, if quota data updating is very heavy, it may cause hungtask in block_operation(). To avoid this, if our retry time exceed threshold, let's just skip flushing and retry in next checkpoint(). Signed-off-by: Weichao Guo <guoweichao@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: avoid warnings and set fsck flag] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
ef2a007134 |
f2fs: fix to recover cold bit of inode block during POR
Testcase to reproduce this bug:
1. mkfs.f2fs /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. touch /mnt/f2fs/file
4. sync
5. chattr +A /mnt/f2fs/file
6. xfs_io -f /mnt/f2fs/file -c "fsync"
7. godown /mnt/f2fs
8. umount /mnt/f2fs
9. mount -t f2fs /dev/sdd /mnt/f2fs
10. chattr -A /mnt/f2fs/file
11. xfs_io -f /mnt/f2fs/file -c "fsync"
12. umount /mnt/f2fs
13. mount -t f2fs /dev/sdd /mnt/f2fs
14. lsattr /mnt/f2fs/file
-----------------N- /mnt/f2fs/file
But actually, we expect the corrct result is:
-------A---------N- /mnt/f2fs/file
The reason is in step 9) we missed to recover cold bit flag in inode
block, so later, in fsync, we will skip write inode block due to below
condition check, result in lossing data in another SPOR.
f2fs_fsync_node_pages()
if (!IS_DNODE(page) || !is_cold_node(page))
continue;
Note that, I guess that some non-dir inode has already lost cold bit
during POR, so in order to reenable recovery for those inode, let's
try to recover cold bit in f2fs_iget() to save more fsynced data.
Fixes:
|
||
![]() |
4354994f09 |
f2fs: checkpoint disabling
Note that, it requires "f2fs: return correct errno in f2fs_gc". This adds a lightweight non-persistent snapshotting scheme to f2fs. To use, mount with the option checkpoint=disable, and to return to normal operation, remount with checkpoint=enable. If the filesystem is shut down before remounting with checkpoint=enable, it will revert back to its apparent state when it was first mounted with checkpoint=disable. This is useful for situations where you wish to be able to roll back the state of the disk in case of some critical failure. Signed-off-by: Daniel Rosenberg <drosen@google.com> [Jaegeuk Kim: use SB_RDONLY instead of MS_RDONLY] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7c1a000d46 |
f2fs: add SPDX license identifiers
Remove the verbose license text from f2fs files and replace them with SPDX tags. This does not change the license of any of the code. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
dda9f4b9ca |
f2fs: fix to skip verifying block address for non-regular inode
generic/184 1s ... [failed, exit status 1]- output mismatch --- tests/generic/184.out 2015-01-11 16:52:27.643681072 +0800 QA output created by 184 - silence is golden +rm: cannot remove '/mnt/f2fs/null': Bad address +mknod: '/mnt/f2fs/null': Bad address +chmod: cannot access '/mnt/f2fs/null': Bad address +./tests/generic/184: line 36: /mnt/f2fs/null: Bad address ... F2FS-fs (zram0): access invalid blkaddr:259 EIP: f2fs_is_valid_blkaddr+0x14b/0x1b0 [f2fs] f2fs_iget+0x927/0x1010 [f2fs] f2fs_lookup+0x26e/0x630 [f2fs] __lookup_slow+0xb3/0x140 lookup_slow+0x31/0x50 walk_component+0x185/0x1f0 path_lookupat+0x51/0x190 filename_lookup+0x7f/0x140 user_path_at_empty+0x36/0x40 vfs_statx+0x61/0xc0 __do_sys_stat64+0x29/0x40 sys_stat64+0x13/0x20 do_fast_syscall_32+0xaa/0x22c entry_SYSENTER_32+0x53/0x86 In f2fs_iget(), we will check inode's first block address, if it is valid, we will set FI_FIRST_BLOCK_WRITTEN flag in inode. But we should only do this for regular inode, otherwise, like special inode, i_addr[0] is used for storing device info instead of block address, it will fail checking flow obviously. So for non-regular inode, let's skip verifying address and setting flag. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7fa750a163 |
f2fs: rework fault injection handling to avoid a warning
When CONFIG_F2FS_FAULT_INJECTION is disabled, we get a warning about an unused label: fs/f2fs/segment.c: In function '__submit_discard_cmd': fs/f2fs/segment.c:1059:1: error: label 'submit' defined but not used [-Werror=unused-label] This could be fixed by adding another #ifdef around it, but the more reliable way of doing this seems to be to remove the other #ifdefs where that is easily possible. By defining time_to_inject() as a trivial stub, most of the checks for CONFIG_F2FS_FAULT_INJECTION can go away. This also leads to nicer formatting of the code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
91291e9998 |
f2fs: fix to do sanity check with block address in main area v2
This patch adds f2fs_is_valid_blkaddr() in below functions to do sanity check with block address to avoid pentential panic: - f2fs_grab_read_bio() - __written_first_block() https://bugzilla.kernel.org/show_bug.cgi?id=200465 - Reproduce - POC (poc.c) #define _GNU_SOURCE #include <sys/types.h> #include <sys/mount.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/xattr.h> #include <dirent.h> #include <errno.h> #include <error.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <linux/falloc.h> #include <linux/loop.h> static void activity(char *mpoint) { char *xattr; int err; err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint); char buf2[113]; memset(buf2, 0, sizeof(buf2)); listxattr(xattr, buf2, sizeof(buf2)); } int main(int argc, char *argv[]) { activity(argv[1]); return 0; } - kernel message [ 844.718738] F2FS-fs (loop0): Mounted with checkpoint version = 2 [ 846.430929] F2FS-fs (loop0): access invalid blkaddr:1024 [ 846.431058] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160 [ 846.431059] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper [ 846.431310] CPU: 1 PID: 1249 Comm: a.out Not tainted 4.18.0-rc3+ #1 [ 846.431312] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 846.431315] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160 [ 846.431316] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00 [ 846.431347] RSP: 0018:ffff961c414a7bc0 EFLAGS: 00010282 [ 846.431349] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000000 [ 846.431350] RDX: 0000000000000000 RSI: ffff89dfffd165d8 RDI: ffff89dfffd165d8 [ 846.431351] RBP: ffff961c414a7c20 R08: 0000000000000001 R09: 0000000000000248 [ 846.431353] R10: 0000000000000000 R11: 0000000000000248 R12: 0000000000000007 [ 846.431369] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0 [ 846.431372] FS: 00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000 [ 846.431373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.431374] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0 [ 846.431384] Call Trace: [ 846.431426] f2fs_iget+0x6f4/0xe70 [ 846.431430] ? f2fs_find_entry+0x71/0x90 [ 846.431432] f2fs_lookup+0x1aa/0x390 [ 846.431452] __lookup_slow+0x97/0x150 [ 846.431459] lookup_slow+0x35/0x50 [ 846.431462] walk_component+0x1c6/0x470 [ 846.431479] ? memcg_kmem_charge_memcg+0x70/0x90 [ 846.431488] ? page_add_file_rmap+0x13/0x200 [ 846.431491] path_lookupat+0x76/0x230 [ 846.431501] ? __alloc_pages_nodemask+0xfc/0x280 [ 846.431504] filename_lookup+0xb8/0x1a0 [ 846.431534] ? _cond_resched+0x16/0x40 [ 846.431541] ? kmem_cache_alloc+0x160/0x1d0 [ 846.431549] ? path_listxattr+0x41/0xa0 [ 846.431551] path_listxattr+0x41/0xa0 [ 846.431570] do_syscall_64+0x55/0x100 [ 846.431583] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 846.431607] RIP: 0033:0x7f882de1c0d7 [ 846.431607] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48 [ 846.431639] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2 [ 846.431641] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7 [ 846.431642] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0 [ 846.431643] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000 [ 846.431645] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550 [ 846.431646] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000 [ 846.431648] ---[ end trace abca54df39d14f5c ]--- [ 846.431651] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix. [ 846.431762] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_iget+0xd17/0xe70 [ 846.431763] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper [ 846.431797] CPU: 1 PID: 1249 Comm: a.out Tainted: G W 4.18.0-rc3+ #1 [ 846.431798] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 846.431800] RIP: 0010:f2fs_iget+0xd17/0xe70 [ 846.431801] Code: ff ff 48 63 d8 e9 e1 f6 ff ff 48 8b 45 c8 41 b8 05 00 00 00 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b 48 8b 38 e8 f9 b4 00 00 <0f> 0b 48 8b 45 c8 f0 80 48 48 04 e9 d8 f9 ff ff 0f 0b 48 8b 43 18 [ 846.431832] RSP: 0018:ffff961c414a7bd0 EFLAGS: 00010282 [ 846.431834] RAX: 0000000000000000 RBX: ffffc5f787b8ea80 RCX: 0000000000000006 [ 846.431835] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0 [ 846.431836] RBP: ffff961c414a7c20 R08: 0000000000000000 R09: 0000000000000273 [ 846.431837] R10: 0000000000000000 R11: ffff89dfad50ca60 R12: 0000000000000007 [ 846.431838] R13: ffff89dff5492800 R14: ffff89dfae3aa000 R15: ffff89dff4ff88d0 [ 846.431840] FS: 00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000 [ 846.431841] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.431842] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0 [ 846.431846] Call Trace: [ 846.431850] ? f2fs_find_entry+0x71/0x90 [ 846.431853] f2fs_lookup+0x1aa/0x390 [ 846.431856] __lookup_slow+0x97/0x150 [ 846.431858] lookup_slow+0x35/0x50 [ 846.431874] walk_component+0x1c6/0x470 [ 846.431878] ? memcg_kmem_charge_memcg+0x70/0x90 [ 846.431880] ? page_add_file_rmap+0x13/0x200 [ 846.431882] path_lookupat+0x76/0x230 [ 846.431884] ? __alloc_pages_nodemask+0xfc/0x280 [ 846.431886] filename_lookup+0xb8/0x1a0 [ 846.431890] ? _cond_resched+0x16/0x40 [ 846.431891] ? kmem_cache_alloc+0x160/0x1d0 [ 846.431894] ? path_listxattr+0x41/0xa0 [ 846.431896] path_listxattr+0x41/0xa0 [ 846.431898] do_syscall_64+0x55/0x100 [ 846.431901] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 846.431902] RIP: 0033:0x7f882de1c0d7 [ 846.431903] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48 [ 846.431934] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2 [ 846.431936] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7 [ 846.431937] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0 [ 846.431939] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000 [ 846.431940] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550 [ 846.431941] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000 [ 846.431943] ---[ end trace abca54df39d14f5d ]--- [ 846.432033] F2FS-fs (loop0): access invalid blkaddr:1024 [ 846.432051] WARNING: CPU: 1 PID: 1249 at fs/f2fs/checkpoint.c:154 f2fs_is_valid_blkaddr+0x10f/0x160 [ 846.432051] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper [ 846.432085] CPU: 1 PID: 1249 Comm: a.out Tainted: G W 4.18.0-rc3+ #1 [ 846.432086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 846.432089] RIP: 0010:f2fs_is_valid_blkaddr+0x10f/0x160 [ 846.432089] Code: 00 eb ed 31 c0 83 fa 05 75 ae 48 83 ec 08 48 8b 3f 89 f1 48 c7 c2 fc 0b 0f 8b 48 c7 c6 8b d7 09 8b 88 44 24 07 e8 61 8b ff ff <0f> 0b 0f b6 44 24 07 48 83 c4 08 eb 81 4c 8b 47 10 8b 8f 38 04 00 [ 846.432120] RSP: 0018:ffff961c414a7900 EFLAGS: 00010286 [ 846.432122] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006 [ 846.432123] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff89dfffd165d0 [ 846.432124] RBP: ffff89dff5492800 R08: 0000000000000001 R09: 000000000000029d [ 846.432125] R10: ffff961c414a7820 R11: 000000000000029d R12: 0000000000000400 [ 846.432126] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000 [ 846.432128] FS: 00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000 [ 846.432130] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.432131] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0 [ 846.432135] Call Trace: [ 846.432151] f2fs_wait_on_block_writeback+0x20/0x110 [ 846.432158] f2fs_grab_read_bio+0xbc/0xe0 [ 846.432161] f2fs_submit_page_read+0x21/0x280 [ 846.432163] f2fs_get_read_data_page+0xb7/0x3c0 [ 846.432165] f2fs_get_lock_data_page+0x29/0x1e0 [ 846.432167] f2fs_get_new_data_page+0x148/0x550 [ 846.432170] f2fs_add_regular_entry+0x1d2/0x550 [ 846.432178] ? __switch_to+0x12f/0x460 [ 846.432181] f2fs_add_dentry+0x6a/0xd0 [ 846.432184] f2fs_do_add_link+0xe9/0x140 [ 846.432186] __recover_dot_dentries+0x260/0x280 [ 846.432189] f2fs_lookup+0x343/0x390 [ 846.432193] __lookup_slow+0x97/0x150 [ 846.432195] lookup_slow+0x35/0x50 [ 846.432208] walk_component+0x1c6/0x470 [ 846.432212] ? memcg_kmem_charge_memcg+0x70/0x90 [ 846.432215] ? page_add_file_rmap+0x13/0x200 [ 846.432217] path_lookupat+0x76/0x230 [ 846.432219] ? __alloc_pages_nodemask+0xfc/0x280 [ 846.432221] filename_lookup+0xb8/0x1a0 [ 846.432224] ? _cond_resched+0x16/0x40 [ 846.432226] ? kmem_cache_alloc+0x160/0x1d0 [ 846.432228] ? path_listxattr+0x41/0xa0 [ 846.432230] path_listxattr+0x41/0xa0 [ 846.432233] do_syscall_64+0x55/0x100 [ 846.432235] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 846.432237] RIP: 0033:0x7f882de1c0d7 [ 846.432237] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48 [ 846.432269] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2 [ 846.432271] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7 [ 846.432272] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0 [ 846.432273] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000 [ 846.432274] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550 [ 846.432275] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000 [ 846.432277] ---[ end trace abca54df39d14f5e ]--- [ 846.432279] F2FS-fs (loop0): invalid blkaddr: 1024, type: 5, run fsck to fix. [ 846.432376] WARNING: CPU: 1 PID: 1249 at fs/f2fs/f2fs.h:2697 f2fs_wait_on_block_writeback+0xb1/0x110 [ 846.432376] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper [ 846.432410] CPU: 1 PID: 1249 Comm: a.out Tainted: G W 4.18.0-rc3+ #1 [ 846.432411] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 846.432413] RIP: 0010:f2fs_wait_on_block_writeback+0xb1/0x110 [ 846.432414] Code: 66 90 f0 ff 4b 34 74 59 5b 5d c3 48 8b 7d 00 41 b8 05 00 00 00 89 d9 48 c7 c2 d8 e8 0e 8b 48 c7 c6 1d b0 0a 8b e8 df bc fd ff <0f> 0b f0 80 4d 48 04 e9 67 ff ff ff 48 8b 03 48 c1 e8 37 83 e0 07 [ 846.432445] RSP: 0018:ffff961c414a7910 EFLAGS: 00010286 [ 846.432447] RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000006 [ 846.432448] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff89dfffd165d0 [ 846.432449] RBP: ffff89dff5492800 R08: 0000000000000000 R09: 00000000000002d1 [ 846.432450] R10: ffff961c414a7820 R11: ffff89dfad50cf80 R12: 0000000000000400 [ 846.432451] R13: 0000000000000000 R14: ffff89dff4ff88d0 R15: 0000000000000000 [ 846.432453] FS: 00007f882e2fb700(0000) GS:ffff89dfffd00000(0000) knlGS:0000000000000000 [ 846.432454] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.432455] CR2: 0000000001a88008 CR3: 00000001eb572000 CR4: 00000000000006e0 [ 846.432459] Call Trace: [ 846.432463] f2fs_grab_read_bio+0xbc/0xe0 [ 846.432464] f2fs_submit_page_read+0x21/0x280 [ 846.432466] f2fs_get_read_data_page+0xb7/0x3c0 [ 846.432468] f2fs_get_lock_data_page+0x29/0x1e0 [ 846.432470] f2fs_get_new_data_page+0x148/0x550 [ 846.432473] f2fs_add_regular_entry+0x1d2/0x550 [ 846.432475] ? __switch_to+0x12f/0x460 [ 846.432477] f2fs_add_dentry+0x6a/0xd0 [ 846.432480] f2fs_do_add_link+0xe9/0x140 [ 846.432483] __recover_dot_dentries+0x260/0x280 [ 846.432485] f2fs_lookup+0x343/0x390 [ 846.432488] __lookup_slow+0x97/0x150 [ 846.432490] lookup_slow+0x35/0x50 [ 846.432505] walk_component+0x1c6/0x470 [ 846.432509] ? memcg_kmem_charge_memcg+0x70/0x90 [ 846.432511] ? page_add_file_rmap+0x13/0x200 [ 846.432513] path_lookupat+0x76/0x230 [ 846.432515] ? __alloc_pages_nodemask+0xfc/0x280 [ 846.432517] filename_lookup+0xb8/0x1a0 [ 846.432520] ? _cond_resched+0x16/0x40 [ 846.432522] ? kmem_cache_alloc+0x160/0x1d0 [ 846.432525] ? path_listxattr+0x41/0xa0 [ 846.432526] path_listxattr+0x41/0xa0 [ 846.432529] do_syscall_64+0x55/0x100 [ 846.432531] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 846.432533] RIP: 0033:0x7f882de1c0d7 [ 846.432533] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48 [ 846.432565] RSP: 002b:00007ffe8e66c238 EFLAGS: 00000202 ORIG_RAX: 00000000000000c2 [ 846.432567] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f882de1c0d7 [ 846.432568] RDX: 0000000000000071 RSI: 00007ffe8e66c280 RDI: 0000000001a880c0 [ 846.432569] RBP: 00007ffe8e66c300 R08: 0000000001a88010 R09: 0000000000000000 [ 846.432570] R10: 00000000000001ab R11: 0000000000000202 R12: 0000000000400550 [ 846.432571] R13: 00007ffe8e66c400 R14: 0000000000000000 R15: 0000000000000000 [ 846.432573] ---[ end trace abca54df39d14f5f ]--- [ 846.434280] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 846.434424] PGD 80000001ebd3a067 P4D 80000001ebd3a067 PUD 1eb1ae067 PMD 0 [ 846.434551] Oops: 0000 [#1] SMP PTI [ 846.434697] CPU: 0 PID: 44 Comm: kworker/u5:0 Tainted: G W 4.18.0-rc3+ #1 [ 846.434805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 846.435000] Workqueue: fscrypt_read_queue decrypt_work [ 846.435174] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0 [ 846.435351] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24 [ 846.435696] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206 [ 846.435870] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80 [ 846.436051] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8 [ 846.436261] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000 [ 846.436433] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80 [ 846.436562] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60 [ 846.436658] FS: 0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000 [ 846.436758] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.436898] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0 [ 846.437001] Call Trace: [ 846.437181] ? check_preempt_wakeup+0xf2/0x230 [ 846.437276] ? check_preempt_curr+0x7c/0x90 [ 846.437370] fscrypt_decrypt_page+0x48/0x4d [ 846.437466] __fscrypt_decrypt_bio+0x5b/0x90 [ 846.437542] decrypt_work+0x12/0x20 [ 846.437651] process_one_work+0x15e/0x3d0 [ 846.437740] worker_thread+0x4c/0x440 [ 846.437848] kthread+0xf8/0x130 [ 846.437938] ? rescuer_thread+0x350/0x350 [ 846.438022] ? kthread_associate_blkcg+0x90/0x90 [ 846.438117] ret_from_fork+0x35/0x40 [ 846.438201] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd input_leds joydev soundcore serio_raw i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear qxl ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops pcbc drm 8139too aesni_intel 8139cp floppy psmouse mii aes_x86_64 crypto_simd pata_acpi cryptd glue_helper [ 846.438653] CR2: 0000000000000008 [ 846.438713] ---[ end trace abca54df39d14f60 ]--- [ 846.438796] RIP: 0010:fscrypt_do_page_crypto+0x6e/0x2d0 [ 846.438844] Code: 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 88 00 00 00 31 c0 e8 43 c2 e0 ff 49 8b 86 48 02 00 00 85 ed c7 44 24 70 00 00 00 00 <48> 8b 58 08 0f 84 14 02 00 00 48 8b 78 10 48 8b 0c 24 48 c7 84 24 [ 846.439084] RSP: 0018:ffff961c40f9bd60 EFLAGS: 00010206 [ 846.439176] RAX: 0000000000000000 RBX: ffffc5f787719b80 RCX: ffffc5f787719b80 [ 846.440927] RDX: ffffffff8b9f4b88 RSI: ffffffff8b0ae622 RDI: ffff961c40f9bdb8 [ 846.442083] RBP: 0000000000001000 R08: ffffc5f787719b80 R09: 0000000000001000 [ 846.443284] R10: 0000000000000018 R11: fefefefefefefeff R12: ffffc5f787719b80 [ 846.444448] R13: ffffc5f787719b80 R14: ffff89dff4ff88d0 R15: 0ffff89dfaddee60 [ 846.445558] FS: 0000000000000000(0000) GS:ffff89dfffc00000(0000) knlGS:0000000000000000 [ 846.446687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 846.447796] CR2: 0000000000000008 CR3: 00000001eddd0000 CR4: 00000000000006f0 - Location https://elixir.bootlin.com/linux/v4.18-rc4/source/fs/crypto/crypto.c#L149 struct crypto_skcipher *tfm = ci->ci_ctfm; Here ci can be NULL Note that this issue maybe require CONFIG_F2FS_FS_ENCRYPTION=y to reproduce. Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
bcbfbd604d |
f2fs: fix to do sanity check with inline flags
https://bugzilla.kernel.org/show_bug.cgi?id=200221 - Overview BUG() in clear_inode() when mounting and un-mounting a corrupted f2fs image - Reproduce - Kernel message [ 538.601448] F2FS-fs (loop0): Invalid segment/section count (31, 24 x 1376257) [ 538.601458] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock [ 538.724091] F2FS-fs (loop0): Try to recover 2th superblock, ret: 0 [ 538.724102] F2FS-fs (loop0): Mounted with checkpoint version = 2 [ 540.970834] ------------[ cut here ]------------ [ 540.970838] kernel BUG at fs/inode.c:512! [ 540.971750] invalid opcode: 0000 [#1] SMP KASAN PTI [ 540.972755] CPU: 1 PID: 1305 Comm: umount Not tainted 4.18.0-rc1+ #4 [ 540.974034] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 540.982913] RIP: 0010:clear_inode+0xc0/0xd0 [ 540.983774] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55 [ 540.987570] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002 [ 540.988636] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a [ 540.990063] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8 [ 540.991499] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce [ 540.992923] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668 [ 540.994360] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530 [ 540.995786] FS: 00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 540.997403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 540.998571] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0 [ 541.000015] Call Trace: [ 541.000554] f2fs_evict_inode+0x253/0x630 [ 541.001381] evict+0x16f/0x290 [ 541.002015] iput+0x280/0x300 [ 541.002654] dentry_unlink_inode+0x165/0x1e0 [ 541.003528] __dentry_kill+0x16a/0x260 [ 541.004300] dentry_kill+0x70/0x250 [ 541.005018] dput+0x154/0x1d0 [ 541.005635] do_one_tree+0x34/0x40 [ 541.006354] shrink_dcache_for_umount+0x3f/0xa0 [ 541.007285] generic_shutdown_super+0x43/0x1c0 [ 541.008192] kill_block_super+0x52/0x80 [ 541.008978] kill_f2fs_super+0x62/0x70 [ 541.009750] deactivate_locked_super+0x6f/0xa0 [ 541.010664] deactivate_super+0x5e/0x80 [ 541.011450] cleanup_mnt+0x61/0xa0 [ 541.012151] __cleanup_mnt+0x12/0x20 [ 541.012893] task_work_run+0xc8/0xf0 [ 541.013635] exit_to_usermode_loop+0x125/0x130 [ 541.014555] do_syscall_64+0x138/0x170 [ 541.015340] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 541.016375] RIP: 0033:0x7f46624bf487 [ 541.017104] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48 [ 541.020923] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 541.022452] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487 [ 541.023885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0 [ 541.025318] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014 [ 541.026755] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c [ 541.028186] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30 [ 541.029626] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 541.039445] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 541.040392] RIP: 0010:clear_inode+0xc0/0xd0 [ 541.041240] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55 [ 541.045042] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002 [ 541.046099] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a [ 541.047537] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8 [ 541.048965] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce [ 541.050402] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668 [ 541.051832] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530 [ 541.053263] FS: 00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 541.054891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 541.056039] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0 [ 541.058506] ================================================================== [ 541.059991] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0 [ 541.061513] Read of size 8 at addr ffff8801e34a7970 by task umount/1305 [ 541.063302] CPU: 1 PID: 1305 Comm: umount Tainted: G D 4.18.0-rc1+ #4 [ 541.064838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 541.066778] Call Trace: [ 541.067294] dump_stack+0x7b/0xb5 [ 541.067986] print_address_description+0x70/0x290 [ 541.068941] kasan_report+0x291/0x390 [ 541.069692] ? update_stack_state+0x38c/0x3e0 [ 541.070598] __asan_load8+0x54/0x90 [ 541.071315] update_stack_state+0x38c/0x3e0 [ 541.072172] ? __read_once_size_nocheck.constprop.7+0x20/0x20 [ 541.073340] ? vprintk_func+0x27/0x60 [ 541.074096] ? printk+0xa3/0xd3 [ 541.074762] ? __save_stack_trace+0x5e/0x100 [ 541.075634] unwind_next_frame.part.5+0x18e/0x490 [ 541.076594] ? unwind_dump+0x290/0x290 [ 541.077368] ? __show_regs+0x2c4/0x330 [ 541.078142] __unwind_start+0x106/0x190 [ 541.085422] __save_stack_trace+0x5e/0x100 [ 541.086268] ? __save_stack_trace+0x5e/0x100 [ 541.087161] ? unlink_anon_vmas+0xba/0x2c0 [ 541.087997] save_stack_trace+0x1f/0x30 [ 541.088782] save_stack+0x46/0xd0 [ 541.089475] ? __alloc_pages_slowpath+0x1420/0x1420 [ 541.090477] ? flush_tlb_mm_range+0x15e/0x220 [ 541.091364] ? __dec_node_state+0x24/0xb0 [ 541.092180] ? lock_page_memcg+0x85/0xf0 [ 541.092979] ? unlock_page_memcg+0x16/0x80 [ 541.093812] ? page_remove_rmap+0x198/0x520 [ 541.094674] ? mark_page_accessed+0x133/0x200 [ 541.095559] ? _cond_resched+0x1a/0x50 [ 541.096326] ? unmap_page_range+0xcd4/0xe50 [ 541.097179] ? rb_next+0x58/0x80 [ 541.097845] ? rb_next+0x58/0x80 [ 541.098518] __kasan_slab_free+0x13c/0x1a0 [ 541.099352] ? unlink_anon_vmas+0xba/0x2c0 [ 541.100184] kasan_slab_free+0xe/0x10 [ 541.100934] kmem_cache_free+0x89/0x1e0 [ 541.101724] unlink_anon_vmas+0xba/0x2c0 [ 541.102534] free_pgtables+0x101/0x1b0 [ 541.103299] exit_mmap+0x146/0x2a0 [ 541.103996] ? __ia32_sys_munmap+0x50/0x50 [ 541.104829] ? kasan_check_read+0x11/0x20 [ 541.105649] ? mm_update_next_owner+0x322/0x380 [ 541.106578] mmput+0x8b/0x1d0 [ 541.107191] do_exit+0x43a/0x1390 [ 541.107876] ? mm_update_next_owner+0x380/0x380 [ 541.108791] ? deactivate_super+0x5e/0x80 [ 541.109610] ? cleanup_mnt+0x61/0xa0 [ 541.110351] ? __cleanup_mnt+0x12/0x20 [ 541.111115] ? task_work_run+0xc8/0xf0 [ 541.111879] ? exit_to_usermode_loop+0x125/0x130 [ 541.112817] rewind_stack_do_exit+0x17/0x20 [ 541.113666] RIP: 0033:0x7f46624bf487 [ 541.114404] Code: Bad RIP value. [ 541.115094] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 541.116605] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487 [ 541.118034] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0 [ 541.119472] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014 [ 541.120890] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c [ 541.122321] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30 [ 541.124061] The buggy address belongs to the page: [ 541.125042] page:ffffea00078d29c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 541.126651] flags: 0x2ffff0000000000() [ 541.127418] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000 [ 541.128963] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 541.130516] page dumped because: kasan: bad access detected [ 541.131954] Memory state around the buggy address: [ 541.132924] ffff8801e34a7800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00 [ 541.134378] ffff8801e34a7880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 541.135814] >ffff8801e34a7900: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 [ 541.137253] ^ [ 541.138637] ffff8801e34a7980: f1 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 541.140075] ffff8801e34a7a00: 00 00 00 00 00 00 00 00 f3 00 00 00 00 00 00 00 [ 541.141509] ================================================================== - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/inode.c#L512 BUG_ON(inode->i_data.nrpages); The root cause is root directory inode is corrupted, it has both inline_data and inline_dentry flag, and its nlink is zero, so in ->evict(), after dropping all page cache, it grabs page #0 for inline data truncation, result in panic in later clear_inode() where we will check inode->i_data.nrpages value. This patch adds inline flags check in sanity_check_inode, in addition, do sanity check with root inode's nlink. Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
7735730d39 |
f2fs: fix to propagate error from __get_meta_page()
If caller of __get_meta_page() can handle error, let's propagate error from __get_meta_page(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
18dd6470c2 |
f2fs: fix to do sanity check with i_extra_isize
If inode.i_extra_isize was fuzzed to an abnormal value, when calculating inline data size, the result will overflow, result in accessing invalid memory area when operating inline data. Let's do sanity check with i_extra_isize during inode loading for fixing. https://bugzilla.kernel.org/show_bug.cgi?id=200421 - Reproduce - POC (poc.c) #define _GNU_SOURCE #include <sys/types.h> #include <sys/mount.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/xattr.h> #include <dirent.h> #include <errno.h> #include <error.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <linux/falloc.h> #include <linux/loop.h> static void activity(char *mpoint) { char *foo_bar_baz; char *foo_baz; char *xattr; int err; err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint); err = asprintf(&foo_baz, "%s/foo/baz", mpoint); err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint); rename(foo_bar_baz, foo_baz); char buf2[113]; memset(buf2, 0, sizeof(buf2)); listxattr(xattr, buf2, sizeof(buf2)); removexattr(xattr, "user.mime_type"); } int main(int argc, char *argv[]) { activity(argv[1]); return 0; } - Kernel message Umount the image will leave the following message [ 2910.995489] F2FS-fs (loop0): Mounted with checkpoint version = 2 [ 2918.416465] ================================================================== [ 2918.416807] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0xcb9/0x1a80 [ 2918.417009] Read of size 4 at addr ffff88018efc2068 by task a.out/1229 [ 2918.417311] CPU: 1 PID: 1229 Comm: a.out Not tainted 4.17.0+ #1 [ 2918.417314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 2918.417323] Call Trace: [ 2918.417366] dump_stack+0x71/0xab [ 2918.417401] print_address_description+0x6b/0x290 [ 2918.417407] kasan_report+0x28e/0x390 [ 2918.417411] ? f2fs_iget+0xcb9/0x1a80 [ 2918.417415] f2fs_iget+0xcb9/0x1a80 [ 2918.417422] ? f2fs_lookup+0x2e7/0x580 [ 2918.417425] f2fs_lookup+0x2e7/0x580 [ 2918.417433] ? __recover_dot_dentries+0x400/0x400 [ 2918.417447] ? legitimize_path.isra.29+0x5a/0xa0 [ 2918.417453] __lookup_slow+0x11c/0x220 [ 2918.417457] ? may_delete+0x2a0/0x2a0 [ 2918.417475] ? deref_stack_reg+0xe0/0xe0 [ 2918.417479] ? __lookup_hash+0xb0/0xb0 [ 2918.417483] lookup_slow+0x3e/0x60 [ 2918.417488] walk_component+0x3ac/0x990 [ 2918.417492] ? generic_permission+0x51/0x1e0 [ 2918.417495] ? inode_permission+0x51/0x1d0 [ 2918.417499] ? pick_link+0x3e0/0x3e0 [ 2918.417502] ? link_path_walk+0x4b1/0x770 [ 2918.417513] ? _raw_spin_lock_irqsave+0x25/0x50 [ 2918.417518] ? walk_component+0x990/0x990 [ 2918.417522] ? path_init+0x2e6/0x580 [ 2918.417526] path_lookupat+0x13f/0x430 [ 2918.417531] ? trailing_symlink+0x3a0/0x3a0 [ 2918.417534] ? do_renameat2+0x270/0x7b0 [ 2918.417538] ? __kasan_slab_free+0x14c/0x190 [ 2918.417541] ? do_renameat2+0x270/0x7b0 [ 2918.417553] ? kmem_cache_free+0x85/0x1e0 [ 2918.417558] ? do_renameat2+0x270/0x7b0 [ 2918.417563] filename_lookup+0x13c/0x280 [ 2918.417567] ? filename_parentat+0x2b0/0x2b0 [ 2918.417572] ? kasan_unpoison_shadow+0x31/0x40 [ 2918.417575] ? kasan_kmalloc+0xa6/0xd0 [ 2918.417593] ? strncpy_from_user+0xaa/0x1c0 [ 2918.417598] ? getname_flags+0x101/0x2b0 [ 2918.417614] ? path_listxattr+0x87/0x110 [ 2918.417619] path_listxattr+0x87/0x110 [ 2918.417623] ? listxattr+0xc0/0xc0 [ 2918.417637] ? mm_fault_error+0x1b0/0x1b0 [ 2918.417654] do_syscall_64+0x73/0x160 [ 2918.417660] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 2918.417676] RIP: 0033:0x7f2f3a3480d7 [ 2918.417677] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48 [ 2918.417732] RSP: 002b:00007fff4095b7d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000c2 [ 2918.417744] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2f3a3480d7 [ 2918.417746] RDX: 0000000000000071 RSI: 00007fff4095b810 RDI: 000000000126a0c0 [ 2918.417749] RBP: 00007fff4095b890 R08: 000000000126a010 R09: 0000000000000000 [ 2918.417751] R10: 00000000000001ab R11: 0000000000000206 R12: 00000000004005e0 [ 2918.417753] R13: 00007fff4095b990 R14: 0000000000000000 R15: 0000000000000000 [ 2918.417853] Allocated by task 329: [ 2918.418002] kasan_kmalloc+0xa6/0xd0 [ 2918.418007] kmem_cache_alloc+0xc8/0x1e0 [ 2918.418023] mempool_init_node+0x194/0x230 [ 2918.418027] mempool_init+0x12/0x20 [ 2918.418042] bioset_init+0x2bd/0x380 [ 2918.418052] blk_alloc_queue_node+0xe9/0x540 [ 2918.418075] dm_create+0x2c0/0x800 [ 2918.418080] dev_create+0xd2/0x530 [ 2918.418083] ctl_ioctl+0x2a3/0x5b0 [ 2918.418087] dm_ctl_ioctl+0xa/0x10 [ 2918.418092] do_vfs_ioctl+0x13e/0x8c0 [ 2918.418095] ksys_ioctl+0x66/0x70 [ 2918.418098] __x64_sys_ioctl+0x3d/0x50 [ 2918.418102] do_syscall_64+0x73/0x160 [ 2918.418106] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 2918.418204] Freed by task 0: [ 2918.418301] (stack is not available) [ 2918.418521] The buggy address belongs to the object at ffff88018efc0000 which belongs to the cache biovec-max of size 8192 [ 2918.418894] The buggy address is located 104 bytes to the right of 8192-byte region [ffff88018efc0000, ffff88018efc2000) [ 2918.419257] The buggy address belongs to the page: [ 2918.419431] page:ffffea00063bf000 count:1 mapcount:0 mapping:ffff8801f2242540 index:0x0 compound_mapcount: 0 [ 2918.419702] flags: 0x17fff8000008100(slab|head) [ 2918.419879] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f2242540 [ 2918.420101] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000 [ 2918.420322] page dumped because: kasan: bad access detected [ 2918.420599] Memory state around the buggy address: [ 2918.420764] ffff88018efc1f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2918.420975] ffff88018efc1f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2918.421194] >ffff88018efc2000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 2918.421406] ^ [ 2918.421627] ffff88018efc2080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 2918.421838] ffff88018efc2100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 2918.422046] ================================================================== [ 2918.422264] Disabling lock debugging due to kernel taint [ 2923.901641] BUG: unable to handle kernel paging request at ffff88018f0db000 [ 2923.901884] PGD 22226a067 P4D 22226a067 PUD 222273067 PMD 18e642063 PTE 800000018f0db061 [ 2923.902120] Oops: 0003 [#1] SMP KASAN PTI [ 2923.902274] CPU: 1 PID: 1231 Comm: umount Tainted: G B 4.17.0+ #1 [ 2923.902490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 2923.902761] RIP: 0010:__memset+0x24/0x30 [ 2923.902906] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 [ 2923.903446] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206 [ 2923.903622] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb [ 2923.903833] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc [ 2923.904062] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c [ 2923.904273] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000 [ 2923.904485] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80 [ 2923.904693] FS: 00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000 [ 2923.904937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2923.910080] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0 [ 2923.914930] Call Trace: [ 2923.919724] f2fs_truncate_inline_inode+0x114/0x170 [ 2923.924487] f2fs_truncate_blocks+0x11b/0x7c0 [ 2923.929178] ? f2fs_truncate_data_blocks+0x10/0x10 [ 2923.933834] ? dqget+0x670/0x670 [ 2923.938437] ? f2fs_destroy_extent_tree+0xd6/0x270 [ 2923.943107] ? __radix_tree_lookup+0x2f/0x150 [ 2923.947772] f2fs_truncate+0xd4/0x1a0 [ 2923.952491] f2fs_evict_inode+0x5ab/0x610 [ 2923.957204] evict+0x15f/0x280 [ 2923.961898] __dentry_kill+0x161/0x250 [ 2923.966634] shrink_dentry_list+0xf3/0x250 [ 2923.971897] shrink_dcache_parent+0xa9/0x100 [ 2923.976561] ? shrink_dcache_sb+0x1f0/0x1f0 [ 2923.981177] ? wait_for_completion+0x8a/0x210 [ 2923.985781] ? migrate_swap_stop+0x2d0/0x2d0 [ 2923.990332] do_one_tree+0xe/0x40 [ 2923.994735] shrink_dcache_for_umount+0x3a/0xa0 [ 2923.999077] generic_shutdown_super+0x3e/0x1c0 [ 2924.003350] kill_block_super+0x4b/0x70 [ 2924.007619] deactivate_locked_super+0x65/0x90 [ 2924.011812] cleanup_mnt+0x5c/0xa0 [ 2924.015995] task_work_run+0xce/0xf0 [ 2924.020174] exit_to_usermode_loop+0x115/0x120 [ 2924.024293] do_syscall_64+0x12f/0x160 [ 2924.028479] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 2924.032709] RIP: 0033:0x7fa8b2868487 [ 2924.036888] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48 [ 2924.045750] RSP: 002b:00007ffc39824d58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 2924.050190] RAX: 0000000000000000 RBX: 00000000008ea030 RCX: 00007fa8b2868487 [ 2924.054604] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000008f4360 [ 2924.058940] RBP: 00000000008f4360 R08: 0000000000000000 R09: 0000000000000014 [ 2924.063186] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007fa8b2d7183c [ 2924.067418] R13: 0000000000000000 R14: 00000000008ea210 R15: 00007ffc39824fe0 [ 2924.071534] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy [ 2924.098044] CR2: ffff88018f0db000 [ 2924.102520] ---[ end trace a8e0d899985faf31 ]--- [ 2924.107012] RIP: 0010:__memset+0x24/0x30 [ 2924.111448] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 [ 2924.120724] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206 [ 2924.125312] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb [ 2924.129931] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc [ 2924.134537] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c [ 2924.139175] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000 [ 2924.143825] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80 [ 2924.148500] FS: 00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000 [ 2924.153247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2924.158003] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0 [ 2924.164641] BUG: Bad rss-counter state mm:00000000fa04621e idx:0 val:4 [ 2924.170007] BUG: Bad rss-counter tate mm:00000000fa04621e idx:1 val:2 - Location https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/inline.c#L78 memset(addr + from, 0, MAX_INLINE_DATA(inode) - from); Here the length can be negative. Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
4c6b56c002 |
f2fs: clean up with IS_INODE()
Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
c9b60788fc |
f2fs: fix to do sanity check with block address in main area
This patch add to do sanity check with below field: - cp_pack_total_block_count - blkaddr of data/node - extent info - Overview BUG() in verify_block_addr() when writing to a corrupted f2fs image - Reproduce (4.18 upstream kernel) - POC (poc.c) static void activity(char *mpoint) { char *foo_bar_baz; int err; static int buf[8192]; memset(buf, 0, sizeof(buf)); err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint); int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777); if (fd >= 0) { write(fd, (char *)buf, sizeof(buf)); fdatasync(fd); close(fd); } } int main(int argc, char *argv[]) { activity(argv[1]); return 0; } - Kernel message [ 689.349473] F2FS-fs (loop0): Mounted with checkpoint version = 3 [ 699.728662] WARNING: CPU: 0 PID: 1309 at fs/f2fs/segment.c:2860 f2fs_inplace_write_data+0x232/0x240 [ 699.728670] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 699.729056] CPU: 0 PID: 1309 Comm: a.out Not tainted 4.18.0-rc1+ #4 [ 699.729064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 699.729074] RIP: 0010:f2fs_inplace_write_data+0x232/0x240 [ 699.729076] Code: ff e9 cf fe ff ff 49 8d 7d 10 e8 39 45 ad ff 4d 8b 7d 10 be 04 00 00 00 49 8d 7f 48 e8 07 49 ad ff 45 8b 7f 48 e9 fb fe ff ff <0f> 0b f0 41 80 4d 48 04 e9 65 fe ff ff 90 66 66 66 66 90 55 48 8d [ 699.729130] RSP: 0018:ffff8801f43af568 EFLAGS: 00010202 [ 699.729139] RAX: 000000000000003f RBX: ffff8801f43af7b8 RCX: ffffffffb88c9113 [ 699.729142] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8802024e5540 [ 699.729144] RBP: ffff8801f43af590 R08: 0000000000000009 R09: ffffffffffffffe8 [ 699.729147] R10: 0000000000000001 R11: ffffed0039b0596a R12: ffff8802024e5540 [ 699.729149] R13: ffff8801f0335500 R14: ffff8801e3e7a700 R15: ffff8801e1ee4450 [ 699.729154] FS: 00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 699.729156] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 699.729159] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0 [ 699.729171] Call Trace: [ 699.729192] f2fs_do_write_data_page+0x2e2/0xe00 [ 699.729203] ? f2fs_should_update_outplace+0xd0/0xd0 [ 699.729238] ? memcg_drain_all_list_lrus+0x280/0x280 [ 699.729269] ? __radix_tree_replace+0xa3/0x120 [ 699.729276] __write_data_page+0x5c7/0xe30 [ 699.729291] ? kasan_check_read+0x11/0x20 [ 699.729310] ? page_mapped+0x8a/0x110 [ 699.729321] ? page_mkclean+0xe9/0x160 [ 699.729327] ? f2fs_do_write_data_page+0xe00/0xe00 [ 699.729331] ? invalid_page_referenced_vma+0x130/0x130 [ 699.729345] ? clear_page_dirty_for_io+0x332/0x450 [ 699.729351] f2fs_write_cache_pages+0x4ca/0x860 [ 699.729358] ? __write_data_page+0xe30/0xe30 [ 699.729374] ? percpu_counter_add_batch+0x22/0xa0 [ 699.729380] ? kasan_check_write+0x14/0x20 [ 699.729391] ? _raw_spin_lock+0x17/0x40 [ 699.729403] ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30 [ 699.729413] ? iov_iter_advance+0x113/0x640 [ 699.729418] ? f2fs_write_end+0x133/0x2e0 [ 699.729423] ? balance_dirty_pages_ratelimited+0x239/0x640 [ 699.729428] f2fs_write_data_pages+0x329/0x520 [ 699.729433] ? generic_perform_write+0x250/0x320 [ 699.729438] ? f2fs_write_cache_pages+0x860/0x860 [ 699.729454] ? current_time+0x110/0x110 [ 699.729459] ? f2fs_preallocate_blocks+0x1ef/0x370 [ 699.729464] do_writepages+0x37/0xb0 [ 699.729468] ? f2fs_write_cache_pages+0x860/0x860 [ 699.729472] ? do_writepages+0x37/0xb0 [ 699.729478] __filemap_fdatawrite_range+0x19a/0x1f0 [ 699.729483] ? delete_from_page_cache_batch+0x4e0/0x4e0 [ 699.729496] ? __vfs_write+0x2b2/0x410 [ 699.729501] file_write_and_wait_range+0x66/0xb0 [ 699.729506] f2fs_do_sync_file+0x1f9/0xd90 [ 699.729511] ? truncate_partial_data_page+0x290/0x290 [ 699.729521] ? __sb_end_write+0x30/0x50 [ 699.729526] ? vfs_write+0x20f/0x260 [ 699.729530] f2fs_sync_file+0x9a/0xb0 [ 699.729534] ? f2fs_do_sync_file+0xd90/0xd90 [ 699.729548] vfs_fsync_range+0x68/0x100 [ 699.729554] ? __fget_light+0xc9/0xe0 [ 699.729558] do_fsync+0x3d/0x70 [ 699.729562] __x64_sys_fdatasync+0x24/0x30 [ 699.729585] do_syscall_64+0x78/0x170 [ 699.729595] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 699.729613] RIP: 0033:0x7f9bf930d800 [ 699.729615] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24 [ 699.729668] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b [ 699.729673] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800 [ 699.729675] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003 [ 699.729678] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000 [ 699.729680] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610 [ 699.729683] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000 [ 699.729687] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 699.729782] ------------[ cut here ]------------ [ 699.729785] kernel BUG at fs/f2fs/segment.h:654! [ 699.731055] invalid opcode: 0000 [#1] SMP KASAN PTI [ 699.732104] CPU: 0 PID: 1309 Comm: a.out Tainted: G W 4.18.0-rc1+ #4 [ 699.733684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 699.735611] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730 [ 699.736649] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0 [ 699.740524] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283 [ 699.741573] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef [ 699.743006] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c [ 699.744426] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55 [ 699.745833] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940 [ 699.747256] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001 [ 699.748683] FS: 00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 699.750293] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 699.751462] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0 [ 699.752874] Call Trace: [ 699.753386] ? f2fs_inplace_write_data+0x93/0x240 [ 699.754341] f2fs_inplace_write_data+0xd2/0x240 [ 699.755271] f2fs_do_write_data_page+0x2e2/0xe00 [ 699.756214] ? f2fs_should_update_outplace+0xd0/0xd0 [ 699.757215] ? memcg_drain_all_list_lrus+0x280/0x280 [ 699.758209] ? __radix_tree_replace+0xa3/0x120 [ 699.759164] __write_data_page+0x5c7/0xe30 [ 699.760002] ? kasan_check_read+0x11/0x20 [ 699.760823] ? page_mapped+0x8a/0x110 [ 699.761573] ? page_mkclean+0xe9/0x160 [ 699.762345] ? f2fs_do_write_data_page+0xe00/0xe00 [ 699.763332] ? invalid_page_referenced_vma+0x130/0x130 [ 699.764374] ? clear_page_dirty_for_io+0x332/0x450 [ 699.765347] f2fs_write_cache_pages+0x4ca/0x860 [ 699.766276] ? __write_data_page+0xe30/0xe30 [ 699.767161] ? percpu_counter_add_batch+0x22/0xa0 [ 699.768112] ? kasan_check_write+0x14/0x20 [ 699.768951] ? _raw_spin_lock+0x17/0x40 [ 699.769739] ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30 [ 699.770885] ? iov_iter_advance+0x113/0x640 [ 699.771743] ? f2fs_write_end+0x133/0x2e0 [ 699.772569] ? balance_dirty_pages_ratelimited+0x239/0x640 [ 699.773680] f2fs_write_data_pages+0x329/0x520 [ 699.774603] ? generic_perform_write+0x250/0x320 [ 699.775544] ? f2fs_write_cache_pages+0x860/0x860 [ 699.776510] ? current_time+0x110/0x110 [ 699.777299] ? f2fs_preallocate_blocks+0x1ef/0x370 [ 699.778279] do_writepages+0x37/0xb0 [ 699.779026] ? f2fs_write_cache_pages+0x860/0x860 [ 699.779978] ? do_writepages+0x37/0xb0 [ 699.780755] __filemap_fdatawrite_range+0x19a/0x1f0 [ 699.781746] ? delete_from_page_cache_batch+0x4e0/0x4e0 [ 699.782820] ? __vfs_write+0x2b2/0x410 [ 699.783597] file_write_and_wait_range+0x66/0xb0 [ 699.784540] f2fs_do_sync_file+0x1f9/0xd90 [ 699.785381] ? truncate_partial_data_page+0x290/0x290 [ 699.786415] ? __sb_end_write+0x30/0x50 [ 699.787204] ? vfs_write+0x20f/0x260 [ 699.787941] f2fs_sync_file+0x9a/0xb0 [ 699.788694] ? f2fs_do_sync_file+0xd90/0xd90 [ 699.789572] vfs_fsync_range+0x68/0x100 [ 699.790360] ? __fget_light+0xc9/0xe0 [ 699.791128] do_fsync+0x3d/0x70 [ 699.791779] __x64_sys_fdatasync+0x24/0x30 [ 699.792614] do_syscall_64+0x78/0x170 [ 699.793371] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 699.794406] RIP: 0033:0x7f9bf930d800 [ 699.795134] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24 [ 699.798960] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b [ 699.800483] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800 [ 699.801923] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003 [ 699.803373] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000 [ 699.804798] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610 [ 699.806233] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000 [ 699.807667] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 699.817079] ---[ end trace 4ce02f25ff7d3df6 ]--- [ 699.818068] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730 [ 699.819114] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0 [ 699.822919] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283 [ 699.823977] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef [ 699.825436] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c [ 699.826881] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55 [ 699.828292] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940 [ 699.829750] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001 [ 699.831192] FS: 00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 699.832793] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 699.833981] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0 [ 699.835556] ================================================================== [ 699.837029] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0 [ 699.838462] Read of size 8 at addr ffff8801f43af970 by task a.out/1309 [ 699.840086] CPU: 0 PID: 1309 Comm: a.out Tainted: G D W 4.18.0-rc1+ #4 [ 699.841603] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 699.843475] Call Trace: [ 699.843982] dump_stack+0x7b/0xb5 [ 699.844661] print_address_description+0x70/0x290 [ 699.845607] kasan_report+0x291/0x390 [ 699.846351] ? update_stack_state+0x38c/0x3e0 [ 699.853831] __asan_load8+0x54/0x90 [ 699.854569] update_stack_state+0x38c/0x3e0 [ 699.855428] ? __read_once_size_nocheck.constprop.7+0x20/0x20 [ 699.856601] ? __save_stack_trace+0x5e/0x100 [ 699.857476] unwind_next_frame.part.5+0x18e/0x490 [ 699.858448] ? unwind_dump+0x290/0x290 [ 699.859217] ? clear_page_dirty_for_io+0x332/0x450 [ 699.860185] __unwind_start+0x106/0x190 [ 699.860974] __save_stack_trace+0x5e/0x100 [ 699.861808] ? __save_stack_trace+0x5e/0x100 [ 699.862691] ? unlink_anon_vmas+0xba/0x2c0 [ 699.863525] save_stack_trace+0x1f/0x30 [ 699.864312] save_stack+0x46/0xd0 [ 699.864993] ? __alloc_pages_slowpath+0x1420/0x1420 [ 699.865990] ? flush_tlb_mm_range+0x15e/0x220 [ 699.866889] ? kasan_check_write+0x14/0x20 [ 699.867724] ? __dec_node_state+0x92/0xb0 [ 699.868543] ? lock_page_memcg+0x85/0xf0 [ 699.869350] ? unlock_page_memcg+0x16/0x80 [ 699.870185] ? page_remove_rmap+0x198/0x520 [ 699.871048] ? mark_page_accessed+0x133/0x200 [ 699.871930] ? _cond_resched+0x1a/0x50 [ 699.872700] ? unmap_page_range+0xcd4/0xe50 [ 699.873551] ? rb_next+0x58/0x80 [ 699.874217] ? rb_next+0x58/0x80 [ 699.874895] __kasan_slab_free+0x13c/0x1a0 [ 699.875734] ? unlink_anon_vmas+0xba/0x2c0 [ 699.876563] kasan_slab_free+0xe/0x10 [ 699.877315] kmem_cache_free+0x89/0x1e0 [ 699.878095] unlink_anon_vmas+0xba/0x2c0 [ 699.878913] free_pgtables+0x101/0x1b0 [ 699.879677] exit_mmap+0x146/0x2a0 [ 699.880378] ? __ia32_sys_munmap+0x50/0x50 [ 699.881214] ? kasan_check_read+0x11/0x20 [ 699.882052] ? mm_update_next_owner+0x322/0x380 [ 699.882985] mmput+0x8b/0x1d0 [ 699.883602] do_exit+0x43a/0x1390 [ 699.884288] ? mm_update_next_owner+0x380/0x380 [ 699.885212] ? f2fs_sync_file+0x9a/0xb0 [ 699.885995] ? f2fs_do_sync_file+0xd90/0xd90 [ 699.886877] ? vfs_fsync_range+0x68/0x100 [ 699.887694] ? __fget_light+0xc9/0xe0 [ 699.888442] ? do_fsync+0x3d/0x70 [ 699.889118] ? __x64_sys_fdatasync+0x24/0x30 [ 699.889996] rewind_stack_do_exit+0x17/0x20 [ 699.890860] RIP: 0033:0x7f9bf930d800 [ 699.891585] Code: Bad RIP value. [ 699.892268] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b [ 699.893781] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800 [ 699.895220] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003 [ 699.896643] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000 [ 699.898069] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610 [ 699.899505] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000 [ 699.901241] The buggy address belongs to the page: [ 699.902215] page:ffffea0007d0ebc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 699.903811] flags: 0x2ffff0000000000() [ 699.904585] raw: 02ffff0000000000 0000000000000000 ffffffff07d00101 0000000000000000 [ 699.906125] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000 [ 699.907673] page dumped because: kasan: bad access detected [ 699.909108] Memory state around the buggy address: [ 699.910077] ffff8801f43af800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00 [ 699.911528] ffff8801f43af880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 699.912953] >ffff8801f43af900: 00 00 00 00 00 00 00 00 f1 01 f4 f4 f4 f2 f2 f2 [ 699.914392] ^ [ 699.915758] ffff8801f43af980: f2 00 f4 f4 00 00 00 00 f2 00 00 00 00 00 00 00 [ 699.917193] ffff8801f43afa00: 00 00 00 00 00 00 00 00 00 f3 f3 f3 00 00 00 00 [ 699.918634] ================================================================== - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L644 Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
e34438c903 |
f2fs: fix to do sanity check with node footer and iblocks
This patch adds to do sanity check with below fields of inode to avoid reported panic. - node footer - iblocks https://bugzilla.kernel.org/show_bug.cgi?id=200223 - Overview BUG() triggered in f2fs_truncate_inode_blocks() when un-mounting a mounted f2fs image after writing to it - Reproduce - POC (poc.c) static void activity(char *mpoint) { char *foo_bar_baz; int err; static int buf[8192]; memset(buf, 0, sizeof(buf)); err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint); // open / write / read int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777); if (fd >= 0) { write(fd, (char *)buf, 517); write(fd, (char *)buf, sizeof(buf)); close(fd); } } int main(int argc, char *argv[]) { activity(argv[1]); return 0; } - Kernel meesage [ 552.479723] F2FS-fs (loop0): Mounted with checkpoint version = 2 [ 556.451891] ------------[ cut here ]------------ [ 556.451899] kernel BUG at fs/f2fs/node.c:987! [ 556.452920] invalid opcode: 0000 [#1] SMP KASAN PTI [ 556.453936] CPU: 1 PID: 1310 Comm: umount Not tainted 4.18.0-rc1+ #4 [ 556.455213] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 556.457140] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0 [ 556.458280] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5 [ 556.462015] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286 [ 556.463068] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc [ 556.464479] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164 [ 556.465901] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d [ 556.467311] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64 [ 556.468706] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801 [ 556.470117] FS: 00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 556.471702] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 556.472838] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0 [ 556.474265] Call Trace: [ 556.474782] ? f2fs_alloc_nid_failed+0xf0/0xf0 [ 556.475686] ? truncate_nodes+0x980/0x980 [ 556.476516] ? pagecache_get_page+0x21f/0x2f0 [ 556.477412] ? __asan_loadN+0xf/0x20 [ 556.478153] ? __get_node_page+0x331/0x5b0 [ 556.478992] ? reweight_entity+0x1e6/0x3b0 [ 556.479826] f2fs_truncate_blocks+0x55e/0x740 [ 556.480709] ? f2fs_truncate_data_blocks+0x20/0x20 [ 556.481689] ? __radix_tree_lookup+0x34/0x160 [ 556.482630] ? radix_tree_lookup+0xd/0x10 [ 556.483445] f2fs_truncate+0xd4/0x1a0 [ 556.484206] f2fs_evict_inode+0x5ce/0x630 [ 556.485032] evict+0x16f/0x290 [ 556.485664] iput+0x280/0x300 [ 556.486300] dentry_unlink_inode+0x165/0x1e0 [ 556.487169] __dentry_kill+0x16a/0x260 [ 556.487936] dentry_kill+0x70/0x250 [ 556.488651] shrink_dentry_list+0x125/0x260 [ 556.489504] shrink_dcache_parent+0xc1/0x110 [ 556.490379] ? shrink_dcache_sb+0x200/0x200 [ 556.491231] ? bit_wait_timeout+0xc0/0xc0 [ 556.492047] do_one_tree+0x12/0x40 [ 556.492743] shrink_dcache_for_umount+0x3f/0xa0 [ 556.493656] generic_shutdown_super+0x43/0x1c0 [ 556.494561] kill_block_super+0x52/0x80 [ 556.495341] kill_f2fs_super+0x62/0x70 [ 556.496105] deactivate_locked_super+0x6f/0xa0 [ 556.497004] deactivate_super+0x5e/0x80 [ 556.497785] cleanup_mnt+0x61/0xa0 [ 556.498492] __cleanup_mnt+0x12/0x20 [ 556.499218] task_work_run+0xc8/0xf0 [ 556.499949] exit_to_usermode_loop+0x125/0x130 [ 556.500846] do_syscall_64+0x138/0x170 [ 556.501609] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 556.502659] RIP: 0033:0x7f8028b77487 [ 556.503384] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48 [ 556.507137] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 556.508637] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487 [ 556.510069] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0 [ 556.511481] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014 [ 556.512892] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c [ 556.514320] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820 [ 556.515745] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 556.529276] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 556.530340] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0 [ 556.531513] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5 [ 556.535330] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286 [ 556.536395] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc [ 556.537824] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164 [ 556.539290] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d [ 556.540709] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64 [ 556.542131] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801 [ 556.543579] FS: 00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 556.545180] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 556.546338] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0 [ 556.547809] ================================================================== [ 556.549248] BUG: KASAN: stack-out-of-bounds in arch_tlb_gather_mmu+0x52/0x170 [ 556.550672] Write of size 8 at addr ffff8801f292fd10 by task umount/1310 [ 556.552338] CPU: 1 PID: 1310 Comm: umount Tainted: G D 4.18.0-rc1+ #4 [ 556.553886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 556.555756] Call Trace: [ 556.556264] dump_stack+0x7b/0xb5 [ 556.556944] print_address_description+0x70/0x290 [ 556.557903] kasan_report+0x291/0x390 [ 556.558649] ? arch_tlb_gather_mmu+0x52/0x170 [ 556.559537] __asan_store8+0x57/0x90 [ 556.560268] arch_tlb_gather_mmu+0x52/0x170 [ 556.561110] tlb_gather_mmu+0x12/0x40 [ 556.561862] exit_mmap+0x123/0x2a0 [ 556.562555] ? __ia32_sys_munmap+0x50/0x50 [ 556.563384] ? exit_aio+0x98/0x230 [ 556.564079] ? __x32_compat_sys_io_submit+0x260/0x260 [ 556.565099] ? taskstats_exit+0x1f4/0x640 [ 556.565925] ? kasan_check_read+0x11/0x20 [ 556.566739] ? mm_update_next_owner+0x322/0x380 [ 556.567652] mmput+0x8b/0x1d0 [ 556.568260] do_exit+0x43a/0x1390 [ 556.568937] ? mm_update_next_owner+0x380/0x380 [ 556.569855] ? deactivate_super+0x5e/0x80 [ 556.570668] ? cleanup_mnt+0x61/0xa0 [ 556.571395] ? __cleanup_mnt+0x12/0x20 [ 556.572156] ? task_work_run+0xc8/0xf0 [ 556.572917] ? exit_to_usermode_loop+0x125/0x130 [ 556.573861] rewind_stack_do_exit+0x17/0x20 [ 556.574707] RIP: 0033:0x7f8028b77487 [ 556.575428] Code: Bad RIP value. [ 556.576106] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 556.577599] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487 [ 556.579020] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0 [ 556.580422] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014 [ 556.581833] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c [ 556.583252] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820 [ 556.584983] The buggy address belongs to the page: [ 556.585961] page:ffffea0007ca4bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 556.587540] flags: 0x2ffff0000000000() [ 556.588296] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000 [ 556.589822] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 556.591359] page dumped because: kasan: bad access detected [ 556.592786] Memory state around the buggy address: [ 556.593753] ffff8801f292fc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 556.595191] ffff8801f292fc80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 [ 556.596613] >ffff8801f292fd00: 00 00 f3 00 00 00 00 f3 f3 00 00 00 00 f4 f4 f4 [ 556.598044] ^ [ 556.598797] ffff8801f292fd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 556.600225] ffff8801f292fe00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 [ 556.601647] ================================================================== - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/node.c#L987 case NODE_DIND_BLOCK: err = truncate_nodes(&dn, nofs, offset[1], 3); cont = 0; break; default: BUG(); <--- } Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
76d56d4ab4 |
f2fs: fix to do sanity check with extra_attr feature
If FI_EXTRA_ATTR is set in inode by fuzzing, inode.i_addr[0] will be parsed as inode.i_extra_isize, then in __recover_inline_status, inline data address will beyond boundary of page, result in accessing invalid memory. So in this condition, during reading inode page, let's do sanity check with EXTRA_ATTR feature of fs and extra_attr bit of inode, if they're inconsistent, deny to load this inode. - Overview Out-of-bound access in f2fs_iget() when mounting a corrupted f2fs image - Reproduce The following message will be got in KASAN build of 4.18 upstream kernel. [ 819.392227] ================================================================== [ 819.393901] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0x736/0x1530 [ 819.395329] Read of size 4 at addr ffff8801f099c968 by task mount/1292 [ 819.397079] CPU: 1 PID: 1292 Comm: mount Not tainted 4.18.0-rc1+ #4 [ 819.397082] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 819.397088] Call Trace: [ 819.397124] dump_stack+0x7b/0xb5 [ 819.397154] print_address_description+0x70/0x290 [ 819.397159] kasan_report+0x291/0x390 [ 819.397163] ? f2fs_iget+0x736/0x1530 [ 819.397176] check_memory_region+0x139/0x190 [ 819.397182] __asan_loadN+0xf/0x20 [ 819.397185] f2fs_iget+0x736/0x1530 [ 819.397197] f2fs_fill_super+0x1b4f/0x2b40 [ 819.397202] ? f2fs_fill_super+0x1b4f/0x2b40 [ 819.397208] ? f2fs_commit_super+0x1b0/0x1b0 [ 819.397227] ? set_blocksize+0x90/0x140 [ 819.397241] mount_bdev+0x1c5/0x210 [ 819.397245] ? f2fs_commit_super+0x1b0/0x1b0 [ 819.397252] f2fs_mount+0x15/0x20 [ 819.397256] mount_fs+0x60/0x1a0 [ 819.397267] ? alloc_vfsmnt+0x309/0x360 [ 819.397272] vfs_kern_mount+0x6b/0x1a0 [ 819.397282] do_mount+0x34a/0x18c0 [ 819.397300] ? lockref_put_or_lock+0xcf/0x160 [ 819.397306] ? copy_mount_string+0x20/0x20 [ 819.397318] ? memcg_kmem_put_cache+0x1b/0xa0 [ 819.397324] ? kasan_check_write+0x14/0x20 [ 819.397334] ? _copy_from_user+0x6a/0x90 [ 819.397353] ? memdup_user+0x42/0x60 [ 819.397359] ksys_mount+0x83/0xd0 [ 819.397365] __x64_sys_mount+0x67/0x80 [ 819.397388] do_syscall_64+0x78/0x170 [ 819.397403] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 819.397422] RIP: 0033:0x7f54c667cb9a [ 819.397424] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 819.397483] RSP: 002b:00007ffd8f46cd08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 819.397496] RAX: ffffffffffffffda RBX: 0000000000dfa030 RCX: 00007f54c667cb9a [ 819.397498] RDX: 0000000000dfa210 RSI: 0000000000dfbf30 RDI: 0000000000e02ec0 [ 819.397501] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 819.397503] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000000000e02ec0 [ 819.397505] R13: 0000000000dfa210 R14: 0000000000000000 R15: 0000000000000003 [ 819.397866] Allocated by task 139: [ 819.398702] save_stack+0x46/0xd0 [ 819.398705] kasan_kmalloc+0xad/0xe0 [ 819.398709] kasan_slab_alloc+0x11/0x20 [ 819.398713] kmem_cache_alloc+0xd1/0x1e0 [ 819.398717] dup_fd+0x50/0x4c0 [ 819.398740] copy_process.part.37+0xbed/0x32e0 [ 819.398744] _do_fork+0x16e/0x590 [ 819.398748] __x64_sys_clone+0x69/0x80 [ 819.398752] do_syscall_64+0x78/0x170 [ 819.398756] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 819.399097] Freed by task 159: [ 819.399743] save_stack+0x46/0xd0 [ 819.399747] __kasan_slab_free+0x13c/0x1a0 [ 819.399750] kasan_slab_free+0xe/0x10 [ 819.399754] kmem_cache_free+0x89/0x1e0 [ 819.399757] put_files_struct+0x132/0x150 [ 819.399761] exit_files+0x62/0x70 [ 819.399766] do_exit+0x47b/0x1390 [ 819.399770] do_group_exit+0x86/0x130 [ 819.399774] __x64_sys_exit_group+0x2c/0x30 [ 819.399778] do_syscall_64+0x78/0x170 [ 819.399782] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 819.400115] The buggy address belongs to the object at ffff8801f099c680 which belongs to the cache files_cache of size 704 [ 819.403234] The buggy address is located 40 bytes to the right of 704-byte region [ffff8801f099c680, ffff8801f099c940) [ 819.405689] The buggy address belongs to the page: [ 819.406709] page:ffffea0007c26700 count:1 mapcount:0 mapping:ffff8801f69a3340 index:0xffff8801f099d380 compound_mapcount: 0 [ 819.408984] flags: 0x2ffff0000008100(slab|head) [ 819.409932] raw: 02ffff0000008100 ffffea00077fb600 0000000200000002 ffff8801f69a3340 [ 819.411514] raw: ffff8801f099d380 0000000080130000 00000001ffffffff 0000000000000000 [ 819.413073] page dumped because: kasan: bad access detected [ 819.414539] Memory state around the buggy address: [ 819.415521] ffff8801f099c800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 819.416981] ffff8801f099c880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 819.418454] >ffff8801f099c900: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 819.419921] ^ [ 819.421265] ffff8801f099c980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb [ 819.422745] ffff8801f099ca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 819.424206] ================================================================== [ 819.425668] Disabling lock debugging due to kernel taint [ 819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3 The kernel still mounts the image. If you run the following program on the mounted folder mnt, (poc.c) static void activity(char *mpoint) { char *foo_bar_baz; int err; static int buf[8192]; memset(buf, 0, sizeof(buf)); err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint); int fd = open(foo_bar_baz, O_RDONLY, 0); if (fd >= 0) { read(fd, (char *)buf, 11); close(fd); } } int main(int argc, char *argv[]) { activity(argv[1]); return 0; } You can get kernel crash: [ 819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3 [ 918.028501] BUG: unable to handle kernel paging request at ffffed0048000d82 [ 918.044020] PGD 23ffee067 P4D 23ffee067 PUD 23fbef067 PMD 0 [ 918.045207] Oops: 0000 [#1] SMP KASAN PTI [ 918.046048] CPU: 0 PID: 1309 Comm: poc Tainted: G B 4.18.0-rc1+ #4 [ 918.047573] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 918.049552] RIP: 0010:check_memory_region+0x5e/0x190 [ 918.050565] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0 [ 918.054322] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202 [ 918.055400] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14 [ 918.056832] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10 [ 918.058253] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82 [ 918.059717] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83 [ 918.061159] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08 [ 918.062614] FS: 00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 918.064246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 918.065412] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0 [ 918.066882] Call Trace: [ 918.067410] __asan_loadN+0xf/0x20 [ 918.068149] f2fs_find_target_dentry+0xf4/0x270 [ 918.069083] ? __get_node_page+0x331/0x5b0 [ 918.069925] f2fs_find_in_inline_dir+0x24b/0x310 [ 918.070881] ? f2fs_recover_inline_data+0x4c0/0x4c0 [ 918.071905] ? unwind_next_frame.part.5+0x34f/0x490 [ 918.072901] ? unwind_dump+0x290/0x290 [ 918.073695] ? is_bpf_text_address+0xe/0x20 [ 918.074566] __f2fs_find_entry+0x599/0x670 [ 918.075408] ? kasan_unpoison_shadow+0x36/0x50 [ 918.076315] ? kasan_kmalloc+0xad/0xe0 [ 918.077100] ? memcg_kmem_put_cache+0x55/0xa0 [ 918.077998] ? f2fs_find_target_dentry+0x270/0x270 [ 918.079006] ? d_set_d_op+0x30/0x100 [ 918.079749] ? __d_lookup_rcu+0x69/0x2e0 [ 918.080556] ? __d_alloc+0x275/0x450 [ 918.081297] ? kasan_check_write+0x14/0x20 [ 918.082135] ? memset+0x31/0x40 [ 918.082820] ? fscrypt_setup_filename+0x1ec/0x4c0 [ 918.083782] ? d_alloc_parallel+0x5bb/0x8c0 [ 918.084640] f2fs_find_entry+0xe9/0x110 [ 918.085432] ? __f2fs_find_entry+0x670/0x670 [ 918.086308] ? kasan_check_write+0x14/0x20 [ 918.087163] f2fs_lookup+0x297/0x590 [ 918.087902] ? f2fs_link+0x2b0/0x2b0 [ 918.088646] ? legitimize_path.isra.29+0x61/0xa0 [ 918.089589] __lookup_slow+0x12e/0x240 [ 918.090371] ? may_delete+0x2b0/0x2b0 [ 918.091123] ? __nd_alloc_stack+0xa0/0xa0 [ 918.091944] lookup_slow+0x44/0x60 [ 918.092642] walk_component+0x3ee/0xa40 [ 918.093428] ? is_bpf_text_address+0xe/0x20 [ 918.094283] ? pick_link+0x3e0/0x3e0 [ 918.095047] ? in_group_p+0xa5/0xe0 [ 918.095771] ? generic_permission+0x53/0x1e0 [ 918.096666] ? security_inode_permission+0x1d/0x70 [ 918.097646] ? inode_permission+0x7a/0x1f0 [ 918.098497] link_path_walk+0x2a2/0x7b0 [ 918.099298] ? apparmor_capget+0x3d0/0x3d0 [ 918.100140] ? walk_component+0xa40/0xa40 [ 918.100958] ? path_init+0x2e6/0x580 [ 918.101695] path_openat+0x1bb/0x2160 [ 918.102471] ? __save_stack_trace+0x92/0x100 [ 918.103352] ? save_stack+0xb5/0xd0 [ 918.104070] ? vfs_unlink+0x250/0x250 [ 918.104822] ? save_stack+0x46/0xd0 [ 918.105538] ? kasan_slab_alloc+0x11/0x20 [ 918.106370] ? kmem_cache_alloc+0xd1/0x1e0 [ 918.107213] ? getname_flags+0x76/0x2c0 [ 918.107997] ? getname+0x12/0x20 [ 918.108677] ? do_sys_open+0x14b/0x2c0 [ 918.109450] ? __x64_sys_open+0x4c/0x60 [ 918.110255] ? do_syscall_64+0x78/0x170 [ 918.111083] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 918.112148] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 918.113204] ? f2fs_empty_inline_dir+0x1e0/0x1e0 [ 918.114150] ? timespec64_trunc+0x5c/0x90 [ 918.114993] ? wb_io_lists_depopulated+0x1a/0xc0 [ 918.115937] ? inode_io_list_move_locked+0x102/0x110 [ 918.116949] do_filp_open+0x12b/0x1d0 [ 918.117709] ? may_open_dev+0x50/0x50 [ 918.118475] ? kasan_kmalloc+0xad/0xe0 [ 918.119246] do_sys_open+0x17c/0x2c0 [ 918.119983] ? do_sys_open+0x17c/0x2c0 [ 918.120751] ? filp_open+0x60/0x60 [ 918.121463] ? task_work_run+0x4d/0xf0 [ 918.122237] __x64_sys_open+0x4c/0x60 [ 918.123001] do_syscall_64+0x78/0x170 [ 918.123759] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 918.124802] RIP: 0033:0x7fac96e3e040 [ 918.125537] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 09 27 2d 00 00 75 10 b8 02 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e e0 01 00 48 89 04 24 [ 918.129341] RSP: 002b:00007fff1b37f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000002 [ 918.130870] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fac96e3e040 [ 918.132295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000122d080 [ 918.133748] RBP: 00007fff1b37f9b0 R08: 00007fac9710bbd8 R09: 0000000000000001 [ 918.135209] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000400c20 [ 918.136650] R13: 00007fff1b37fab0 R14: 0000000000000000 R15: 0000000000000000 [ 918.138093] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 918.147924] CR2: ffffed0048000d82 [ 918.148619] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 918.149563] RIP: 0010:check_memory_region+0x5e/0x190 [ 918.150576] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0 [ 918.154360] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202 [ 918.155411] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14 [ 918.156833] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10 [ 918.158257] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82 [ 918.159722] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83 [ 918.161149] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08 [ 918.162587] FS: 00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 918.164203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 918.165356] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0 Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
e1da7872f6 |
f2fs: introduce and spread verify_blkaddr
This patch introduces verify_blkaddr to check meta/data block address with valid range to detect bug earlier. In addition, once we encounter an invalid blkaddr, notice user to run fsck to fix, and let the kernel panic. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
24b81dfcb7 |
f2fs: use timespec64 for inode timestamps
The on-disk representation and the vfs both use 64-bit tv_sec values, so let's change the last missing piece in the middle. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
54c55c4e4f |
f2fs: support in-memory inode checksum when checking consistency
Enable in-memory inode checksum to protect metadata blocks from in-memory scribbles when checking consistency, which has no performance requirements. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |
||
![]() |
83a3bfdb5a |
f2fs: indicate shutdown f2fs to allow unmount successfully
Once we shutdown f2fs, we have to flush stale pages in order to unmount the system. In order to make stable, we need to stop fault injection as well. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> |