Commit Graph

873 Commits

Author SHA1 Message Date
Dr. David Alan Gilbert
ca7be9c0a1 mtd: ubi: Remove unused ubi_flush
ubi_flush() was added in 2012 as part of
commit 62f384552b ("UBI: modify ubi_wl_flush function to clear work queue
for a lnum")
but has remained unused.

(Its friend ubi_wl_flush() is still used.)

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2025-07-24 22:19:17 +02:00
Anuj Gupta
75618ac6e9 block: remove unused parameter 'q' in __blk_rq_map_sg()
The request_queue parameter is no longer used by blk_rq_map_sg() and
__blk_rq_map_sg(). Remove it.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250313035322.243239-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-13 05:46:19 -06:00
Linus Torvalds
350130afc2 This pull request contains updates for UBI and UBIFS:
UBI:
 - New interface to dump detailed erase counters
 - Fixes around wear-leveling
 
 UBIFS:
 - Minor cleanups
 - Fix for TNC dumping code

Merge tag 'ubifs-for-linus-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBI and UBIFS updates from Richard Weinberger:
 "UBI:
   - New interface to dump detailed erase counters
   - Fixes around wear-leveling

  UBIFS:
   - Minor cleanups
   - Fix for TNC dumping code"

* tag 'ubifs-for-linus-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubi: ubi_get_ec_info: Fix compiling error 'cast specifies array type'
  ubi: Implement ioctl for detailed erase counters
  ubi: Expose interface for detailed erase counters
  ubifs: skip dumping tnc tree when zroot is null
  ubi: Revert "ubi: wl: Close down wear-leveling before nand is suspended"
  ubifs: ubifs_dump_leb: remove return from end of void function
  ubifs: dump_lpt_leb: remove return at end of void function
  ubi: Add a check for ubi_num
2025-01-30 18:27:02 -08:00
Zhihao Cheng
69146a8c89 ubi: ubi_get_ec_info: Fix compiling error 'cast specifies array type'
On the RISC-V platform, untagged_addr() casts the return value of
__untagged_addr_remote() (an unsigned long) back to the type of its
argument. The compiler complains when the parameter 'addr' is an
array type:
  arch/riscv/include/asm/uaccess.h:33:9: error: cast specifies array type
  (__force  __typeof__(addr))__untagged_addr_remote(current->mm, __addr)

Fix it by converting the input parameter to a pointer.
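
A minimal userspace sketch of the same compiler rule, assuming GCC or
Clang (for __typeof__). The names untag_raw(), untag() and struct counters
are hypothetical stand-ins, not the kernel's definitions:

```c
#include <stdint.h>

/* Hypothetical stand-in for an address-untagging helper: it returns the
 * address unchanged, as an integer. */
static uintptr_t untag_raw(uintptr_t addr)
{
	return addr;
}

/* Cast the result back to the type of the argument, the same shape as the
 * macro quoted in the error message. If 'addr' is an array, __typeof__(addr)
 * is an array type and the outer cast does not compile. */
#define untag(addr) ((__typeof__(addr))untag_raw((uintptr_t)(addr)))

struct counters {
	int ec[4];
};

/* Handing the macro a pointer to the first element gives it a pointer
 * type to cast to. */
static int *untag_counters(struct counters *c)
{
	int *first = &c->ec[0];	/* untag(c->ec) would fail to build */

	return untag(first);
}
```

The array version is left as a comment on purpose, since it would stop
the example from compiling.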

Fixes: 01099f635a ("ubi: Implement ioctl for detailed erase counters")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501191405.WYnmdL0U-lkp@intel.com/
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-20 09:09:24 +01:00
Rickard Andersson
01099f635a ubi: Implement ioctl for detailed erase counters
Currently, "max_ec" can be read from sysfs, which provides a limited
view of the flash device’s wear. In certain cases, such as bugs in
the wear-leveling algorithm, specific blocks can be worn down more
than others, resulting in uneven wear distribution. Also some use cases
can wear the erase blocks of the fastmap area more heavily than other
parts of flash.
Providing detailed erase counter values gives a better understanding of
the overall flash wear and is needed to calculate, for example, the
expected lifetime.
More detailed information exists in debugfs, but it is only available
in debug builds.
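
As a rough illustration of what per-block counters allow, the sketch
below summarizes an array of erase counters. It does not use the new
ioctl; struct ec_summary, summarize_ec() and remaining_cycles() are
hypothetical names, and the rated endurance is a value the caller would
have to take from the flash datasheet:

```c
#include <stddef.h>
#include <stdint.h>

struct ec_summary {
	uint32_t min;
	uint32_t max;
	uint64_t mean;	/* integer mean, rounded down */
};

/* Summarize n erase counters; returns all zeroes for an empty array. */
static struct ec_summary summarize_ec(const uint32_t *ec, size_t n)
{
	struct ec_summary s = { 0, 0, 0 };
	uint64_t sum = 0;
	size_t i;

	if (n == 0)
		return s;

	s.min = s.max = ec[0];
	for (i = 0; i < n; i++) {
		if (ec[i] < s.min)
			s.min = ec[i];
		if (ec[i] > s.max)
			s.max = ec[i];
		sum += ec[i];
	}
	s.mean = sum / n;
	return s;
}

/* Erase cycles left for the most worn block, given a rated endurance. */
static uint32_t remaining_cycles(const struct ec_summary *s, uint32_t rated)
{
	return s->max >= rated ? 0 : rated - s->max;
}
```

A large gap between max and mean is the kind of uneven wear that a
single "max_ec" value cannot show.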

Signed-off-by: Rickard Andersson <rickard.andersson@axis.com>
Tested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18 15:32:52 +01:00
Zhihao Cheng
844c6fdc13 ubi: Revert "ubi: wl: Close down wear-leveling before nand is suspended"
Commit 5580cdae05 ("ubi: wl: Close down wear-leveling before nand is
suspended") added a reboot notification in the UBI layer to shut down the
wear-leveling subsystem, which introduced a UAF (use-after-free) problem[1].
Besides that, the method also brings other potential UAF problems, for example:
       reboot             kworker
 ubi_wl_reboot_notifier
  ubi_wl_close
   ubi_fastmap_close
    kfree(ubi->fm)
                     update_fastmap_work_fn
                      ubi_update_fastmap
                       old_fm = ubi->fm
                       if (old_fm && old_fm->e[i]) // UAF!

Actually, the problem fixed by commit 5580cdae05 ("ubi: wl: Close down
wear-leveling before nand is suspended") has been solved by commit
8cba323437 ("mtd: rawnand: protect access to rawnand devices while in
suspend"), which was discussed in [2]. So we can revert the commit
5580cdae05 ("ubi: wl: Close down wear-leveling before nand is
suspended") directly.

[1] https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/
[2] https://lore.kernel.org/all/9bf76f5d-12a4-46ff-90d4-4a7f0f47c381@axis.com/

Fixes: 5580cdae05 ("ubi: wl: Close down wear-leveling before nand is suspended")
Reported-by: Dennis Lam <dennis.lamerice@gmail.com>
Closes: https://lore.kernel.org/linux-mtd/20241208175211.9406-2-dennis.lamerice@gmail.com/
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Acked-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18 15:28:19 +01:00
Denis Arefev
97bbf9e312 ubi: Add a check for ubi_num
Add a check that rejects negative values of ubi_num.
If the variable ubi_num takes a negative value, we get:

qemu-system-arm ... -append "ubi.mtd=0,0,0,-22222345" ...
[    0.745065]  ubi_attach_mtd_dev from ubi_init+0x178/0x218
[    0.745230]  ubi_init from do_one_initcall+0x70/0x1ac
[    0.745344]  do_one_initcall from kernel_init_freeable+0x198/0x224
[    0.745474]  kernel_init_freeable from kernel_init+0x18/0x134
[    0.745600]  kernel_init from ret_from_fork+0x14/0x28
[    0.745727] Exception stack(0x90015fb0 to 0x90015ff8)
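
A userspace sketch of this kind of validation, under the assumption that
the value arrives as a decimal string. parse_ubi_num() is a hypothetical
helper that uses strtol(); it is not the kernel's parsing code:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a device number from a string. Returns 0 and stores the value in
 * *out on success, or -EINVAL for malformed, negative or oversized input.
 * *out is left untouched on failure. */
static int parse_ubi_num(const char *s, int *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0')
		return -EINVAL;
	if (v < 0 || v > INT_MAX)
		return -EINVAL;	/* the negative case is the one reported above */

	*out = (int)v;
	return 0;
}
```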

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 83ff59a066 ("UBI: support ubi_num on mtd.ubi command line")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18 15:22:04 +01:00
Christoph Hellwig
cc76ace465 block: remove BLK_MQ_F_SHOULD_MERGE
BLK_MQ_F_SHOULD_MERGE is set for all tag_sets except those that purely
process passthrough commands (bsg-lib, ufs tmf, various nvme admin
queues) and thus don't even check the flag.  Remove it to simplify the
driver interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23 08:17:23 -07:00
Colin Ian King
67efb77cb0 mtd: ubi: remove redundant check on bytes_left at end of function
In function ubi_nvmem_reg_read(), the while loop can only exit
if bytes_left is zero or an error has occurred. There is an exit
return path if an error occurs, so bytes_left can only be
zero after that point. Hence the check for a non-zero bytes_left
at the end of the function is redundant and can be removed. Remove
the check and just return 0.
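
The argument can be sketched with a generic chunked read loop. This is
not the body of ubi_nvmem_reg_read(); read_all(), read_chunk_fn and the
demo callback are hypothetical names used only to show the control flow:

```c
#include <stddef.h>

/* Hypothetical chunk reader: returns 0 on success, negative on error. */
typedef int (*read_chunk_fn)(size_t offset, size_t len, void *ctx);

/* Demo callback state: fails on call number 'fail_at' (0 = never fail). */
struct reader_state {
	int calls;
	int fail_at;
};

static int demo_read_chunk(size_t offset, size_t len, void *ctx)
{
	struct reader_state *st = ctx;

	(void)offset;
	(void)len;
	st->calls++;
	return st->calls == st->fail_at ? -5 : 0;
}

static int read_all(size_t bytes, size_t chunk, read_chunk_fn rd, void *ctx)
{
	size_t offset = 0;
	size_t bytes_left = bytes;

	if (chunk == 0)
		return -22;	/* a zero chunk size would never make progress */

	while (bytes_left) {
		size_t n = bytes_left < chunk ? bytes_left : chunk;
		int err = rd(offset, n, ctx);

		if (err)
			return err;	/* the only exit with bytes_left != 0 */

		offset += n;
		bytes_left -= n;
	}

	/* The loop condition guarantees bytes_left == 0 here, so a further
	 * check on it can never fire. */
	return 0;
}
```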

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 19:54:27 +01:00
Javier Carrasco
07593293ff mtd: ubi: fix unreleased fwnode_handle in find_volume_fwnode()
The 'fw_vols' fwnode_handle initialized via
device_get_named_child_node() requires explicit calls to
fwnode_handle_put() when the variable is no longer required.

Add the missing calls to fwnode_handle_put() before the function
returns.
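
The get/put balance can be modelled in userspace with a plain reference
counter. struct node, node_get(), node_put() and find_volume() below are
hypothetical stand-ins for the fwnode API and are only meant to show a
put on every return path:

```c
#include <stddef.h>

struct node {
	int refs;	/* outstanding references */
	int has_match;	/* whether a matching child exists */
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refs++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refs--;
}

/* Returns 1 if a match exists, 0 otherwise. Every path that took a
 * reference drops it again before returning. */
static int find_volume(struct node *parent)
{
	struct node *vols = node_get(parent);

	if (!vols)
		return 0;	/* nothing was acquired */

	if (!vols->has_match) {
		node_put(vols);	/* early return still releases */
		return 0;
	}

	node_put(vols);
	return 1;
}
```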

Cc: stable@vger.kernel.org
Fixes: 51932f9fc4 ("mtd: ubi: populate ubi volume fwnode")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 19:52:14 +01:00
Zhihao Cheng
bcddf52b7a ubi: fastmap: Fix duplicate slab cache names while attaching
Since commit 4c39529663 ("slab: Warn on duplicate cache names when
DEBUG_VM=y"), duplicate slab cache names are detected and a kernel
WARNING is emitted.
In the UBI fast attach process, alloc_ai() can be invoked twice with
the same slab cache name 'ubi_aeb_slab_cache', which triggers the
following warning messages:
 kmem_cache of name 'ubi_aeb_slab_cache' already exists
 WARNING: CPU: 0 PID: 7519 at mm/slab_common.c:107
          __kmem_cache_create_args+0x100/0x5f0
 Modules linked in: ubi(+) nandsim [last unloaded: nandsim]
 CPU: 0 UID: 0 PID: 7519 Comm: modprobe Tainted: G 6.12.0-rc2
 RIP: 0010:__kmem_cache_create_args+0x100/0x5f0
 Call Trace:
   __kmem_cache_create_args+0x100/0x5f0
   alloc_ai+0x295/0x3f0 [ubi]
   ubi_attach+0x3c3/0xcc0 [ubi]
   ubi_attach_mtd_dev+0x17cf/0x3fa0 [ubi]
   ubi_init+0x3fb/0x800 [ubi]
   do_init_module+0x265/0x7d0
   __x64_sys_finit_module+0x7a/0xc0

The problem can be easily reproduced by attaching a UBI device via fastmap
with CONFIG_DEBUG_VM=y.
Fix it by using different slab names for alloc_ai() callers.
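
A toy model of the name clash, assuming a registry that refuses duplicate
names at the point where the kernel would warn. struct cache_registry and
cache_create() are hypothetical, and the second cache name in the usage
note is made up for illustration:

```c
#include <string.h>

#define MAX_CACHES 8

struct cache_registry {
	const char *names[MAX_CACHES];
	int count;
};

/* Register a cache name. Returns 0 on success, -1 if the name is already
 * registered or the registry is full. */
static int cache_create(struct cache_registry *reg, const char *name)
{
	int i;

	for (i = 0; i < reg->count; i++) {
		if (strcmp(reg->names[i], name) == 0)
			return -1;	/* duplicate: the kernel would warn here */
	}
	if (reg->count == MAX_CACHES)
		return -1;

	reg->names[reg->count++] = name;
	return 0;
}
```

Calling cache_create() twice with "ubi_aeb_slab_cache" fails the second
time, while a second caller using its own name (for example a
hypothetical "ubi_aeb_slab_cache_fastmap") succeeds.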

Fixes: d2158f69a7 ("UBI: Remove alloc_ai() slab name from parameter list")
Fixes: fdf10ed710 ("ubi: Rework Fastmap attach base code")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 19:45:28 +01:00
Mårten Lindahl
5580cdae05 ubi: wl: Close down wear-leveling before nand is suspended
If a reboot/shutdown signal with double force (-ff) is triggered when
the erase worker or wear-leveling worker function runs we may end up in
a race condition since the MTD device gets a reboot notification and
suspends the nand flash before the erase or wear-leveling is done. This
will reject all accesses to the flash with -EBUSY.

Sequence for the erase worker function:

   systemctl reboot -ff           ubi_thread

                                do_work
 __do_sys_reboot
   blocking_notifier_call_chain
     mtd_reboot_notifier
       nand_shutdown
         nand_suspend
                                  __erase_worker
                                    ubi_sync_erase
                                      mtd_erase
                                        nand_erase_nand

                                          # Blocked by suspended chip
                                          nand_get_device
                                            => EBUSY

Similar sequence for the wear-leveling function:

   systemctl reboot -ff           ubi_thread

                                do_work
 __do_sys_reboot
   blocking_notifier_call_chain
     mtd_reboot_notifier
       nand_shutdown
         nand_suspend
                                  wear_leveling_worker
                                    ubi_eba_copy_leb
                                      ubi_io_write
                                        mtd_write
                                          nand_write_oob

                                            # Blocked by suspended chip
                                            nand_get_device
                                              => EBUSY

 systemd-shutdown[1]: Rebooting.
 ubi0 error: ubi_io_write: error -16 while writing 2048 bytes to PEB
 CPU: 1 PID: 82 Comm: ubi_bgt0d Kdump: loaded Tainted: G           O
 (unwind_backtrace) from [<80107b9f>] (show_stack+0xb/0xc)
 (show_stack) from [<8033641f>] (dump_stack_lvl+0x2b/0x34)
 (dump_stack_lvl) from [<803b7f3f>] (ubi_io_write+0x3ab/0x4a8)
 (ubi_io_write) from [<803b817d>] (ubi_io_write_vid_hdr+0x71/0xb4)
 (ubi_io_write_vid_hdr) from [<803b6971>] (ubi_eba_copy_leb+0x195/0x2f0)
 (ubi_eba_copy_leb) from [<803b939b>] (wear_leveling_worker+0x2ff/0x738)
 (wear_leveling_worker) from [<803b86ef>] (do_work+0x5b/0xb0)
 (do_work) from [<803b9ee1>] (ubi_thread+0xb1/0x11c)
 (ubi_thread) from [<8012c113>] (kthread+0x11b/0x134)
 (kthread) from [<80100139>] (ret_from_fork+0x11/0x38)
 Exception stack(0x80c43fb0 to 0x80c43ff8)
 ...
 ubi0 error: ubi_dump_flash: err -16 while reading 2048 bytes from PEB
 ubi0 error: wear_leveling_worker: error -16 while moving PEB 246 to PEB
 ubi0 warning: ubi_ro_mode.part.0: switch to read-only mode
 ...
 ubi0 error: do_work: work failed with error code -16
 ubi0 error: ubi_thread: ubi_bgt0d: work failed with error code -16
 ...
 Kernel panic - not syncing: Software Watchdog Timer expired

Add a reboot notification for the ubi/wear-leveling to shut down any
potential flash work actions before the nand is suspended.

Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 18:46:04 +01:00
Zhang Zekun
cb33ade753 mtd: ubi: Remove unused declaration in header file
The definition of ubi_destroy_ai() has been removed since
commit dac6e2087a ("UBI: Add fastmap stuff to attach.c").
Remove the leftover declaration from the header file.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 18:02:11 +01:00
Zhihao Cheng
c4595fe394 ubi: fastmap: wl: Schedule fm_work if wear-leveling pool is empty
Since commit 14072ee33d ("ubi: fastmap: Check wl_pool for free peb
before wear leveling"), wear_leveling_worker() won't schedule fm_work
if the wear-leveling pool is empty, which could temporarily disable
wear-leveling until the fastmap is updated (e.g. the pool becomes empty).
Fix it by scheduling fm_work if wl_pool is empty during wear-leveling.

Fixes: 14072ee33d ("ubi: fastmap: Check wl_pool for free peb before wear leveling")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 17:48:28 +01:00
Zhihao Cheng
d610020f03 ubi: wl: Put source PEB into correct list if trying locking LEB failed
During wear-leveling work, the source PEB is moved into the scrub list
when the source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong
for a non-scrub type source PEB. The problem can cause extra, ineffective
wear-leveling jobs, which have a more or less negative effect on the
lifetime of the flash. Specifically, the process is divided into 2 steps:
1. wear_leveling_worker // generate false scrub type PEB
     ubi_eba_copy_leb // MOVE_RETRY is returned
       leb_write_trylock // trylock failed
     scrubbing = 1;
     e1 is put into ubi->scrub
2. wear_leveling_worker // schedule false scrub type PEB for wl
     scrubbing = 1
     e1 = rb_entry(rb_first(&ubi->scrub))

The problem can be reproduced easily by running fsstress on a small
UBIFS partition (<64M, simulated by nandsim) for 5~10 minutes
(CONFIG_MTD_UBI_FASTMAP=y, CONFIG_MTD_UBI_WL_THRESHOLD=50). The following
message is shown:
 ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165

A scrub type source PEB already has the variable scrubbing set to '1',
and scrubbing is checked before the variable keep, so the problem can
be fixed by setting keep to 1 directly if the source LEB cannot
be locked.
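
The ordering argument can be written down as a small decision function.
This is a simplified model and not the kernel code: the enum, the
function names and the choice of the "used" list as the destination for
kept PEBs are assumptions made for illustration:

```c
enum peb_list {
	LIST_SCRUB,	/* scheduled for scrubbing */
	LIST_USED,	/* kept as a normal used PEB (assumed destination) */
	LIST_ERASE	/* data was moved, PEB goes to erase */
};

/* Where the source PEB goes when the move did not happen. 'scrubbing' is
 * tested before 'keep', mirroring the order described above. */
static enum peb_list requeue_source(int scrubbing, int keep)
{
	if (scrubbing)
		return LIST_SCRUB;
	if (keep)
		return LIST_USED;
	return LIST_ERASE;
}

/* Trylock failure handling as described in the fix: only 'keep' is set,
 * so 'scrubbing' retains whatever the PEB's real type was. */
static void on_trylock_failed(int *scrubbing, int *keep)
{
	(void)scrubbing;	/* deliberately left unchanged */
	*keep = 1;
}
```

With this shape, a PEB that was already a scrub candidate still lands on
the scrub list, while an ordinary PEB is no longer mislabelled.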

Fixes: e801e128b2 ("UBI: fix missing scrub when there is a bit-flip")
CC: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14 17:41:30 +01:00
Al Viro
cb787f4ac0 [tree-wide] finally take no_llseek out
no_llseek had been defined to NULL two years ago, in commit 868941b144
("fs: remove no_llseek")

To quote that commit,

  At -rc1 we'll need do a mechanical removal of no_llseek -

  git grep -l -w no_llseek | grep -v porting.rst | while read i; do
	sed -i '/\<no_llseek\>/d' $i
  done

  would do it.

Unfortunately, that hadn't been done.  Linus, could you do that now, so
that we could finally put that thing to rest? All instances are of the
form
	.llseek = no_llseek,
so it's obviously safe.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-27 08:18:43 -07:00
Richard Weinberger
92a286e902 ubi: Fix ubi_init() ubiblock_exit() section mismatch
Since ubiblock_exit() is now called from an init function,
the __exit section no longer makes sense.

Cc: Ben Hutchings <bwh@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407131403.wZJpd8n2-lkp@intel.com/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-07-28 20:08:25 +02:00
Li Nan
4f9d406c8c ubi: block: fix null-pointer-dereference in ubiblock_create()
Similar to commit adbf4c4954 ("ubi: block: fix memleak in
ubiblock_create()"), 'dev->gd' is not assigned but is dereferenced if
blk_mq_alloc_tag_set() fails, leading to a null-pointer-dereference.
Fix it by using pr_err() and variable 'dev' to print the error log.

Additionally, the log in the error handling path of idr_alloc() has
been improved by using pr_err(), too. Before the device name is
initialized, dev_err() prints the error log with 'null' instead of
the actual device name, like this:
  block (null): ...
        ~~~~~~
This is unclear. Using pr_err() can print more details of the device.
The improved log is:
  ubiblock0_0: ...

Fixes: 77567b25ab ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 21:56:23 +02:00
ZhaoLong Wang
7037c96d8c ubifs: correct UBIFS_DFS_DIR_LEN macro definition and improve code clarity
The UBIFS_DFS_DIR_LEN macro, which defines the maximum length of the UBIFS
debugfs directory name, has an incorrect formula and misleading comments.
The current formula is (3 + 1 + 2*2 + 1), which assumes that both UBI device
number and volume ID are limited to 2 characters. However, UBI device number
ranges from 0 to 31 (2 characters), and volume ID ranges from 0 to 127 (up
to 3 characters).

Although the current code works due to the cancellation of mathematical
errors (9 + 1 = 10, which matches the correct UBIFS_DFS_DIR_LEN value), it
can lead to confusion and potential issues in the future.

This patch aims to improve the code clarity and maintainability by making
the following changes:

1. Corrects the UBIFS_DFS_DIR_LEN macro definition to (3 + 1 + 2 + 3 + 1),
   accommodating the maximum lengths of both UBI device number and volume ID,
   plus the separators and null terminator.
2. Updates the snprintf calls to use UBIFS_DFS_DIR_LEN instead of
   UBIFS_DFS_DIR_LEN + 1, removing the unnecessary +1.
3. Modifies the error checks to compare against UBIFS_DFS_DIR_LEN using >=
   instead of >, aligning with the corrected macro definition.
4. Removes the redundant +1 in the dfs_dir_name array definitions in ubi.h
   and debug.h.
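
The arithmetic in points 1 to 3 can be checked with a short userspace
sketch. The "ubi%d_%d" format string is an assumption about what
UBIFS_DFS_DIR_NAME expands to, and the DFS_* names and format_dir_name()
are local to this example:

```c
#include <stdio.h>

#define DFS_DIR_NAME "ubi%d_%d"	/* assumed directory name format */

/* "ubi" (3) + "_" (1) + device number (2) + volume ID (3) + NUL (1) */
#define DFS_DIR_LEN (3 + 1 + 2 + 3 + 1)

/* Format the name into buf, which must hold DFS_DIR_LEN bytes. Returns the
 * string length on success or -1 if the name would not fit. snprintf()
 * returns the untruncated length, so n >= DFS_DIR_LEN means truncation. */
static int format_dir_name(char *buf, int ubi_num, int vol_id)
{
	int n = snprintf(buf, DFS_DIR_LEN, DFS_DIR_NAME, ubi_num, vol_id);

	if (n < 0 || n >= DFS_DIR_LEN)
		return -1;
	return n;
}
```

The longest valid name, "ubi31_127", is 9 characters and fits exactly
with its terminator in the 10-byte buffer.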

While these changes do not affect the runtime behavior, they make the code
more readable, maintainable, and less prone to future errors.

v2->v3:

 - Removes the duplicated UBIFS_DFS_DIR_LEN and UBIFS_DFS_DIR_NAME macro
   definitions in ubifs.h, as they are already defined in debug.h.

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 21:52:24 +02:00
Ben Hutchings
72f3d3dadd mtd: ubi: Restore missing cleanup on ubi_init() failure path
We need to clean-up debugfs and ubiblock if we fail after initialising
them.

Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Fixes: 927c145208 ("mtd: ubi: attach from device tree")
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 21:43:09 +02:00
Arnd Bergmann
02096a0cf1 mtd: ubi: avoid expensive do_div() on 32-bit machines
The use of do_div() in ubi_nvmem_reg_read() makes calling it on
32-bit machines rather expensive. Since the 'from' variable is
known to be a 32-bit quantity, it is clearly never needed and
can be optimized into a regular division operation.
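
A sketch of the plain 32-bit form, assuming the offset is split into a
LEB number and an offset within that LEB. struct leb_pos, split_offset()
and the parameter names are hypothetical:

```c
#include <stdint.h>

struct leb_pos {
	uint32_t lnum;	/* logical eraseblock index */
	uint32_t offs;	/* byte offset inside that eraseblock */
};

/* Both operands are 32 bits wide, so ordinary '/' and '%' compile to
 * native division without a 64-bit helper. leb_size must be non-zero. */
static struct leb_pos split_offset(uint32_t from, uint32_t leb_size)
{
	struct leb_pos p;

	p.lnum = from / leb_size;
	p.offs = from % leb_size;
	return p;
}
```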

Fixes: b8a77b9a5f ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
Fixes: 3ce485803d ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 19:46:21 +02:00
Ricardo B. Marliere
299af26eb4 mtd: ubi: make ubi_class constant
Since commit 43a7206b09 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the ubi_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 19:45:09 +02:00
Fedor Pchelkin
745d9f4a31 ubi: eba: properly rollback inside self_check_eba
In case of a memory allocation failure in the volumes loop we can only
process the already allocated scan_eba and fm_eba array elements on the
error path - others are still uninitialized.
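
The rollback pattern looks roughly like this in isolation. alloc_tables()
is a hypothetical helper, and the fail_at and freed parameters exist only
to simulate a failure and to make the cleanup observable:

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate n entries into tbl. 'fail_at' simulates an allocation failure
 * at that index (pass n for no simulated failure). On failure, only the
 * entries allocated so far are freed and *freed counts them. */
static int alloc_tables(int **tbl, size_t n, size_t fail_at, size_t *freed)
{
	size_t i;

	for (i = 0; i < n; i++) {
		tbl[i] = (i == fail_at) ? NULL : malloc(sizeof(int));
		if (!tbl[i])
			goto rollback;
	}
	return 0;

rollback:
	/* Entries [0, i) are valid; slots from i onwards were never set and
	 * must not be touched. */
	while (i > 0) {
		free(tbl[--i]);
		++*freed;
	}
	return -1;
}
```

On success the caller owns all n entries and is responsible for freeing
them.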

Found by Linux Verification Center (linuxtesting.org).

Fixes: 00abf30415 ("UBI: Add self_check_eba()")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-07-12 19:42:46 +02:00
Linus Torvalds
85a79128c4 This pull request contains updates for UBI and UBIFS:
UBI:
         - Add Zhihao Cheng as reviewer
 	- Attach via device tree
 	- Add NVMEM layer
 	- Various fastmap related fixes
 
 UBIFS:
         - Add Zhihao Cheng as reviewer
 	- Convert to folios
 	- Various fixes (memory leaks in error paths, function prototypes)

Merge tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBI and UBIFS updates from Richard Weinberger:
 "UBI:
   - Add Zhihao Cheng as reviewer
   - Attach via device tree
   - Add NVMEM layer
   - Various fastmap related fixes

  UBIFS:
   - Add Zhihao Cheng as reviewer
   - Convert to folios
   - Various fixes (memory leaks in error paths, function prototypes)"

* tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits)
  mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems
  mtd: ubi: provide NVMEM layer over UBI volumes
  mtd: ubi: populate ubi volume fwnode
  mtd: ubi: introduce pre-removal notification for UBI volumes
  mtd: ubi: attach from device tree
  mtd: ubi: block: use notifier to create ubiblock from parameter
  dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM
  dt-bindings: mtd: add basic bindings for UBI
  ubifs: Queue up space reservation tasks if retrying many times
  ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
  ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed
  ubi: Correct the number of PEBs after a volume resize failure
  ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130
  ubi: correct the calculation of fastmap size
  ubifs: Remove unreachable code in dbg_check_ltab_lnum
  ubifs: fix function pointer cast warnings
  ubifs: fix sort function prototype
  ubi: Check for too small LEB size in VTBL code
  MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer
  ubifs: Convert populate_page() to take a folio
  ...
2024-03-21 15:09:29 -07:00
Daniel Golle
b8a77b9a5f mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems
A compiler warning related to sizeof(int) != 8 when calling do_div()
is triggered when building on 32-bit platforms.
Address this by using integer types having a well-defined size.

Fixes: 3ce485803d ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-03-10 22:14:28 +01:00
Daniel Golle
3ce485803d mtd: ubi: provide NVMEM layer over UBI volumes
In an ideal world we would like UBI to be used wherever possible on a
NAND chip. And with UBI support in ARM Trusted Firmware and U-Boot it
is possible to achieve an (almost-)all-UBI flash layout. Hence the need
for a way to also use UBI volumes to store board-level constants, such
as MAC addresses and calibration data of wireless interfaces.

Add UBI volume NVMEM driver module exposing UBI volumes as NVMEM
providers. Allow UBI devices to have a "volumes" firmware subnode with
volumes which may be compatible with "nvmem-cells".
Access to UBI volumes via the NVMEM interface at this point is
read-only, and it is slow, opening and closing the UBI volume for each
access due to limitations of the NVMEM provider API.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 22:42:23 +01:00
Daniel Golle
51932f9fc4 mtd: ubi: populate ubi volume fwnode
Look for the 'volumes' subnode of an MTD partition attached to a UBI
device and attach matching child nodes to UBI volumes.
This allows UBI volumes to be referenced in device tree, e.g. for use
as NVMEM providers.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 22:41:33 +01:00
Daniel Golle
7e84c961b2 mtd: ubi: introduce pre-removal notification for UBI volumes
Introduce a new notification type UBI_VOLUME_SHUTDOWN to inform users
that a volume is just about to be removed.
This is needed because users (such as the NVMEM subsystem) expect that
at the time their removal function is called, the parenting device is
still available (for removal of sysfs nodes, for example, in case of
NVMEM which otherwise WARNs on volume removal).

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 22:41:33 +01:00
Daniel Golle
927c145208 mtd: ubi: attach from device tree
Introduce device tree compatible 'linux,ubi' and attach compatible MTD
devices using the MTD add notifier. This is needed for a UBI device to
be available early at boot (and not only after late_initcall), so
volumes on them can be used e.g. as NVMEM providers for other drivers.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 22:41:33 +01:00
Daniel Golle
762d73cd93 mtd: ubi: block: use notifier to create ubiblock from parameter
Use the UBI_VOLUME_ADDED notification to create ubiblock devices specified
on the kernel cmdline or as a module parameter.
This makes things simpler and has the advantage that ubiblock devices
on volumes which are not present at the time the ubi module is probed
will still be created.

Suggested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 22:41:32 +01:00
ZhaoLong Wang
9277b3a649 ubi: Correct the number of PEBs after a volume resize failure
In the error handling path `out_acc` of `ubi_resize_volume()`,
when `pebs < 0`, it indicates that the volume table record failed to
update when the volume was shrunk. In this case, the number of `ubi->avail_pebs`
and `ubi->rsvd_pebs` should be restored to their previous values to prevent
the UBI layer from reporting an incorrect number of available PEBs.
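
A simplified model of the accounting, assuming the two counters move in
opposite directions by the same amount. struct pools, account() and
resize() are hypothetical names, and the real function does considerably
more work between the two steps:

```c
struct pools {
	int avail_pebs;	/* PEBs free for new reservations */
	int rsvd_pebs;	/* PEBs reserved by volumes */
};

/* Move 'pebs' PEBs from available to reserved. A negative value models a
 * shrink and moves them back. */
static void account(struct pools *p, int pebs)
{
	p->avail_pebs -= pebs;
	p->rsvd_pebs += pebs;
}

/* Apply the resize, then undo the accounting if the (simulated) volume
 * table update fails, for growth and shrinkage alike. */
static int resize(struct pools *p, int pebs, int update_fails)
{
	account(p, pebs);
	if (update_fails) {
		account(p, -pebs);	/* restore the previous values */
		return -1;
	}
	return 0;
}
```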

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 21:39:08 +01:00
Guo Xuenan
fbed4baed0 ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130
When using the ioctl interface to resize a UBI volume, `ubi_resize_volume`
resizes the EBA table first but does not change `vol->reserved_pebs` in
the same atomic context, which may cause concurrent access to the EBA table.

For example, a user shrinks UBI volume A by calling `ubi_resize_volume`
while another thread is writing to volume B and triggering wear-leveling,
which may call `ubi_write_fastmap`. Under these circumstances, KASAN may
report a slab-out-of-bounds error in `ubi_eba_get_ldesc+0xfb/0x130`.

This patch fixes race conditions in `ubi_resize_volume` and
`ubi_update_fastmap` to avoid out-of-bounds reads of `eba_tbl`. First,
it ensures that updates to `eba_tbl` and `reserved_pebs` are protected
by `vol->volumes_lock`. Second, it implements a rollback mechanism in case
of resize failure. It is also worth mentioning that for volume shrinkage
failures, since part of the volume has already been shrunk and unmapped,
there is no need to recover `{rsvd/avail}_pebs`.

==================================================================
BUG: KASAN: slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 [ubi]
Read of size 4 at addr ffff88800f43f570 by task kworker/u16:0/7
CPU: 0 PID: 7 Comm: kworker/u16:0 Not tainted 5.16.0-rc7 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_address_description.constprop.0+0x41/0x60
 kasan_report.cold+0x83/0xdf
 ubi_eba_get_ldesc+0xfb/0x130 [ubi]
 ubi_update_fastmap.cold+0x60f/0xc7d [ubi]
 ubi_wl_get_peb+0x25b/0x4f0 [ubi]
 try_write_vid_and_data+0x9a/0x4d0 [ubi]
 ubi_eba_write_leb+0x7e4/0x17d0 [ubi]
 ubi_leb_map+0x1a0/0x2c0 [ubi]
 ubifs_leb_map+0x139/0x270 [ubifs]
 ubifs_add_bud_to_log+0xb40/0xf30 [ubifs]
 make_reservation+0x86e/0xb00 [ubifs]
 ubifs_jnl_write_data+0x430/0x9d0 [ubifs]
 do_writepage+0x1d1/0x550 [ubifs]
 ubifs_writepage+0x37c/0x670 [ubifs]
 __writepage+0x67/0x170
 write_cache_pages+0x259/0xa90
 do_writepages+0x277/0x5d0
 __writeback_single_inode+0xb8/0x850
 writeback_sb_inodes+0x4b3/0xb20
 __writeback_inodes_wb+0xc1/0x220
 wb_writeback+0x59f/0x740
 wb_workfn+0x6d0/0xca0
 process_one_work+0x711/0xfc0
 worker_thread+0x95/0xd00
 kthread+0x3a6/0x490
 ret_from_fork+0x1f/0x30
 </TASK>

Allocated by task 711:
 kasan_save_stack+0x1e/0x50
 __kasan_kmalloc+0x81/0xa0
 ubi_eba_create_table+0x88/0x1a0 [ubi]
 ubi_resize_volume.cold+0x175/0xae7 [ubi]
 ubi_cdev_ioctl+0x57f/0x1a60 [ubi]
 __x64_sys_ioctl+0x13a/0x1c0
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Last potentially related work creation:
 kasan_save_stack+0x1e/0x50
 __kasan_record_aux_stack+0xb7/0xc0
 call_rcu+0xd6/0x1000
 blk_stat_free_callback+0x28/0x30
 blk_release_queue+0x8a/0x2e0
 kobject_put+0x186/0x4c0
 scsi_device_dev_release_usercontext+0x620/0xbd0
 execute_in_process_context+0x2f/0x120
 device_release+0xa4/0x240
 kobject_put+0x186/0x4c0
 put_device+0x20/0x30
 __scsi_remove_device+0x1c3/0x300
 scsi_probe_and_add_lun+0x2140/0x2eb0
 __scsi_scan_target+0x1f2/0xbb0
 scsi_scan_channel+0x11b/0x1a0
 scsi_scan_host_selected+0x24c/0x310
 do_scsi_scan_host+0x1e0/0x250
 do_scan_async+0x45/0x490
 async_run_entry_fn+0xa2/0x530
 process_one_work+0x711/0xfc0
 worker_thread+0x95/0xd00
 kthread+0x3a6/0x490
 ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff88800f43f500
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 112 bytes inside of
 128-byte region [ffff88800f43f500, ffff88800f43f580)
The buggy address belongs to the page:
page:ffffea00003d0f00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xf43c
head:ffffea00003d0f00 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 001fffff80010200 ffffea000046ba08 ffffea0000457208 ffff88810004d1c0
raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 ffff88800f43f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800f43f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88800f43f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
                                                             ^
 ffff88800f43f580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800f43f600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

The following steps can be used to reproduce:
Process 1: write and trigger ubi wear-leveling
    ubimkvol /dev/ubi0 -s 5000MiB -N v1
    ubimkvol /dev/ubi0 -s 2000MiB -N v2
    ubimkvol /dev/ubi0 -s 10MiB -N v3
    mount -t ubifs /dev/ubi0_0 /mnt/ubifs
    while true;
    do
        filename=/mnt/ubifs/$((RANDOM))
        dd if=/dev/random of=${filename} bs=1M count=$((RANDOM % 1000))
        rm -rf ${filename}
        sync /mnt/ubifs/
    done

Process 2: do random resize
    struct ubi_rsvol_req req;
    req.vol_id = 1;
    req.bytes = (rand() % 50) * 512 * 1024;  /* multiples of 512 KiB */
    ioctl(fd, UBI_IOCRSVOL, &req);

V3:
 - Fix the commit message error.

V2:
 - Add volumes_lock in ubi_eba_copy_leb() to avoid race caused by
   updating eba_tbl.

V1:
 - Rebase the patch on the latest mainline.

Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 21:38:41 +01:00
Zhang Yi
7f174ae4f3 ubi: correct the calculation of fastmap size
The calculation of the fastmap size in ubi_calc_fm_size() is incorrect
because it misses each user volume's ubi_fm_eba structure and the
internal UBI volume info. Correct the calculation.

Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-02-25 21:30:15 +01:00
Richard Weinberger
68a24aba7c ubi: Check for too small LEB size in VTBL code
If the LEB size is smaller than a volume table record we cannot
have volumes.
In this case abort attaching.

Cc: Chenyuan Yang <cy54@illinois.edu>
Cc: stable@vger.kernel.org
Fixes: 801c135ce7 ("UBI: Unsorted Block Images")
Reported-by: Chenyuan Yang <cy54@illinois.edu>
Closes: https://lore.kernel.org/linux-mtd/1433EB7A-FC89-47D6-8F47-23BE41B263B3@illinois.edu/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-02-25 21:19:42 +01:00
Christoph Hellwig
21b700c081 ubiblock: pass queue_limits to blk_mq_alloc_disk
Pass the few limits ubiblock imposes directly to blk_mq_alloc_disk
instead of setting them one at a time.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20240215070300.2200308-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-19 16:59:32 -07:00
Christoph Hellwig
27e32cd23f block: pass a queue_limits argument to blk_mq_alloc_disk
Pass a queue_limits to blk_mq_alloc_disk and apply it if non-NULL.  This
will allow allocating queues with valid queue limits instead of setting
the values one at a time later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20240213073425.1621680-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-13 08:56:59 -07:00
Li Nan
adbf4c4954 ubi: block: fix memleak in ubiblock_create()
If idr_alloc() fails, ubiblock_create() jumps to out_cleanup_disk and puts
dev->gd, but dev->gd has not been assigned yet at that point, so 'gd' is
never put. Fix it by putting 'gd' directly.

Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06 23:52:51 +01:00
ZhaoLong Wang
4d0deb380a ubi: Reserve sufficient buffer length for the input mask
Because the mask received by the emulate_failures interface
is a 32-bit unsigned integer, ensure that there is sufficient
buffer length to receive and display this value.

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06 23:45:44 +01:00
ZhaoLong Wang
7cd8d1f847 ubi: Add six fault injection type for testing
This commit adds six fault injection types for testing, to cover the
abnormal paths of the UBI driver.

Inject the following faults when the UBI reads the LEB:
 +----------------------------+-----------------------------------+
 |    Interface name          |       emulate behavior            |
 +----------------------------+-----------------------------------+
 |  emulate_eccerr            | ECC error                         |
 +----------------------------+-----------------------------------+
 |  emulate_read_failure      | read failure                      |
 +----------------------------+-----------------------------------+
 |  emulate_io_ff             | read content as all FF            |
 +----------------------------+-----------------------------------+
 |  emulate_io_ff_bitflips    | content FF with MTD err reported  |
 +----------------------------+-----------------------------------+
 |  emulate_bad_hdr           | bad leb header                    |
 +----------------------------+-----------------------------------+
 |  emulate_bad_hdr_ebadmsg   | bad header with ECC err           |
 +----------------------------+-----------------------------------+

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06 23:41:40 +01:00
ZhaoLong Wang
e30948f7c0 ubi: Split io_failures into write_failure and erase_failure
The emulate_io_failures debugfs entry controls both write
failure and erase failure. This patch splits io_failures
into write_failure and erase_failure.

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06 23:39:29 +01:00
ZhaoLong Wang
6931fb4485 ubi: Use the fault injection framework to enhance the fault injection capability
To make debug parameters configurable at run time, use the
fault injection framework to rebuild the debugfs interface,
while retaining the legacy fault injection interface.

Now, the emulate_failures file and the fault_attr files control
whether fault emulation is enabled.

The emulate_failures file receives a mask that controls the type and
process of fault injection. For ease of use, you can simply write a
mask with all bits set:

echo 0xffff > /sys/kernel/debug/ubi/ubi0/emulate_failures

You also need to configure the other fault-injection attributes for
testing purposes:

echo 100 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/probability
echo 15 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/space
echo 2 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/verbose
echo -1 > /sys/kernel/debug/ubi/fault_inject/emulate_power_cut/times

The Kconfig option CONFIG_MTD_UBI_FAULT_INJECTION is added to enable
fault injection.

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-06 23:38:55 +01:00
ZhaoLong Wang
d07cec9c23 ubi: block: Fix use-after-free in ubiblock_cleanup
The following BUG is reported when a ubiblock is removed:

 ==================================================================
 BUG: KASAN: slab-use-after-free in ubiblock_cleanup+0x88/0xa0 [ubi]
 Read of size 4 at addr ffff88810c8f3804 by task ubiblock/1716

 CPU: 5 PID: 1716 Comm: ubiblock Not tainted 6.6.0-rc2+ #135
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x37/0x50
  print_report+0xd0/0x620
  kasan_report+0xb6/0xf0
  ubiblock_cleanup+0x88/0xa0 [ubi]
  ubiblock_remove+0x121/0x190 [ubi]
  vol_cdev_ioctl+0x355/0x630 [ubi]
  __x64_sys_ioctl+0xc7/0x100
  do_syscall_64+0x3f/0x90
  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
 RIP: 0033:0x7f08d7445577
 Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 8
 RSP: 002b:00007ffde05a3018 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f08d7445577
 RDX: 0000000000000000 RSI: 0000000000004f08 RDI: 0000000000000003
 RBP: 0000000000816010 R08: 00000000008163a7 R09: 0000000000000000
 R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000003
 R13: 00007ffde05a3130 R14: 0000000000000000 R15: 0000000000000000
  </TASK>

 Allocated by task 1715:
  kasan_save_stack+0x22/0x50
  kasan_set_track+0x25/0x30
  __kasan_kmalloc+0x7f/0x90
  __alloc_disk_node+0x40/0x2b0
  __blk_mq_alloc_disk+0x3e/0xb0
  ubiblock_create+0x2ba/0x620 [ubi]
  vol_cdev_ioctl+0x581/0x630 [ubi]
  __x64_sys_ioctl+0xc7/0x100
  do_syscall_64+0x3f/0x90
  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

 Freed by task 0:
  kasan_save_stack+0x22/0x50
  kasan_set_track+0x25/0x30
  kasan_save_free_info+0x2b/0x50
  __kasan_slab_free+0x10e/0x190
  __kmem_cache_free+0x96/0x220
  bdev_free_inode+0xa4/0xf0
  rcu_core+0x496/0xec0
  __do_softirq+0xeb/0x384

 The buggy address belongs to the object at ffff88810c8f3800
  which belongs to the cache kmalloc-1k of size 1024
 The buggy address is located 4 bytes inside of
  freed 1024-byte region [ffff88810c8f3800, ffff88810c8f3c00)

 The buggy address belongs to the physical page:
 page:00000000d03de848 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10c8f0
 head:00000000d03de848 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 flags: 0x200000000000840(slab|head|node=0|zone=2)
 page_type: 0xffffffff()
 raw: 0200000000000840 ffff888100042dc0 ffffea0004244400 dead000000000002
 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff88810c8f3700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff88810c8f3780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff88810c8f3800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                    ^
  ffff88810c8f3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff88810c8f3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

Fix it by using a local variable to record the gendisk ID.

Fixes: 77567b25ab ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk")
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 23:18:39 +02:00
Zhihao Cheng
ac085cfe57 ubi: fastmap: Add control in 'UBI_IOCATT' ioctl to reserve PEBs for filling pools
This patch introduces a new field 'need_resv_pool' in struct 'ubi_attach_req'
to control whether free PEBs are reserved for filling pool/wl_pool.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 23:16:00 +02:00
Zhihao Cheng
d4c48e5b58 ubi: fastmap: Add module parameter to control reserving filling pool PEBs
Add a 6th module parameter in 'mtd=xxx' to control whether PEBs are
reserved for filling pool/wl_pool.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 23:15:44 +02:00
Zhihao Cheng
90e0be5614 ubi: fastmap: Fix lapsed wear leveling for first 64 PEBs
The anchor PEB must be picked from the first 64 PEBs, so these PEBs can
have much larger erase counters than other PEBs, especially when free
space is nearly exhausted.
ubi_update_fastmap is called whenever pool/wl_pool is empty, and the
old anchor PEB is erased when the fastmap is updated. Given a UBI device
with N PEBs whose free PEBs are nearly exhausted, the pool is filled with
only 1 PEB every time ubi_update_fastmap is invoked. So t=N/POOL_SIZE[1]/64
means that, in the worst case, the erase counter of the first 64 PEBs is
in theory t times greater than that of other PEBs.
After running fsstress for 24h, the erase counter statistics for two UBI
devices are as follows (CONFIG_MTD_UBI_WL_THRESHOLD=128):

Device A(1024 PEBs, pool=50, wl_pool=25):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:        0        0        0        0
1000     ..     9999:        0        0        0        0
10000    ..    99999:      960    29224    29282    29362
100000   ..      inf:       64   117897   117934   117940
---------------------------------------------------------
Total               :     1024    29224    34822   117940

Device B(8192 PEBs, pool=256, wl_pool=128):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:        0        0        0        0
1000     ..     9999:     8128     2253     2321     2387
10000    ..    99999:       64    35387    35387    35388
100000   ..      inf:        0        0        0        0
---------------------------------------------------------
Total               :     8192     2253     2579    35388

The key point is to reduce the fastmap update frequency by enlarging
POOL_SIZE, so let UBI reserve ubi->fm_pool.max_size PEBs during
attaching. Then POOL_SIZE becomes ubi->fm_pool.max_size/2 even
when free space is running out.
Given a UBI device with 8192 PEBs (16384/8192/4096 are common sizes for
large-capacity flash), t=8192/128/64=1. A fastmap update happens when
either wl_pool or pool is empty, so setting fm_pool_rsv_cnt to
ubi->fm_pool.max_size allows wl_pool to be filled completely.

After pool reservation, running fsstress for 24h:

Device A(1024 PEBs, pool=50, wl_pool=25):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:        0        0        0        0
1000     ..     9999:        0        0        0        0
10000    ..    99999:     1024    33801    33997    34056
100000   ..      inf:        0        0        0        0
---------------------------------------------------------
Total               :     1024    33801    33997    34056

Device B(8192 PEBs, pool=256, wl_pool=128):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:        0        0        0        0
1000     ..     9999:     8192     2205     2397     2460
10000    ..    99999:        0        0        0        0
100000   ..      inf:        0        0        0        0
---------------------------------------------------------
Total               :     8192     2205     2397     2460

The difference in erase counter between the first 64 PEBs and the others
is under WL_FREE_MAX_DIFF (2*UBI_WL_THRESHOLD=2*128=256).
  Device A: 34056 - 33801 = 255
  Device B: 2460 - 2205 = 255

The next patch will add a switch to control whether UBI needs to reserve
PEBs for filling the pool.

Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 23:14:55 +02:00
Zhihao Cheng
761893bd49 ubi: fastmap: Get wl PEB even ec beyonds the 'max' if free PEBs are run out
This is part 2 of the fix for cyclically reusing a single fastmap data PEB.

Consider a situation where there are four free PEBs for fm_anchor, pool,
wl_pool and fastmap data, with erase counters 100, 100, 100 and 5096
(ubi->beb_rsvd_pebs is 0). The PEB with erase counter 5096 is always picked
for fastmap data because of how find_wl_entry() is implemented. Since a
fastmap data PEB is not scheduled for wl, there end up being two PEBs
(fm data) with greater erase counters than the other PEBs.
Get the wl PEB even if its erase counter exceeds 'max' in find_wl_entry()
when free PEBs have run out after filling the pools and fm data. Then the PEB
with the biggest erase counter is taken as the wl PEB and can be scheduled for wl.

Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 23:07:54 +02:00
Zhihao Cheng
eada823e6a ubi: fastmap: may_reserve_for_fm: Don't reserve PEB if fm_anchor exists
This is part 1 of the fix for cyclically reusing a single fastmap data PEB.

After running fsstress on UBIFS for a while, UBI (16384 blocks, fastmap
takes 2 blocks) has one erase block (PEB: 8031) with an erase counter
far greater than that of any other PEB:

=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:      532       84       92       99
100      ..      999:    15787      100      147      229
1000     ..     9999:       64     4699     4765     4826
10000    ..    99999:        0        0        0        0
100000   ..      inf:        1   272935   272935   272935
---------------------------------------------------------
Total               :    16384       84      180   272935

Unlike fm_anchor, there are no candidate PEBs for the fastmap data area,
so the old fastmap data PEBs are reused after all free PEBs have been
filled into pool/wl_pool:
ubi_update_fastmap
 for (i = 1; i < new_fm->used_blocks; i++)
  erase_block(ubi, old_fm->e[i]->pnum)
  new_fm->e[i] = old_fm->e[i]

According to the wear-leveling algorithm, UBI selects one PEB with a small
erase counter from ubi->used and one PEB with a big erase counter from
wl_pool; the reused fastmap data PEB is in neither of these trees. UBI won't
schedule this PEB for wl even if it is in ubi->used, because the wl algorithm
expects a small erase counter for a used PEB.

Don't reserve a PEB for fastmap in may_reserve_for_fm() if fm_anchor
already exists. Otherwise, when UBI is running out of free PEBs,
the only free PEB (pnum < 64) will be skipped and fastmap data
will be written to the same old PEB.

Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 22:49:14 +02:00
Zhihao Cheng
415e4723c4 ubi: fastmap: Remove unneeded break condition while filling pools
Change the stop condition for pool filling. Commit d09e9a2bdd ("ubi:
fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool
not empty") reserves fastmap data PEBs after filling 1 PEB in
wl_pool. Now that wait_free_pebs_for_pool() ensures there are enough
free PEBs before the pools are filled, there will still be at least
1 PEB in pool and 1 PEB in wl_pool after ubi_refill_pools().

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 22:47:43 +02:00
Zhihao Cheng
a2ea69dac6 ubi: fastmap: Wait until there are enough free PEBs before filling pools
Wait until there are enough free PEBs before filling pool/wl_pool.
Sometimes erase_worker is not scheduled in time, which causes two
situations:
 A. Few PEBs are filled into the pool, so ubi_update_fastmap is called
    frequently and the first 64 PEBs are erased more times than other
    PEBs. Waiting for free PEBs before filling the pool reduces the
    fastmap update frequency and prolongs flash service life.
 B. When space is nearly exhausted, ubi_refill_pools() cannot make sure
    pool and wl_pool are filled with free PEBs, because of the delay of
    erase_worker. After this patch is applied, there must be free PEBs
    in pool after one call of ubi_update_fastmap.

Besides, this patch is a preparation for fixing the large erase counter in
the fastmap data block and fixing lapsed wear leveling for the first 64 PEBs.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 22:43:40 +02:00
Zhihao Cheng
8ff4e620ac ubi: fastmap: Use free pebs reserved for bad block handling
If new bad PEBs occur, UBI first consumes ubi->beb_rsvd_pebs, then
ubi->avail_pebs, and finally becomes read-only if both are 0, which
means that the number of PEBs for user volumes is not affected.
Besides, UBI reserves ubi->beb_rsvd_pebs free PEBs while filling wl pool
or getting free PEBs, but ubi->avail_pebs is not reserved.
So ubi->beb_rsvd_pebs and ubi->avail_pebs have nothing to do with the
usage of free PEBs; UBI can use all free PEBs.

Commit 78d6d497a6 ("UBI: Move fastmap specific functions out of wl.c")
removed the beb_rsvd_pebs check while filling pool. Now, don't reserve
ubi->beb_rsvd_pebs while filling wl_pool either. This fills more PEBs into
the pool and also reduces the fastmap update frequency.

Also remove the beb_rsvd_pebs check in ubi_wl_get_fm_peb.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-10-28 22:41:01 +02:00