Commit Graph

3 Commits

Author SHA1 Message Date
fan.yu9@zte.com.cn
d92dccd05a delaytop: enhance error logging and add PSI feature description
This patch improves error diagnostics and documentation for delaytop:

1) Enhanced error logging:
   - Added explicit error messages in critical failure paths
   - Implemented BOOL_FPRINT macro for robust output handling

2) PSI feature documentation:
   - Updated header comment to reflect PSI monitoring capability
   - Improved output formatting for PSI information

System Pressure Information: (avg10/avg60/avg300/total)
CPU some:       0.0%/   0.0%/   0.0%/     345(ms)
CPU full:       0.0%/   0.0%/   0.0%/       0(ms)
Memory full:    0.0%/   0.0%/   0.0%/       0(ms)
Memory some:    0.0%/   0.0%/   0.0%/       0(ms)
IO full:        0.0%/   0.0%/   0.0%/      65(ms)
IO some:        0.0%/   0.0%/   0.0%/      79(ms)
IRQ full:       0.0%/   0.0%/   0.0%/       0(ms)

Link: https://lkml.kernel.org/r/202507281628341752gMXCMN7S-Vz_LHYHum9r@zte.com.cn
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Acked-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: Fan Yu <fan.yu9@zte.com.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-02 12:01:41 -07:00
Wang Yaxin
6b47c9f8ee delaytop: add psi info to show system delay
Support showing whole delay of system by reading PSI, just like the first
few lines of information output by the top command.  the output of
delaytop includes both system-wide delay and delay of individual tasks,
providing a more comprehensive reflection of system latency status.

Use case
========
bash# ./delaytop
System Pressure Information: (avg10/avg60/avg300/total)
CPU:    full:    0.0%/   0.0%/   0.0%/0           some:    0.1%/   0.0%/   0.0%/14216596
Memory: full:    0.0%/   0.0%/   0.0%/34010659    some:    0.0%/   0.0%/   0.0%/35406492
IO:     full:    0.1%/   0.0%/   0.0%/51029453    some:    0.1%/   0.0%/   0.0%/55330465
IRQ:    full:    0.0%/   0.0%/   0.0%/0

Top 20 processes (sorted by CPU delay):

  PID   TGID  COMMAND            CPU(ms)  IO(ms)        SWAP(ms) RCL(ms) THR(ms)  CMP(ms)  WP(ms)  IRQ(ms)
---------------------------------------------------------------------------------------------
   32     32  kworker/2:0H-sy   23.65     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  497    497  kworker/R-scsi_    1.20     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  495    495  kworker/R-scsi_    1.13     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  494    494  scsi_eh_0          1.12     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  485    485  kworker/R-ata_s    0.90     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  574    574  kworker/R-kdmfl    0.36     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   34     34  idle_inject/3      0.33     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1123   1123  nde-netfilter      0.28     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   60     60  ksoftirqd/7        0.25     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  114    114  kworker/0:2-cgr    0.25     0.00     0.00     0.00    0.00     0.00     0.00     0.00
  496    496  scsi_eh_1          0.24     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   51     51  cpuhp/6            0.24     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1667   1667  atd                0.24     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   45     45  cpuhp/5            0.23     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1102   1102  nde-backupservi    0.22     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1098   1098  systemsettings     0.21     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1100   1100  audit-monitor      0.20     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   53     53  migration/6        0.20     0.00     0.00     0.00    0.00     0.00     0.00     0.00
 1482   1482  sshd               0.19     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   39     39  cpuhp/4            0.19     0.00     0.00     0.00    0.00     0.00     0.00     0.00

Link: https://lkml.kernel.org/r/20250710135451340_5pOgpIFi0M5AE7H44W1D@zte.com.cn
Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peilin He <he.peilin@zte.com.cn>
Cc: Qiang Tu <tu.qiang35@zte.com.cn>
Cc: wangyong <wang.yong12@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:29 -07:00
Yaxin Wang
01bda05819 tools/accounting/delaytop: add delaytop to record top-n task delay
Problem
=======

The "getdelays" can only display the latency of a single task
by specifying a PID, but it has the following limitations:

1. single-task perspective: only supports querying the latency (CPU, I/O,
memory, etc.) of an individual task via PID and cannot provide a global
analysis of high-latency processes across the system.

2. lack of High-Latency process awareness: when the overall system
   latency is high (e.g., a spike in CPU latency), there is no way to
   quickly identify the top N processes contributing to the highest
   latency.

3. poor interactivity: It lacks dynamic sorting and refresh
   capabilities (similar to top), making it difficult to monitor latency
   changes in real time.

Solution
========

To address these limitations, we introduce the "delaytop" with the
following capabilities:

1. system view: monitors latency metrics (CPU, I/O, memory, IRQ, etc.)
   for all system processes

2. supports field-based sorting (e.g., default sort by CPU latency in
   descending order)

3. dynamic interactive interface: focus on specific processes with
   --pid; limit displayed entries with --processes 20; control monitoring
   duration with --iterations;

Use case
========
bash# ./delaytop
Top 20 processes (sorted by CPU delay):

  PID   TGID  COMMAND            CPU(ms)  IO(ms)   SWAP(ms) RCL(ms) THR(ms)  CMP(ms)  WP(ms)  IRQ(ms)
---------------------------------------------------------------------------------------------
   26     26  kworker/1:0H       5.55     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   32     32  kworker/2:0H-kb    2.93     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   38     38  kworker/3:0H-ev    2.88     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   84     84  kworker/R-vfio-    1.62     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   24     24  ksoftirqd/1        1.43     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   19     19  idle_inject/0      0.99     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   16     16  rcu_exp_par_gp_    0.87     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   11     11  kworker/0:1        0.87     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   22     22  idle_inject/1      0.80     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    3      3  pool_workqueue_    0.74     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   81     81  scsi_eh_1          0.59     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   30     30  ksoftirqd/2        0.42     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   36     36  ksoftirqd/3        0.37     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    9      9  kworker/0:0-eve    0.36     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    8      8  kworker/R-netns    0.34     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   76     76  kworker/1:1-pm     0.32     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   21     21  cpuhp/1            0.30     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    4      4  kworker/R-rcu_g    0.21     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   12     12  kworker/u16:0-i    0.20     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    1      1  init               0.18     0.00     0.00     0.00    0.00     0.00     0.08     0.00

Link: https://lkml.kernel.org/r/20250619211843633h05gWrBDMFkEH6xAVm_5y@zte.com.cn
Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Yaxin Wang <wang.yaxin@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peilin He <he.peilin@zte.com.cn>
Cc: Qiang Tu <tu.qiang35@zte.com.cn>
Cc: wangyong <wang.yong12@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: ye xingchen <ye.xingchen@zte.com.cn>
Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-09 22:57:56 -07:00