mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2026-01-22 13:17:23 +00:00
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Intel Knights Landing support. (Harish Chegondi)
- Intel Broadwell-EP uncore PMU support. (Kan Liang)
- Core code improvements. (Peter Zijlstra)
- Event filter, LBR and PEBS fixes. (Stephane Eranian)
- Enable cycles:pp on Intel Atom. (Stephane Eranian)
- Add cycles:ppp support for Skylake. (Andi Kleen)
- Various x86 NMI overhead optimizations. (Andi Kleen)
- Intel PT enhancements. (Takao Indoh)
- AMD cache events fix. (Vince Weaver)
Tons of tooling changes:
- Show random perf tool tips in the 'perf report' bottom line
(Namhyung Kim)
- perf report now defaults to --group if the perf.data file has
grouped events. Try it with:
# perf record -e '{cycles,instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
# perf report
# Samples: 1K of event 'anon group { cycles, instructions }'
# Event count (approx.): 1955219195
#
# Overhead Command Shared Object Symbol
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1>
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
- Introduce the 'perf stat record/report' workflow:
Generate perf.data files from 'perf stat' to tap into perf's
existing scripting capabilities, instead of defining 'perf stat'
specific scripting support to calculate event ratios, etc.
Simple example:
$ perf stat record -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$ perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$
It generates PERF_RECORD_ userspace records to store the details:
$ perf report -D | grep PERF_RECORD
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
[acme@ssdandy linux]$
An effort was made so that perf.data files generated like this do
not produce cryptic messages when processed by older tools.
The 'perf script' bits need rebasing and will go up later.
- Make command line options always available, even when they depend
on some feature being enabled, warning the user about use of such
options (Wang Nan)
- Support hw breakpoint events (mem:0xAddress) in the default output
mode in 'perf script' (Wang Nan)
- Fixes and improvements for supporting annotating ARM binaries,
support ARM call and jump instructions, more work needed to have
arch specific stuff separated into tools/perf/arch/*/annotate/
(Russell King)
- Add initial 'perf config' command, for now just with a --list
command to list the contents of the configuration file in use and a
basic man page describing its format. Commands for doing edits and
detailed documentation are being reviewed and proof-read. (Taeung
Song)
- Allow BPF scriptlets to specify arguments to be fetched using DWARF
info, using a prologue generated at compile/build time (He Kuang,
Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang
Nan)
- BPF programs can now specify 'perf probe' tunables via their section
names, separating key=val values using semicolons (Wang Nan)
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter them in the
kernel, at an arbitrary place.
# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
return err == 0 && port == 443;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#
- Use the various 'perf probe' options to list functions and see what
variables can be collected at any given point. Experiment first by
collecting without a filter, then filter. Use it together with
'perf trace' and 'perf top', with or without callchains. If it
explodes, please tell us!
- Introduce a new callchain mode: "folded", that will list per line
representations of all callchains for a given histogram entry,
facilitating 'perf report' output processing by other tools, such
as Brendan Gregg's flamegraph tools (Namhyung Kim)
E.g:
# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#
The user can also select one of "count", "period" or "percent" as
the first column.
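As an illustration of why the one-line-per-callchain layout is easy for
other tools to consume, here is a minimal Python sketch that splits
"folded" lines into a percentage and a list of frames. It is not part of
perf: the helper name parse_folded and the regular expression are
assumptions made for this example, and it only handles output shaped like
the sample above (percent as the first column, symbol names without
whitespace).

```python
import re

# A folded callchain line looks like "12.07% cpu_startup_entry;start_secondary".
# Histogram entry lines carry extra columns (command, shared object, symbol),
# so they contain more whitespace-separated fields and do not match.
FOLDED_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s*$")


def parse_folded(lines):
    """Yield (percent, [frames]) for each folded callchain line.

    Hypothetical helper: header lines starting with '#', histogram entry
    lines, and chains whose symbols contain whitespace are skipped.
    """
    for line in lines:
        if line.startswith("#"):
            continue
        match = FOLDED_LINE.match(line)
        if match:
            yield float(match.group(1)), match.group(2).split(";")


# Sample lines copied from the 'perf report -g folded' output above.
sample = [
    "18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry",
    "12.07% cpu_startup_entry;start_secondary",
    "6.30% cpu_startup_entry;rest_init;start_kernel;"
    "x86_64_start_reservations;x86_64_start_kernel",
]

for percent, frames in parse_folded(sample):
    # Print the innermost-listed frame last, one chain per line.
    print(percent, " -> ".join(frames))
```

In practice the lines would come from the output of
'perf report -g folded' rather than a hard-coded list. A real consumer
would also need to handle symbols that contain spaces, such as C++
template names.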
... and lots of infrastructure enhancements, plus fixes and other
changes, features I failed to list - see the shortlog and the git log
for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits)
perf evlist: Add --trace-fields option to show trace fields
perf record: Store data mmaps for dwarf unwind
perf libdw: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Use find_map function in access_dso_mem
perf evlist: Remove perf_evlist__(enable|disable)_event functions
perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
perf report: Show random usage tip on the help line
perf hists: Export a couple of hist functions
perf diff: Use perf_hpp__register_sort_field interface
perf tools: Add overhead/overhead_children keys defaults via string
perf tools: Remove list entry from struct sort_entry
perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
perf script: Align event name properly
perf tools: Add missing headers in perf's MANIFEST
perf tools: Do not show trace command if it's not compiled in
perf report: Change default to use event group view
perf top: Decay periods in callchains
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
tools lib: Sync tools/lib/find_bit.c with the kernel
...
| Name | | |
|---|---|---|
| android | ||
| byteorder | ||
| caif | ||
| can | ||
| cifs | ||
| dvb | ||
| genwqe | ||
| hdlc | ||
| hsi | ||
| iio | ||
| isdn | ||
| mmc | ||
| netfilter | ||
| netfilter_arp | ||
| netfilter_bridge | ||
| netfilter_ipv4 | ||
| netfilter_ipv6 | ||
| nfsd | ||
| raid | ||
| spi | ||
| sunrpc | ||
| tc_act | ||
| tc_ematch | ||
| usb | ||
| wimax | ||
| a.out.h | ||
| acct.h | ||
| adb.h | ||
| adfs_fs.h | ||
| affs_hardblocks.h | ||
| agpgart.h | ||
| aio_abi.h | ||
| am437x-vpfe.h | ||
| apm_bios.h | ||
| arcfb.h | ||
| atalk.h | ||
| atm_eni.h | ||
| atm_he.h | ||
| atm_idt77105.h | ||
| atm_nicstar.h | ||
| atm_tcp.h | ||
| atm_zatm.h | ||
| atm.h | ||
| atmapi.h | ||
| atmarp.h | ||
| atmbr2684.h | ||
| atmclip.h | ||
| atmdev.h | ||
| atmioc.h | ||
| atmlec.h | ||
| atmmpc.h | ||
| atmppp.h | ||
| atmsap.h | ||
| atmsvc.h | ||
| audit.h | ||
| auto_fs4.h | ||
| auto_fs.h | ||
| auxvec.h | ||
| ax25.h | ||
| b1lli.h | ||
| baycom.h | ||
| bcache.h | ||
| bcm933xx_hcs.h | ||
| bfs_fs.h | ||
| binfmts.h | ||
| blkpg.h | ||
| blktrace_api.h | ||
| bpf_common.h | ||
| bpf.h | ||
| bpqether.h | ||
| bsg.h | ||
| btrfs.h | ||
| can.h | ||
| capability.h | ||
| capi.h | ||
| cciss_defs.h | ||
| cciss_ioctl.h | ||
| cdrom.h | ||
| cgroupstats.h | ||
| chio.h | ||
| cm4000_cs.h | ||
| cn_proc.h | ||
| coda_psdev.h | ||
| coda.h | ||
| coff.h | ||
| connector.h | ||
| const.h | ||
| cramfs_fs.h | ||
| cryptouser.h | ||
| cuda.h | ||
| cyclades.h | ||
| cycx_cfm.h | ||
| dcbnl.h | ||
| dccp.h | ||
| dlm_device.h | ||
| dlm_netlink.h | ||
| dlm_plock.h | ||
| dlm.h | ||
| dlmconstants.h | ||
| dm-ioctl.h | ||
| dm-log-userspace.h | ||
| dn.h | ||
| dqblk_xfs.h | ||
| edd.h | ||
| efs_fs_sb.h | ||
| elf-em.h | ||
| elf-fdpic.h | ||
| elf.h | ||
| elfcore.h | ||
| errno.h | ||
| errqueue.h | ||
| ethtool.h | ||
| eventpoll.h | ||
| fadvise.h | ||
| falloc.h | ||
| fanotify.h | ||
| fb.h | ||
| fcntl.h | ||
| fd.h | ||
| fdreg.h | ||
| fib_rules.h | ||
| fiemap.h | ||
| filter.h | ||
| firewire-cdev.h | ||
| firewire-constants.h | ||
| flat.h | ||
| fou.h | ||
| fs.h | ||
| fsl_hypervisor.h | ||
| fuse.h | ||
| futex.h | ||
| gameport.h | ||
| gen_stats.h | ||
| genetlink.h | ||
| gfs2_ondisk.h | ||
| gigaset_dev.h | ||
| gsmmux.h | ||
| hash_info.h | ||
| hdlc.h | ||
| hdlcdrv.h | ||
| hdreg.h | ||
| hid.h | ||
| hiddev.h | ||
| hidraw.h | ||
| hpet.h | ||
| hsr_netlink.h | ||
| hw_breakpoint.h | ||
| hyperv.h | ||
| hysdn_if.h | ||
| i2c-dev.h | ||
| i2c.h | ||
| i2o-dev.h | ||
| i8k.h | ||
| icmp.h | ||
| icmpv6.h | ||
| if_addr.h | ||
| if_addrlabel.h | ||
| if_alg.h | ||
| if_arcnet.h | ||
| if_arp.h | ||
| if_bonding.h | ||
| if_bridge.h | ||
| if_cablemodem.h | ||
| if_eql.h | ||
| if_ether.h | ||
| if_fc.h | ||
| if_fddi.h | ||
| if_frad.h | ||
| if_hippi.h | ||
| if_infiniband.h | ||
| if_link.h | ||
| if_ltalk.h | ||
| if_packet.h | ||
| if_phonet.h | ||
| if_plip.h | ||
| if_ppp.h | ||
| if_pppol2tp.h | ||
| if_pppox.h | ||
| if_slip.h | ||
| if_team.h | ||
| if_tun.h | ||
| if_tunnel.h | ||
| if_vlan.h | ||
| if_x25.h | ||
| if.h | ||
| igmp.h | ||
| ila.h | ||
| in6.h | ||
| in_route.h | ||
| in.h | ||
| inet_diag.h | ||
| inotify.h | ||
| input-event-codes.h | ||
| input.h | ||
| ioctl.h | ||
| ip6_tunnel.h | ||
| ip_vs.h | ||
| ip.h | ||
| ipc.h | ||
| ipmi_msgdefs.h | ||
| ipmi.h | ||
| ipsec.h | ||
| ipv6_route.h | ||
| ipv6.h | ||
| ipx.h | ||
| irda.h | ||
| irqnr.h | ||
| isdn_divertif.h | ||
| isdn_ppp.h | ||
| isdn.h | ||
| isdnif.h | ||
| iso_fs.h | ||
| ivtv.h | ||
| ivtvfb.h | ||
| ixjuser.h | ||
| jffs2.h | ||
| joystick.h | ||
| Kbuild | ||
| kcmp.h | ||
| kd.h | ||
| kdev_t.h | ||
| kernel-page-flags.h | ||
| kernel.h | ||
| kernelcapi.h | ||
| kexec.h | ||
| keyboard.h | ||
| keyctl.h | ||
| kfd_ioctl.h | ||
| kvm_para.h | ||
| kvm.h | ||
| l2tp.h | ||
| libc-compat.h | ||
| lightnvm.h | ||
| limits.h | ||
| llc.h | ||
| loop.h | ||
| lp.h | ||
| lwtunnel.h | ||
| magic.h | ||
| major.h | ||
| map_to_7segment.h | ||
| matroxfb.h | ||
| mdio.h | ||
| media-bus-format.h | ||
| media.h | ||
| mei.h | ||
| membarrier.h | ||
| memfd.h | ||
| mempolicy.h | ||
| meye.h | ||
| mic_common.h | ||
| mic_ioctl.h | ||
| mii.h | ||
| minix_fs.h | ||
| mman.h | ||
| mmtimer.h | ||
| module.h | ||
| mpls_iptunnel.h | ||
| mpls.h | ||
| mqueue.h | ||
| mroute6.h | ||
| mroute.h | ||
| msdos_fs.h | ||
| msg.h | ||
| mtio.h | ||
| n_r3964.h | ||
| nbd.h | ||
| ncp_fs.h | ||
| ncp_mount.h | ||
| ncp_no.h | ||
| ncp.h | ||
| ndctl.h | ||
| neighbour.h | ||
| net_dropmon.h | ||
| net_namespace.h | ||
| net_tstamp.h | ||
| net.h | ||
| netconf.h | ||
| netdevice.h | ||
| netfilter_arp.h | ||
| netfilter_bridge.h | ||
| netfilter_decnet.h | ||
| netfilter_ipv4.h | ||
| netfilter_ipv6.h | ||
| netfilter.h | ||
| netlink_diag.h | ||
| netlink.h | ||
| netrom.h | ||
| nfc.h | ||
| nfs2.h | ||
| nfs3.h | ||
| nfs4_mount.h | ||
| nfs4.h | ||
| nfs_fs.h | ||
| nfs_idmap.h | ||
| nfs_mount.h | ||
| nfs.h | ||
| nfsacl.h | ||
| nl80211.h | ||
| nubus.h | ||
| nvme_ioctl.h | ||
| nvram.h | ||
| omap3isp.h | ||
| omapfb.h | ||
| oom.h | ||
| openvswitch.h | ||
| packet_diag.h | ||
| param.h | ||
| parport.h | ||
| patchkey.h | ||
| pci_regs.h | ||
| pci.h | ||
| perf_event.h | ||
| personality.h | ||
| pfkeyv2.h | ||
| pg.h | ||
| phantom.h | ||
| phonet.h | ||
| pkt_cls.h | ||
| pkt_sched.h | ||
| pktcdvd.h | ||
| pmu.h | ||
| poll.h | ||
| posix_types.h | ||
| ppdev.h | ||
| ppp_defs.h | ||
| ppp-comp.h | ||
| ppp-ioctl.h | ||
| pps.h | ||
| pr.h | ||
| prctl.h | ||
| psci.h | ||
| ptp_clock.h | ||
| ptrace.h | ||
| qnx4_fs.h | ||
| qnxtypes.h | ||
| quota.h | ||
| radeonfb.h | ||
| random.h | ||
| raw.h | ||
| rds.h | ||
| reboot.h | ||
| reiserfs_fs.h | ||
| reiserfs_xattr.h | ||
| resource.h | ||
| rfkill.h | ||
| romfs_fs.h | ||
| rose.h | ||
| route.h | ||
| rtc.h | ||
| rtnetlink.h | ||
| scc.h | ||
| sched.h | ||
| scif_ioctl.h | ||
| screen_info.h | ||
| sctp.h | ||
| sdla.h | ||
| seccomp.h | ||
| securebits.h | ||
| selinux_netlink.h | ||
| sem.h | ||
| serial_core.h | ||
| serial_reg.h | ||
| serial.h | ||
| serio.h | ||
| shm.h | ||
| signal.h | ||
| signalfd.h | ||
| smiapp.h | ||
| snmp.h | ||
| sock_diag.h | ||
| socket.h | ||
| sockios.h | ||
| sonet.h | ||
| sonypi.h | ||
| sound.h | ||
| soundcard.h | ||
| stat.h | ||
| stddef.h | ||
| stm.h | ||
| string.h | ||
| suspend_ioctls.h | ||
| swab.h | ||
| synclink.h | ||
| sysctl.h | ||
| sysinfo.h | ||
| target_core_user.h | ||
| taskstats.h | ||
| tcp_metrics.h | ||
| tcp.h | ||
| telephony.h | ||
| termios.h | ||
| thermal.h | ||
| time.h | ||
| times.h | ||
| timex.h | ||
| tiocl.h | ||
| tipc_config.h | ||
| tipc_netlink.h | ||
| tipc.h | ||
| toshiba.h | ||
| tty_flags.h | ||
| tty.h | ||
| types.h | ||
| udf_fs_i.h | ||
| udp.h | ||
| uhid.h | ||
| uinput.h | ||
| uio.h | ||
| ultrasound.h | ||
| un.h | ||
| unistd.h | ||
| unix_diag.h | ||
| usbdevice_fs.h | ||
| usbip.h | ||
| userfaultfd.h | ||
| userio.h | ||
| utime.h | ||
| utsname.h | ||
| uuid.h | ||
| uvcvideo.h | ||
| v4l2-common.h | ||
| v4l2-controls.h | ||
| v4l2-dv-timings.h | ||
| v4l2-mediabus.h | ||
| v4l2-subdev.h | ||
| veth.h | ||
| vfio.h | ||
| vhost.h | ||
| videodev2.h | ||
| virtio_9p.h | ||
| virtio_balloon.h | ||
| virtio_blk.h | ||
| virtio_config.h | ||
| virtio_console.h | ||
| virtio_gpu.h | ||
| virtio_ids.h | ||
| virtio_input.h | ||
| virtio_net.h | ||
| virtio_pci.h | ||
| virtio_ring.h | ||
| virtio_rng.h | ||
| virtio_scsi.h | ||
| virtio_types.h | ||
| vm_sockets.h | ||
| vsp1.h | ||
| vt.h | ||
| wait.h | ||
| wanrouter.h | ||
| watchdog.h | ||
| wil6210_uapi.h | ||
| wimax.h | ||
| wireless.h | ||
| x25.h | ||
| xattr.h | ||
| xfrm.h | ||
| xilinx-v4l2-controls.h | ||
| zorro_ids.h | ||
| zorro.h | ||