Activated has the specific meaning of a sink that's selected for use by
the user via sysfs. But comments in some code that's shared by Perf use
the same word, so in those cases change them to just say "selected"
instead. With selected implying either via Perf or "activated" via
sysfs.
coresight_get_enabled_sink() doesn't actually get an enabled sink, it
only gets an activated one, so change that too.
And change the activated variable name to include "sysfs" so it can't
be confused as a general status.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20240129154050.569566-3-james.clark@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
The linked commit reverts the change that accidentally used some sysfs
enable/disable functions from Perf which broke the refcounting, but it
also removes the fact that the sysfs disable function disabled the
helpers.
Add a new wrapper function that does both which is used by both Perf and
sysfs, and label the sysfs disable function appropriately. The naming of
all of the functions will be tidied up later to avoid this happening
again.
Fixes: 287e82cf69 ("coresight: Fix crash when Perf and sysfs modes are used concurrently")
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20240129154050.569566-2-james.clark@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Now that the driver core can properly handle constant struct bus_type,
move the coresight_bustype variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/2024010531-tinfoil-avert-4a57@gregkh
Similarly to drivers/gpu/drm/amd/amdgpu/Makefile and
fs/btrfs/Makefile, copy the current set of W=1 warnings from
Makefile.extrawarn to the coresight makefile to make them default.
Unfortunately there is no easy way to do this without copying.
In addition to the default set of warnings, add -Wno-sign-compare to
disable that warning. That's because Makefile.extrawarn does some extra
steps to disable some -Wextra warnings unless W=2 or W=3 are used.
That's the only one that's needed for Coresight, so disable it.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231123120459.287578-5-james.clark@arm.com
Including the header with the declarations fixes the following warning
with a C=1 build:
coresight-cfg-afdo.c:102:27: warning: symbol 'strobe_etm4x' was not declared. Should it be static?
coresight-cfg-afdo.c:141:26: warning: symbol 'afdo_etm4x' was not declared. Should it be static?
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231123120459.287578-4-james.clark@arm.com
The missing * in the comment block causes the following warning, so fix
it:
hwtracing/coresight/coresight-etm3x-core.c:118: warning: bad line:
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231123120459.287578-3-james.clark@arm.com
These warnings would be hit with the following W=1 build change so
initialize all structs properly.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231123120459.287578-2-james.clark@arm.com
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231114133346.30489-5-hejunhao3@huawei.com
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-14-u.kleine-koenig@pengutronix.de
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-13-u.kleine-koenig@pengutronix.de
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-12-u.kleine-koenig@pengutronix.de
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-11-u.kleine-koenig@pengutronix.de
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-10-u.kleine-koenig@pengutronix.de
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231116173301.708873-9-u.kleine-koenig@pengutronix.de
CCITMIN is a 12 bit field and doesn't fit in a u8, so extend it to u16.
This probably wasn't an issue previously because values higher than 255
never occurred.
But since commit 4aff040bcc ("coresight: etm: Override TRCIDR3.CCITMIN
on errata affected cpus"), a comparison with 256 was done to enable the
errata, generating the following W=1 build error:
coresight-etm4x-core.c:1188:24: error: result of comparison of
constant 256 with expression of type 'u8' (aka 'unsigned char') is
always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (drvdata->ccitmin == 256)
Cc: stable@vger.kernel.org
Fixes: 2e1cdfe184 ("coresight-etm4x: Adding CoreSight ETM4x driver")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310302043.as36UFED-lkp@intel.com/
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231101115206.70810-1-james.clark@arm.com
Correct the property name of the DSB MSR number that needs to be
read in TPDM driver. The right property name is
"qcom,dsb-msrs-num".
Fixes: 350ba15ae1 ("coresight-tpdm: Add nodes for dsb msr support")
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
[ Fix checkpatch failure in the commit description ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1698128353-31157-1-git-send-email-quic_taozha@quicinc.com
Add the nodes for DSB subunit MSR(mux select register) support.
The TPDM MSR (mux select register) interface is an optional
interface and associated bank of registers per TPDM subunit.
The intent of mux select registers is to control muxing structures
driving the TPDM’s’ various subunit interfaces.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-14-git-send-email-quic_taozha@quicinc.com
Add nodes to configure the timestamp request based on input
pattern match. Each TPDM that support DSB subunit has maximum of
n(n<7) TPR registers to configure value for timestamp request
based on input pattern match. Eight 32 bit registers providing
DSB interface timestamp request pattern match comparison. And
each TPDM that support DSB subunit has maximum of m(m<7) TPMR
registers to configure pattern mask for timestamp request. Eight
32 bit registers providing DSB interface timestamp request
pattern match mask generation. Add nodes to enable/disable
pattern timestamp and set pattern timestamp type.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-12-git-send-email-quic_taozha@quicinc.com
Add nodes to configure trigger pattern and trigger pattern mask.
Each DSB subunit TPDM has maximum of n(n<7) XPR registers to
configure trigger pattern match output. Eight 32 bit registers
providing DSB interface trigger output pattern match comparison.
And each DSB subunit TPDM has maximum of m(m<7) XPMR registers to
configure trigger pattern mask match output. Eight 32 bit
registers providing DSB interface trigger output pattern match
mask.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-11-git-send-email-quic_taozha@quicinc.com
Add the nodes to set value for DSB edge control and DSB edge
control mask. Each DSB subunit TPDM has maximum of n(n<16) EDCR
resgisters to configure edge control. DSB edge detection control
00: Rising edge detection
01: Falling edge detection
10: Rising and falling edge detection (toggle detection)
And each DSB subunit TPDM has maximum of m(m<8) ECDMR registers to
configure mask. Eight 32 bit registers providing DSB interface
edge detection mask control.
Add the nodes to configure DSB edge control and DSB edge control
mask. Each DSB subunit TPDM maximum of 256 edge detections can be
configured. The index and value sysfs files need to be paired and
written to order. The index sysfs file is to set the index number
of the edge detection which needs to be configured. And the value
sysfs file is to set the control or mask for the edge detection.
DSB edge detection control should be set as the following values.
00: Rising edge detection
01: Falling edge detection
10: Rising and falling edge detection (toggle detection)
And DSB edge mask should be set as 0 or 1.
Each DSB subunit TPDM has maximum of n(n<16) EDCR resgisters to
configure edge control. And each DSB subunit TPDM has maximum of
m(m<8) ECDMR registers to configure mask.
Add the nodes to read a set of the edge control value and mask
of the DSB in TPDM.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-10-git-send-email-quic_taozha@quicinc.com
Add node to set and show programming mode for TPDM DSB subunit.
Once the DSB programming mode is set, it will be written to the
register DSB_CR.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-9-git-send-email-quic_taozha@quicinc.com
The nodes are needed to set or show the trigger timestamp and
trigger type. This change is to add these nodes to achieve these
function.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-8-git-send-email-quic_taozha@quicinc.com
TPDM device need a node to reset the configurations and status of
it. This change provides a node to reset the configurations and
disable the TPDM if it has been enabled.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-7-git-send-email-quic_taozha@quicinc.com
DSB is used for monitoring “events”. Events are something that
occurs at some point in time. It could be a state decode, the
act of writing/reading a particular address, a FIFO being empty,
etc. This decoding of the event desired is done outside TPDM.
DSB subunit need to be configured in enablement and disablement.
A struct that specifics associated to dsb dataset is needed. It
saves the configuration and parameters of the dsb datasets. This
change is to add this struct and initialize the configuration of
DSB subunit.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1695882586-10306-6-git-send-email-quic_taozha@quicinc.com
Currently TMC-ETR automatically selects the buffer mode from all available
methods in the following sequentially fallback manner - also in that order.
1. FLAT mode with or without IOMMU
2. TMC-ETR-SG (scatter gather) mode when available
3. CATU mode when available
But this order might not be ideal for all situations. For example if there
is a CATU connected to ETR, it may be better to use TMC-ETR scatter gather
method, rather than CATU. But hard coding such order changes will prevent
us from testing or using a particular mode. This change provides following
new sysfs tunables for the user to control TMC-ETR buffer mode explicitly,
if required. This adds following new sysfs files for buffer mode selection
purpose explicitly in the user space.
/sys/bus/coresight/devices/tmc_etr<N>/buf_modes_available
/sys/bus/coresight/devices/tmc_etr<N>/buf_mode_preferred
$ cat buf_modes_available
auto flat tmc-sg catu ------------------> Supported TMC-ETR buffer modes
$ echo catu > buf_mode_preferred -------> Explicit buffer mode request
But explicit user request has to be within supported ETR buffer modes only.
These sysfs interface files are exclussive to ETR, and hence these are not
available for other TMC devices such as ETB or ETF etc.
A new auto' mode (i.e ETR_MODE_AUTO) has been added to help fallback to the
existing default behaviour, when user provided preferred buffer mode fails.
ETR_MODE_FLAT and ETR_MODE_AUTO are always available as preferred modes.
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[Fixup year in sysfs ABI documentation]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230818082112.554638-1-anshuman.khandual@arm.com
When cycle counting is enabled, we use a default threshold value i.e 0x100
for the instruction trace cycle counting.
This patch makes the cycle threshold user configurable via perf event
attributes( 'cc_threshold' => event->attr.config3[11:0] ), falling back
to the current default if unspecified.
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230921033631.1298723-3-anshuman.khandual@arm.com
This work arounds errata 1490853 on Cortex-A76, and Neoverse-N1, errata
1491015 on Cortex-A77, errata 1502854 on Cortex-X1, and errata 1619801 on
Neoverse-V1, based affected cpus, where software read for TRCIDR3.CCITMIN
field in ETM gets an wrong value.
If software uses the value returned by the TRCIDR3.CCITMIN register field,
then it will limit the range which could be used for programming the ETM.
In reality, the ETM could be programmed with a much smaller value than what
is indicated by the TRCIDR3.CCITMIN field and still function correctly.
If software reads the TRCIDR3.CCITMIN register field, corresponding to the
instruction trace counting minimum threshold, observe the value 0x100 or a
minimum cycle count threshold of 256. The correct value should be 0x4 or a
minimum cycle count threshold of 4.
This work arounds the problem via storing 4 in drvdata->ccitmin on affected
systems where the TRCIDR3.CCITMIN has been 256, thus preserving cycle count
threshold granularity.
These errata information has been updated in Documentation/arch/arm64/silicon-errata.rst,
but without their corresponding configs because these have been implemented
directly in the driver.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[ Fixed location of silicon-errata.rst in commit description ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230921033631.1298723-2-anshuman.khandual@arm.com
TRBE coresight devices do not need regular connections information, as the
paths get built between all percpu source and their respective percpu sink
devices. Please refer 'commit 2cd87a7b29 ("coresight: core: Add support
for dedicated percpu sinks")' which added support for percpu sink devices.
coresight_register() expect device connections via the platform_data. TRBE
devices do not have any graph connections and thus is empty. With upcoming
ACPI support for TRBE, we do not get a real acpi_device and thus
coresight_get_platform_dat() will end up in failures. Hence this allocates
a zeroed coresight_platform_data structure and assigns that back into the
device.
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230829135405.1159449-2-anshuman.khandual@arm.com
In smb_reset_buffer, the sdb->buf_hw_base variable is uninitialized
before use, which initializes it in smb_init_data_buffer. And the SMB
regiester are set in smb_config_inport.
So move the call after smb_config_inport.
Fixes: 06f5c2926a ("drivers/coresight: Add UltraSoc System Memory Buffer driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231114133346.30489-4-hejunhao3@huawei.com
The SMB dirver register the enable/disable sysfs interface in function
smb_register_sink(), however the buffer depends on the following
configuration to work well. So it'll be possible for user to access an
unreset one.
Move the config buffer operation to before register_sink().
Ignore the return value, if smb_config_inport() fails. That will
cause the hardwares disable trace path to fail, should not affect
SMB driver remove. So we make smb_remove() return success,
Fixes: 06f5c2926a ("drivers/coresight: Add UltraSoc System Memory Buffer driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231114133346.30489-3-hejunhao3@huawei.com
When we to enable the SMB by perf, the perf sched will call perf_ctx_lock()
to close system preempt in event_function_call(). But SMB::enable_smb() use
mutex to lock the critical section, which may sleep.
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 153023, name: perf
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48
softirqs last enabled at (0): [<ffffa2983f5c5f40>] copy_process+0xae8/0x2b48
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 2 PID: 153023 Comm: perf Kdump: loaded Tainted: G W O 6.5.0-rc4+ #1
Call trace:
...
__mutex_lock+0xbc/0xa70
mutex_lock_nested+0x34/0x48
smb_update_buffer+0x58/0x360 [ultrasoc_smb]
etm_event_stop+0x204/0x2d8 [coresight]
etm_event_del+0x1c/0x30 [coresight]
event_sched_out+0x17c/0x3b8
group_sched_out.part.0+0x5c/0x208
__perf_event_disable+0x15c/0x210
event_function+0xe0/0x230
remote_function+0xb4/0xe8
generic_exec_single+0x160/0x268
smp_call_function_single+0x20c/0x2a0
event_function_call+0x20c/0x220
_perf_event_disable+0x5c/0x90
perf_event_for_each_child+0x58/0xc0
_perf_ioctl+0x34c/0x1250
perf_ioctl+0x64/0x98
...
Use spinlock to replace mutex to control driver data access to one at a
time. The function copy_to_user() may sleep, it cannot be in a spinlock
context, so we can't simply replace it in smb_read(). But we can ensure
that only one user gets the SMB device fd by smb_open(), so remove the
locks from smb_read() and buffer synchronization is guaranteed by the user.
Fixes: 06f5c2926a ("drivers/coresight: Add UltraSoc System Memory Buffer driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231114133346.30489-2-hejunhao3@huawei.com
Partially revert the change in commit 6148652807 ("coresight: Enable
and disable helper devices adjacent to the path") which changed the bare
call from source_ops(csdev)->enable() to coresight_enable_source() for
Perf sessions. It was missed that coresight_enable_source() is
specifically for the sysfs interface, rather than being a generic call.
This interferes with the sysfs reference counting to cause the following
crash:
$ perf record -e cs_etm/@tmc_etr0/ -C 0 &
$ echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
$ echo 1 > /sys/bus/coresight/devices/etm0/enable_source
$ echo 0 > /sys/bus/coresight/devices/etm0/enable_source
Unable to handle kernel NULL pointer dereference at virtual
address 00000000000001d0
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
...
Call trace:
etm4_disable+0x54/0x150 [coresight_etm4x]
coresight_disable_source+0x6c/0x98 [coresight]
coresight_disable+0x74/0x1c0 [coresight]
enable_source_store+0x88/0xa0 [coresight]
dev_attr_store+0x20/0x40
sysfs_kf_write+0x4c/0x68
kernfs_fop_write_iter+0x120/0x1b8
vfs_write+0x2dc/0x3b0
ksys_write+0x70/0x108
__arm64_sys_write+0x24/0x38
invoke_syscall+0x50/0x128
el0_svc_common.constprop.0+0x104/0x130
do_el0_svc+0x40/0xb8
el0_svc+0x2c/0xb8
el0t_64_sync_handler+0xc0/0xc8
el0t_64_sync+0x1a4/0x1a8
Code: d53cd042 91002000 b9402a81 b8626800 (f940ead5)
---[ end trace 0000000000000000 ]---
This commit linked below also fixes the issue, but has unlocked updates
to the mode which could potentially race. So until we come up with a
more complete solution that takes all locking and interaction between
both modes into account, just revert back to the old behavior for Perf.
Reported-by: Junhao He <hejunhao3@huawei.com>
Closes: https://lore.kernel.org/linux-arm-kernel/20230921132904.60996-1-hejunhao3@huawei.com/
Fixes: 6148652807 ("coresight: Enable and disable helper devices adjacent to the path")
Tested-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231006131452.646721-1-james.clark@arm.com
etm4_platform_driver (which lives in ".data" contains a reference to
etm4_remove_platform_dev(). So the latter must not be marked with __exit
which results in the function being discarded for a build with
CONFIG_CORESIGHT_SOURCE_ETM4X=y which in turn makes the remove pointer
contain invalid data.
etm4x_amba_driver referencing etm4_remove_amba() has the same issue.
Drop the __exit annotations for the two affected functions and a third
one that is called by the other two.
For reasons I don't understand this isn't catched by building with
CONFIG_DEBUG_SECTION_MISMATCH=y.
Fixes: c23bc382ef ("coresight: etm4x: Refactor probing routine")
Fixes: 5214b56358 ("coresight: etm4x: Add support for sysreg only devices")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/all/20230929081540.yija47lsj35xtj4v@pengutronix.de/
Link: https://lore.kernel.org/r/20230929081637.2377335-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
smp_call_function_single() will allocate an IPI interrupt vector to
the target processor and send a function call request to the interrupt
vector. After the target processor receives the IPI interrupt, it will
execute arm_trbe_remove_coresight_cpu() call request in the interrupt
handler.
According to the device_unregister() stack information, if other process
is useing the device, the down_write() may sleep, and trigger deadlocks
or unexpected errors.
arm_trbe_remove_coresight_cpu
coresight_unregister
device_unregister
device_del
kobject_del
__kobject_del
sysfs_remove_dir
kernfs_remove
down_write ---------> it may sleep
Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
per TRBE.
Simply call arm_trbe_remove_coresight_cpu() directly without useing the
smp_call_function_single(), which is the same as registering the TRBE
coresight device.
Fixes: 3fbf7f011f ("coresight: sink: Add TRBE driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Link: https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
[ Remove duplicate cpumask checks during removal ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ v3 - Remove the operation of assigning NULL to cpudata->drvdata ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230818084052.10116-1-hejunhao3@huawei.com
There are memory leaks reported by kmemleak:
...
unreferenced object 0xffff00213c141000 (size 1024):
comm "systemd-udevd", pid 2123, jiffies 4294909467 (age 6062.160s)
hex dump (first 32 bytes):
04 00 00 00 02 00 00 00 18 10 14 3c 21 00 ff ff ...........<!...
00 00 00 00 00 00 00 00 03 00 00 00 10 00 00 00 ................
backtrace:
[<000000004b7c9001>] __kmem_cache_alloc_node+0x2f8/0x348
[<00000000b0fc7ceb>] __kmalloc+0x58/0x108
[<0000000064ff4695>] acpi_os_allocate+0x2c/0x68
[<000000007d57d116>] acpi_ut_initialize_buffer+0x54/0xe0
[<0000000024583908>] acpi_evaluate_object+0x388/0x438
[<0000000017b2e72b>] acpi_evaluate_object_typed+0xe8/0x240
[<000000005df0eac2>] coresight_get_platform_data+0x1b4/0x988 [coresight]
...
The ACPI buffer memory (buf.pointer) should be freed. But the buffer
is also used after returning from acpi_get_dsd_graph().
Move the temporary variables buf to acpi_coresight_parse_graph(),
and free it before the function return to prevent memory leak.
Fixes: 76ffa5ab5b ("coresight: Support for ACPI bindings")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230817085937.55590-2-hejunhao3@huawei.com
Coresight TRBE driver shares a single platform data (which is empty btw).
However, with the commit 4e8fe7e5c3
("coresight: Store pointers to connections rather than an array of them")
the coresight core would free up the pdata, resulting in multiple attempts
to free the same pdata for TRBE instances. Fix this by allocating a pdata per
coresight_device.
Fixes: 4e8fe7e5c3 ("coresight: Store pointers to connections rather than an array of them")
Link: https://lore.kernel.org/r/20230814093813.19152-3-hejunhao3@huawei.com
Reported-by: Junhao He <hejunhao3@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Clark <james.clark@arm.com>
Tested-by: Junhao He <hejunhao3@huawei.com>
Link: https://lore.kernel.org/r/20230816141008.535450-2-suzuki.poulose@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
The init/exit() of driver only calls platform_driver_register/unregister,
it can be simpilfied with module_platform_driver.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230804092709.1359264-1-yangyingliang@huawei.com
Perf cs_etm session executed unexpectedly when AUX buffer > 1G.
perf record -C 0 -m ,2G -e cs_etm// -- <workload>
[ perf record: Captured and wrote 2.615 MB perf.data ]
Perf only collect about 2M perf data rather than 2G. This is becasuse
the operation, "nr_pages << PAGE_SHIFT", in coresight tmc driver, will
overflow when nr_pages >= 0x80000(correspond to 1G AUX buffer). The
overflow cause buffer allocation to fail, and TMC driver will alloc
minimal buffer size(1M). You can just get about 2M perf data(1M AUX
buffer + perf data header) at least.
Explicit convert nr_pages to 64 bit to avoid overflow.
Fixes: 22f429f19c ("coresight: etm-perf: Add support for ETR backend")
Fixes: 99443ea19e ("coresight: Add generic TMC sg table framework")
Fixes: 2e499bbc1a ("coresight: tmc: implementing TMC-ETF AUX space API")
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230804081514.120171-2-tianruidong@linux.alibaba.com
is_trbe_available() checks for the TRBE support via extracting TraceBuffer
field value from ID_AA64DFR0_EL1, and ensures that it is implemented. This
replaces the open encoding '0b0001' with 'ID_AA64DFR0_EL1_TraceBuffer_IMP'
which is now available via sysreg tools. Functional change is not intended.
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230802063658.1069813-1-anshuman.khandual@arm.com
The kernel test robot looks for new warnings in a W=1 build, so fix all
the existing warnings to make it easier to spot new ones when building
locally.
The fixes are for undocumented function arguments and an incorrect doc
style.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230725140604.1350406-1-james.clark@arm.com
Drop ETM4X ACPI ID from the AMBA ACPI device list, and instead just move it
inside the new ACPI devices list detected and used via platform driver.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: linux-acpi@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for ACPI specific changes)
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20230710062500.45147-7-anshuman.khandual@arm.com
Some components may not have graph connections for describing
the trace path. e.g., ETE, where it could directly use the per
CPU TRBE. Ignore the absence of graph connections
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-6-anshuman.khandual@arm.com
Add support for handling MMIO based devices via platform driver. We need to
make sure that :
1) The APB clock, if present is enabled at probe and via runtime_pm ops
2) Use the ETM4x architecture or CoreSight architecture registers to
identify a device as CoreSight ETM4x, instead of relying a white list of
"Peripheral IDs"
The driver doesn't get to handle the devices yet, until we wire the ACPI
changes to move the devices to be handled via platform driver than the
etm4_amba driver.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-5-anshuman.khandual@arm.com
Coresight device pid can be retrieved from its iomem base address, which is
stored in 'struct etm4x_drvdata'. This drops pid argument from etm4_probe()
and 'struct etm4_init_arg'. Instead etm4_check_arch_features() derives the
coresight device pid with a new helper coresight_get_pid(), right before it
is consumed in etm4_hisi_match_pid().
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-4-anshuman.khandual@arm.com
'struct etm4_drvdata' itself can carry the base address before etm4_probe()
gets called. Just drop that redundant argument from etm4_probe().
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-3-anshuman.khandual@arm.com
Allocate and device assign 'struct etmv4_drvdata' earlier during the driver
probe, ensuring that it can be retrieved in power management based runtime
callbacks if required. This will also help in dropping iomem base address
argument from the function etm4_probe() later.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230710062500.45147-2-anshuman.khandual@arm.com
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230718143124.1065949-1-robh@kernel.org
Here is the big set of char/misc and other driver subsystem updates for
6.5-rc1.
Lots of different, tiny, stuff in here, from a range of smaller driver
subsystems, including pulls from some substems directly:
- IIO driver updates and additions
- W1 driver updates and fixes (and a new maintainer!)
- FPGA driver updates and fixes
- Counter driver updates
- Extcon driver updates
- Interconnect driver updates
- Coresight driver updates
- mfd tree tag merge needed for other updates
on top of that, lots of small driver updates as patches, including:
- static const updates for class structures
- nvmem driver updates
- pcmcia driver fix
- lots of other small driver updates and fixes
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZKKNMw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylhlQCfZrtz8RIbau8zbzh/CKpKBOmvHp4An3V64hbz
recBPLH0ZACKl0wPl4iZ
=A83A
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull Char/Misc updates from Greg KH:
"Here is the big set of char/misc and other driver subsystem updates
for 6.5-rc1.
Lots of different, tiny, stuff in here, from a range of smaller driver
subsystems, including pulls from some substems directly:
- IIO driver updates and additions
- W1 driver updates and fixes (and a new maintainer!)
- FPGA driver updates and fixes
- Counter driver updates
- Extcon driver updates
- Interconnect driver updates
- Coresight driver updates
- mfd tree tag merge needed for other updates on top of that, lots of
small driver updates as patches, including:
- static const updates for class structures
- nvmem driver updates
- pcmcia driver fix
- lots of other small driver updates and fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (243 commits)
bsr: fix build problem with bsr_class static cleanup
comedi: make all 'class' structures const
char: xillybus: make xillybus_class a static const structure
xilinx_hwicap: make icap_class a static const structure
virtio_console: make port class a static const structure
ppdev: make ppdev_class a static const structure
char: misc: make misc_class a static const structure
/dev/mem: make mem_class a static const structure
char: lp: make lp_class a static const structure
dsp56k: make dsp56k_class a static const structure
bsr: make bsr_class a static const structure
oradax: make 'cl' a static const structure
hwtracing: hisi_ptt: Fix potential sleep in atomic context
hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
hwtracing: hisi_ptt: Export available filters through sysfs
hwtracing: hisi_ptt: Add support for dynamically updating the filter list
hwtracing: hisi_ptt: Factor out filter allocation and release operation
samples: pfsm: add CC_CAN_LINK dependency
misc: fastrpc: check return value of devm_kasprintf()
coresight: dummy: Update type of mode parameter in dummy_{sink,source}_enable()
...
- Support for the Armv8.9 Permission Indirection Extensions. While this
feature doesn't add new functionality, it enables future support for
Guarded Control Stacks (GCS) and Permission Overlays.
- User-space support for the Armv8.8 memcpy/memset instructions.
- arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
cleanups.
- Removal of superfluous ISBs on context switch (following retrospective
architecture tightening).
- Decode the ISS2 register during faults for additional information to
help with debugging.
- KPTI clean-up/simplification of the trampoline exit code.
- Addressing several -Wmissing-prototype warnings.
- Kselftest improvements for signal handling and ptrace.
- Fix TPIDR2_EL0 restoring on sigreturn
- Clean-up, robustness improvements of the module allocation code.
- More sysreg conversions to the automatic register/bitfields
generation.
- CPU capabilities handling cleanup.
- Arm documentation updates: ACPI, ptdump.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmSZyXwACgkQa9axLQDI
XvEM3BAAkMzHGTDhNVNGLSO07PVmdzTiuoNFlfX7bktdIb+El76VhGXhHeEywTje
wAq9JIYBf/Src2HbgZLwuly8Fn2vCrhyp++bRJW82o9SiBnx91+0mH7zLf+XHiQ4
FHKZxvaE6PaDc9o8WXr+IeucPRb5W2HgH37mktxh7ShMLsxorwS94V1oL29A2mV9
t4XQY7/tdmrDKMKMuQnIr1DurNXBhJ1OKvDnSN/Zzm96JOU/QQ32N2wEE7Y0aHOh
bBzClksx2mguQqV515mySGFe5yy9NqaAfx2hTAciq+1rwbiCSjqQQmEswoUH8WLX
JNLylxADWT2qXThFe8W6uyFzEshSAoI1yKxlCGuOsQpu4sFJtR8oh8dDj5669g4Y
j0jR87r9rWm0iyYI5I+XDMxFVyuh2eFInvjtynRbj+mtS3f/SkO8fXG6Uya+I76C
UGLlBUKnLr/zHuIGN0LE/V4dYTqsi9EtHoc2Am2xCZsS9jqkxKJG8C93Zsm4GlJC
OcUtBSjW0rYJq+tLk0yhR6hbh59QbiRh05KnZsPpOKi8purlKSL9ZNPRi7TndLdm
HjHUY+vQwNIpPIb6pyK4aYZuTdGEQIsQykQ8CULiIGlHi7kc4g9029ouLc5bBAeU
mU8D62I2ztzPoYljYWNtO7K6g/Dq8c4lpsaMAJ+1Wp2iq2xBJjo=
=rNBK
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Notable features are user-space support for the memcpy/memset
instructions and the permission indirection extension.
- Support for the Armv8.9 Permission Indirection Extensions. While
this feature doesn't add new functionality, it enables future
support for Guarded Control Stacks (GCS) and Permission Overlays
- User-space support for the Armv8.8 memcpy/memset instructions
- arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
cleanups
- Removal of superfluous ISBs on context switch (following
retrospective architecture tightening)
- Decode the ISS2 register during faults for additional information
to help with debugging
- KPTI clean-up/simplification of the trampoline exit code
- Addressing several -Wmissing-prototype warnings
- Kselftest improvements for signal handling and ptrace
- Fix TPIDR2_EL0 restoring on sigreturn
- Clean-up, robustness improvements of the module allocation code
- More sysreg conversions to the automatic register/bitfields
generation
- CPU capabilities handling cleanup
- Arm documentation updates: ACPI, ptdump"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (124 commits)
kselftest/arm64: Add a test case for TPIDR2 restore
arm64/signal: Restore TPIDR2 register rather than memory state
arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
Documentation/arm64: Add ptdump documentation
arm64: hibernate: remove WARN_ON in save_processor_state
kselftest/arm64: Log signal code and address for unexpected signals
docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
arm64/fpsimd: Exit streaming mode when flushing tasks
docs: perf: Add new description for HiSilicon UC PMU
drivers/perf: hisi: Add support for HiSilicon UC PMU driver
drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
perf/arm-cmn: Add sysfs identifier
perf/arm-cmn: Revamp model detection
perf/arm_dmc620: Add cpumask
arm64: mm: fix VA-range sanity check
arm64/mm: remove now-superfluous ISBs from TTBR writes
Documentation/arm64: Update ACPI tables from BBR
Documentation/arm64: Update references in arm-acpi
Documentation/arm64: Update ARM and arch reference
...
Clang's kernel Control Flow Integrity (kCFI) is a compiler-based
security mitigation that ensures the target of an indirect function call
matches the expected type of the call and trapping if they do not match
exactly. The warning -Wincompatible-function-pointer-types-strict aims
to catch these issues at compile time, which reveals:
drivers/hwtracing/coresight/coresight-dummy.c:53:12: error: incompatible function pointer types initializing 'int (*)(struct coresight_device *, struct perf_event *, enum cs_mode)' with an expression of type 'int (struct coresight_device *, struct perf_event *, u32)' (aka 'int (struct coresight_device *, struct perf_event *, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
53 | .enable = dummy_source_enable,
| ^~~~~~~~~~~~~~~~~~~
drivers/hwtracing/coresight/coresight-dummy.c:62:12: error: incompatible function pointer types initializing 'int (*)(struct coresight_device *, enum cs_mode, void *)' with an expression of type 'int (struct coresight_device *, u32, void *)' (aka 'int (struct coresight_device *, unsigned int, void *)') [-Werror,-Wincompatible-function-pointer-types-strict]
62 | .enable = dummy_sink_enable,
| ^~~~~~~~~~~~~~~~~
2 errors generated.
Commit 9fa3682869 ("coresight: Use enum type for cs_mode wherever
possible") updated the type of the mode parameter in the prototype but
this driver was not introduced until commit 9d3ba0b6c0 ("Coresight:
Add coresight dummy driver") and 'int' is ABI compatible with 'enum
cs_mode', so there is no warning from regular
-Wincompatible-function-pointer-types.
Adjust the type of the mode parameter in the callback implementations in
the coresight dummy driver to match the prototype, clearing up the
warning and avoiding kCFI failures at runtime.
Fixes: 9d3ba0b6c0 ("Coresight: Add coresight dummy driver")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230616-coresight-dummy-fix-kcfi-warnings-v1-1-c55c64f8f0f5@kernel.org
Some Coresight devices that kernel don't have permission to access or
configure. For these devices, a dummy driver is needed to register them as
Coresight devices. The module may also be used to define components that
may not have any programming interfaces, so that paths can be created
in the driver. It provides Coresight API for operations on dummy devices,
such as enabling and disabling them. It also provides the Coresight dummy
sink/source paths for debugging.
Signed-off-by: Hao Zhang <quic_hazha@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230602084149.40031-2-quic_hazha@quicinc.com
This converts TRBLIMITR_EL1 register to automatic generation without
causing any functional change.
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230614065949.146187-9-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Instead of adding the PIDs forever to the list for the new CPUs, let us detect
a component to be ETMv4 based on the CoreSight CID, DEVTYPE=PE_TRACE and
DEVARCH=ETMv4. This is already done for some of the ETMs. We can extend the PID
matching to match the PIDR2:JEDEC, BIT[3], which must be 1 (RAO) always.
Link: https://lkml.kernel.org/r/20230317030501.1811905-1-anshuman.khandual@arm.com
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: frowand.list@gmail.com
Cc: linux@armlinux.org.uk
Cc: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
[ Fixed typo in the description RA0 => RAO ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230605133031.1827626-1-suzuki.poulose@arm.com
etm4_remove_dev() returned zero unconditionally. Make it return void
instead, which makes it clear in the callers that there is no error to
handle. Simplify etm4_remove_platform_dev() accordingly.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230518201629.260672-1-u.kleine-koenig@pengutronix.de
The CTI module has some hard coded refcounting code that has a leak.
For example running perf and then trying to unload it fails:
perf record -e cs_etm// -a -- ls
rmmod coresight_cti
rmmod: ERROR: Module coresight_cti is in use
The coresight core already handles references of devices in use, so by
making CTI a normal helper device, we get working refcounting for free.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-14-james.clark@arm.com
Currently CATU is the only helper device, and its enable and disable
calls are hard coded. To allow more helper devices to be added in a
generic way, remove these hard coded calls and just enable and disable
all helper devices.
This has to apply to helpers adjacent to the path, because they will
never be in the path. CATU was already discovered in this way, so
there is no change there.
One change that is needed is for CATU to call back into ETR to allocate
the buffer. Because the enable call was previously hard coded, it was
done at a point where the buffer was already allocated, but this is no
longer the case.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-13-james.clark@arm.com
When CATU is moved to the generic enable/disable path system in the
next commit, it will need to call into ETR and get it to pre-allocate
its buffer so add a function for it.
No functional changes
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-12-james.clark@arm.com
This removes the need to do an additional lookup for the total number
of ports used and also removes the need to allocate an array of
refcounts which is just another representation of a connection array.
This was only used for link type devices, for regular devices a single
refcount on the coresight device is used.
There is a both an input and output refcount in case two link type
devices are connected together so that they don't overwrite each other's
counts.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-11-james.clark@arm.com
This will allow CATU to get its associated ETR in a generic way where
currently the enable path has some hard coded searches which avoid
the need to store input connections.
This also means that the full search for connected devices on removal
can be replaced with a loop through only the input and output devices.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-10-james.clark@arm.com
There is some duplication between coresight_fixup_device_conns() and
coresight_fixup_orphan_conns(). They both do the same thing except for
the fact that coresight_fixup_orphan_conns() can't handle iterating over
itself.
By making it able to handle fixing up it's own connections the other
function can be removed.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-9-james.clark@arm.com
This will allow the same connection object to be referenced via the
input connection list in a later commit rather than duplicating them.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-8-james.clark@arm.com
Add a function for adding connections dynamically. This also removes
the 1:1 mapping between port number and the index into the connections
array. The only place this mapping was used was in the warning for
duplicate output ports, which has been replaced by a search. Other
uses of the port number already use the port member variable.
Being able to dynamically add connections will allow other devices like
CTI to re-use the connection mechanism despite not having explicit
connections described in the DT.
The connections array is now no longer sparse, so child_fwnode doesn't
need to be checked as all connections have a target node. Because the
array is no longer sparse, the high in and out port numbers are required
for the refcount arrays. But these will also be removed in a later
commit when the refcount is made a property of the connection.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-7-james.clark@arm.com
When input connections are added they will use the same connection
object as the output so parent and child could be misinterpreted. Making
the direction unambiguous in the names should improve readability.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-6-james.clark@arm.com
Rename to avoid confusion between port number and the index in the
connection array. The port number is already stored in the connection,
and in a later commit the connection array will be appended to, so
the length of it will no longer reflect the number of ports.
No functional changes.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-5-james.clark@arm.com
conns is actually for output connections. Change the name to make it
clearer and so that we can add input connections later.
No functional changes.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-4-james.clark@arm.com
mode is stored as a local_t, but it is also passed around a lot as a
plain u32, so use the correct type wherever local_t isn't currently
used. This helps a little bit with readability.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-3-james.clark@arm.com
child_fwnode should be a read only property based on the DT or ACPI. If
it's cleared on the parent device when a child is unloaded, then when
the child is loaded again the connection won't be remade.
child_dev should be cleared instead which signifies that the connection
should be remade when the child_fwnode registers a new coresight_device.
Similarly the reference count shouldn't be decremented as long as the
parent device exists. The correct place to drop the reference is in
coresight_release_platform_data() which is already done.
Reproducible on Juno with the following steps:
# load all coresight modules.
$ cd /sys/bus/coresight/devices/
$ echo 1 > tmc_etr0/enable_sink
$ echo 1 > etm0/enable_source
# Works fine ^
$ echo 0 > etm0/enable_source
$ rmmod coresight-funnel
$ modprobe coresight-funnel
$ echo 1 > etm0/enable_source
-bash: echo: write error: Invalid argument
Fixes: 37ea1ffddf ("coresight: Use fwnode handle instead of device names")
Fixes: 2af89ebacf ("coresight: Clear the connection field properly")
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425143542.2305069-2-james.clark@arm.com
Error handler for etm_setup_aux can not release coresight path because
cpu mask was cleared when coresight_trace_id_get_cpu_id failed.
Call coresight_release_path function explicitly when alloc trace id filed.
Fixes: 4ff1fdb412 ("coresight: perf: traceid: Add perf ID allocation and notifiers")
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230425032416.125542-1-tianruidong@linux.alibaba.com
This code generates a Smatch warning:
drivers/hwtracing/coresight/coresight-tmc-etr.c:947 tmc_etr_buf_insert_barrier_packet()
error: uninitialized symbol 'bufp'.
The problem is that if tmc_sg_table_get_data() returns -EINVAL, then
when we test if "len < CORESIGHT_BARRIER_PKT_SIZE", the negative "len"
value is type promoted to a high unsigned long value which is greater
than CORESIGHT_BARRIER_PKT_SIZE. Fix this bug by adding an explicit
check for error codes.
Fixes: 75f4e3619f ("coresight: tmc-etr: Add transparent buffer management")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/7d33e244-d8b9-4c27-9653-883a13534b01@kili.mountain
This is a relatively smaller update for CoreSight tracing subsystem targeting
v6.4, with the following changes:
- Removing Mathieu Poirier as MAINTAINER for the subsystem, with updates to
CREDITS for his contributions.
- Fix CoreSight ETM PMU to set the module field
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuFy0byloRoXZHaWBxcXRZPKyBqEFAmQ+yC4ACgkQxcXRZPKy
BqGyfA//bHy5yKth4vNc8Z53bId9B5W1nLCq3jd9eCn242lCpu8VUW/qlSFsFvmG
YDVcSLO0XPJnwg+IhokA6xNaDKWgdHGYxNDW1YaCgm1Br0RLnz0gOi0bsg+cOeND
X7cZmsgCRm3tAXgWVFszkDRRAla1iB/SgEPCB+d+l0qQNick8xcsGNtrJqn3d9Hr
6wI88sk0RU46Y6ovFKtd8XphNcZjmAhccFE7t7hF7IuD2Pgt7x0+Uyia92vFOJax
8oXnhOd43UxThYZjBYhmcIlPw9/nU+G5pZd4QKzKnexvtBgFpt6AXNaC6iLcUH4I
tDRWR0NmWtJoOWP2a8pbh0wGPdiM4Qf9UEctVZMihT9tXYJ8nMT/l2DNlIq6BlbO
rRDlQR93cGDy+YBRScimK1bW24gKVvUACwK48ubSsC4jZCsHithkZcXZ6GgUFAft
9zk/HLbTGlgXKxqA8TEdQeoV0VBjpjvro9Ag1XaWV0W/tyDTumn4T7HzhlhFSqPw
hkL0sXMwsR42hUaBfoD5ccerMsKh00ufI4a4RRPw0NOA66QMdKvUuIJowWBQCFuo
yrhWsuxRGbXL/wCxZF2oNzQ9PHTlZieMDkU2MU3nuNGlmQ6oT8Pd1/PaZFRZbmoF
TeUuvXHXIJQUuqNv9+a5kQeIJW/ZCsJArhKUB0RwD9KQ+lELTzA=
=YYDU
-----END PGP SIGNATURE-----
Merge tag 'coresight-next-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next
Suzuki writes:
coresight: Updates for v6.4
This is a relatively smaller update for CoreSight tracing subsystem targeting
v6.4, with the following changes:
- Removing Mathieu Poirier as MAINTAINER for the subsystem, with updates to
CREDITS for his contributions.
- Fix CoreSight ETM PMU to set the module field
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
* tag 'coresight-next-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux:
coresight: etm_pmu: Set the module field
MAINTAINERS: Remove Mathieu Poirier as coresight maintainer
struct pmu::module must be set to the module owning the PMU driver.
Set this for the coresight etm_pmu.
Fixes: 8e264c52e1 ("coresight: core: Allow the coresight core driver to be built as a module")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230405094922.667834-1-suzuki.poulose@arm.com
CoreSight ETM4x architecture clearly provides ways to identify a device
via registers in the "Management" class, TRCDEVARCH and TRCDEVTYPE. These
registers can be accessed without the Trace domain being powered on.
We additionally added TRCIDR1 as fallback in order to cover for any
ETMs that may not have implemented TRCDEVARCH. So far, nobody has
reported hitting a WARNING we placed to catch such systems.
Also, more importantly it is problematic to access TRCIDR1, which is a
"Trace" register via MMIO access, without clearing the OSLK. But we cannot
mess with the OSLK until we know for sure that this is an ETMv4 device.
Thus, this kind of creates a chicken and egg problem unnecessarily for
systems "which are compliant" to the ETMv4 architecture.
Let us remove the TRCIDR1 fall back check and rely only on TRCDEVARCH.
Fixes: 8b94db1eda ("coresight: etm4x: Use TRCDEVARCH for component discovery")
Cc: stable@vger.kernel.org
Reported-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
Link: https://lore.kernel.org/all/143540e5623d4c7393d24833f2b80600d8d745d2.1677881753.git.scclevenger@os.amperecomputing.com/
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230321104530.1547136-1-suzuki.poulose@arm.com
If TMC ETR is enabled without being ready, in later use we may
see AXI bus errors caused by accessing invalid addresses.
Signed-off-by: Yabin Cui <yabinc@google.com>
[ Tweak error message ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230127231001.1920947-1-yabinc@google.com
devm_ioremap_resource() never returns NULL pointer, it
will return ERR_PTR() when it fails, so replace the check
with IS_ERR().
Fixes: 5b7916625c ("Coresight: Add TPDA link driver")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
[ Fix return value to the PTR_ERR(base) ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230129084246.537694-1-yangyingliang@huawei.com
'remove' callbacks get called whenever a device is unbound from
the driver, which can get triggered from user space.
Putting it into the __exit section means that the function gets
dropped in for built-in drivers, as pointed out by this build
warning:
`tpda_remove' referenced in section `.data' of drivers/hwtracing/coresight/coresight-tpda.o: defined in discarded section `.exit.text' of drivers/hwtracing/coresight/coresight-tpda.o
`tpdm_remove' referenced in section `.data' of drivers/hwtracing/coresight/coresight-tpdm.o: defined in discarded section `.exit.text' of drivers/hwtracing/coresight/coresight-tpdm.o
Fixes: 5b7916625c ("Coresight: Add TPDA link driver")
Fixes: b3c71626a9 ("Coresight: Add coresight TPDM source driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230126163530.3495413-1-arnd@kernel.org
With the dynamic traceid allocation scheme in, we output the
AUX_OUTPUT_HWID packet every time event->start() is called.
This could cause too many such records in the perf.data,
while only one per CPU throughout the life time of
the event is required. Make sure we only output it once.
Before this patch:
$ perf report -D | grep OUTPUT_HW_ID
...
AUX_OUTPUT_HW_ID events: 55 (18.3%)
After this patch:
$ perf report -D | grep OUTPUT_HW_ID
...
AUX_OUTPUT_HW_ID events: 5 ( 1.9%)
Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20230120103434.864318-1-suzuki.poulose@arm.com
Kernel test robot reports:
drivers/hwtracing/coresight/coresight-core.c:1176:7: warning: variable
'hash' is used uninitialized whenever switch case is taken
[-Wsometimes-uninitialized]
case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC:
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/hwtracing/coresight/coresight-core.c:1195:24: note: uninitialized
use occurs here
idr_remove(&path_idr, hash);
^~~~
Fix this by moving the usage of the hash variable to where it actually
should have been.
Cc: Mao Jinlong <quic_jinlmao@quicinc.com>
Link: https://lkml.kernel.org/r/202301211339.9mU0dccO-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lkml.kernel.org/r/20230123164700.1074064-1-suzuki.poulose@arm.com
Integration test for tpdm can help to generate the data for
verification of the topology during TPDM software bring up.
Sample:
echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
cat /dev/tmc_etf0 > /data/etf-tpdm0.bin
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230117145708.16739-6-quic_jinlmao@quicinc.com
TPDM serves as data collection component for various dataset types.
DSB(Discrete Single Bit) is one of the dataset types. DSB subunit
can be enabled for data collection by writing 1 to the first bit of
DSB_CR register. This change is to add enable/disable function for
DSB dataset by writing DSB_CR register.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230117145708.16739-5-quic_jinlmao@quicinc.com
Add driver to support Coresight device TPDM (Trace, Profiling and
Diagnostics Monitor). TPDM is a monitor to collect data from
different datasets. This change is to add probe/enable/disable
functions for tpdm source.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230120095301.30792-1-quic_jinlmao@quicinc.com
Except stm, there could be other sources which are not associated
with cpus. Use IDR to store and search these sources' paths.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230117145708.16739-2-quic_jinlmao@quicinc.com
Adds in a number of pr_debug macros to allow the debugging and test of
the trace ID allocation system.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-15-mike.leach@linaro.org
Use the perf_report_aux_output_id() call to output the CoreSight trace ID
and associated CPU as a PERF_RECORD_AUX_OUTPUT_HW_ID record in the
perf.data file.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-14-mike.leach@linaro.org
CoreSight sources provide a callback (.trace_id) in the standard source
ops which returns the ID to the core code. This was used to check that
sources all had a unique Trace ID.
Uniqueness is now gauranteed by the Trace ID allocation system, and the
check code has been removed from the core.
This patch removes the unneeded and unused .trace_id source ops
from the ops structure and implementations in etm3x, etm4x and stm.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-8-mike.leach@linaro.org
Use the TraceID API to allocate ETM trace IDs dynamically.
As with the etm4x we allocate on enable / disable for perf,
allocate on enable / reset for sysfs.
Additionally we allocate on sysfs file read as both perf and sysfs
can read the ID before enabling the hardware.
Remove sysfs option to write trace ID - which is inconsistent with
both the dynamic allocation method and the fixed allocation method
previously used.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-7-mike.leach@linaro.org
The trace ID API is now used to allocate trace IDs for ETM4.x / ETE
devices.
For perf sessions, these will be allocated on enable, and released on
disable.
For sysfs sessions, these will be allocated on enable, but only released
on reset. This allows the sysfs session to interrogate the Trace ID used
after the session is over - maintaining functional consistency with the
previous allocation scheme.
The trace ID will also be allocated on read of the mgmt/trctraceid file.
This ensures that if perf or sysfs read this before enabling trace, the
value will be the one used for the trace session.
Trace ID initialisation is removed from the _probe() function.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-6-mike.leach@linaro.org
Updates the STM driver to use the trace ID allocation API.
This uses the _system_id calls to allocate an ID on device poll,
and release on device remove.
The sysfs access to the STMTRACEIDR register has been changed from RW
to RO. Having this value as writable is not appropriate for the new
Trace ID scheme - and had potential to cause errors in the previous
scheme if values clashed with other sources.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-5-mike.leach@linaro.org
Adds in calls to allocate and release Trace ID for the CPUs in use
by the perf session.
Adds in notifier calls to the trace ID allocator that perf
events are starting and stopping.
This ensures that Trace IDs associated with CPUs remain the same
throughout the perf session, and are only released when all perf
sessions are complete.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-4-mike.leach@linaro.org
The checks for sources to have unique IDs has been removed - this is now
guaranteed by the ID allocation mechanisms, and inappropriate where
multiple ID maps are in use in larger systems
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-3-mike.leach@linaro.org
The existing mechanism to assign Trace ID values to sources is limited
and does not scale for larger multicore / multi trace source systems.
The API introduces functions that reserve IDs based on availabilty
represented by a coresight_trace_id_map structure. This records the
used and free IDs in a bitmap.
CPU bound sources such as ETMs use the coresight_trace_id_get_cpu_id
coresight_trace_id_put_cpu_id pair of functions. The API will record
the ID associated with the CPU. This ensures that the same ID will be
re-used while perf events are active on the CPU. The put_cpu_id function
will pend release of the ID until all perf cs_etm sessions are complete.
For backward compatibility the functions will attempt to use the same
CPU IDs as the legacy system would have used if these are still available.
Non-cpu sources, such as the STM can use coresight_trace_id_get_system_id /
coresight_trace_id_put_system_id.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
[ Fix checkpatch warning in drivers/hwtracing/coresight/coresight-trace-id.c ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230116124928.5440-2-mike.leach@linaro.org
platform_get_resource() returns NULL pointer not PTR_ERR(), replace
the IS_ERR() check with NULL pointer check.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230118074920.1772141-1-yangyingliang@huawei.com
Add driver for UltraSoc SMB(System Memory Buffer) device.
SMB provides a way to buffer messages from ETM, and store
these "CPU instructions trace" in system memory.
The SMB device is identifier as ACPI HID "HISI03A1". Device
system memory address resources are allocated using the _CRS
method and buffer modes is the circular buffer mode.
SMB is developed by UltraSoc technology, which is acquired by
Siemens, and we still use "UltraSoc" to name driver.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Tested-by: JunHao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230114101302.62320-2-hejunhao3@huawei.com
enable_req_count is only ever accessed inside the spinlock, so to avoid
confusion that there are concurrent accesses and simplify the code,
change it to an int.
One access outside of the spinlock is in enable_show() which appears to
allow partially written data to be displayed between enable_req_count,
powered and enabled so move this one inside the spin lock too.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230110110736.2709917-4-james.clark@arm.com
In commit 6746eae4bb ("coresight: cti: Fix hang in cti_disable_hw()")
PM runtime calls are removed from cti_enable_hw/cti_disable_hw. When
enabling CTI by writing enable sysfs node, clock for accessing CTI
register won't be enabled. Device will crash due to register access
issue. Add PM runtime call in enable_store to fix this issue.
Fixes: 6746eae4bb ("coresight: cti: Fix hang in cti_disable_hw()")
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
[Change to only call pm_runtime_put if a disable happened]
Tested-by: Jinlong Mao <quic_jinlmao@quicinc.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230110110736.2709917-3-james.clark@arm.com
Writing 0 to the enable control repeatedly results in a negative value
for enable_req_count. After this, writing 1 to the enable control
appears to not work until the count returns to positive.
Change it so that it's impossible for enable_req_count to be < 0.
Return an error to indicate that the disable request was invalid.
Fixes: 835d722ba1 ("coresight: cti: Initial CoreSight CTI Driver")
Tested-by: Jinlong Mao <quic_jinlmao@quicinc.com>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230110110736.2709917-2-james.clark@arm.com
The TRCSEQRSTEVR and TRCSEQSTR registers are not implemented if the
TRCIDR5.NUMSEQSTATE == 0. Skip accessing the registers in such cases.
Fixes: 2e1cdfe184 ("coresight-etm4x: Adding CoreSight ETM4x driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230114091632.60095-1-hejunhao3@huawei.com
cpuhp_state_add_instance() and cpuhp_state_remove_instance() should
be used in pairs. Or there will lead to the warn on
cpuhp_remove_multi_state() since the cpuhp_step list is not empty.
The following is the error log with 'rmmod coresight-trbe':
Error: Removing state 215 which has instances left.
Call trace:
__cpuhp_remove_state_cpuslocked+0x144/0x160
__cpuhp_remove_state+0xac/0x100
arm_trbe_device_remove+0x2c/0x60 [coresight_trbe]
platform_remove+0x34/0x70
device_remove+0x54/0x90
device_release_driver_internal+0x1e4/0x250
driver_detach+0x5c/0xb0
bus_remove_driver+0x64/0xc0
driver_unregister+0x3c/0x70
platform_driver_unregister+0x20/0x30
arm_trbe_exit+0x1c/0x658 [coresight_trbe]
__arm64_sys_delete_module+0x1ac/0x24c
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x58/0x1a0
do_el0_svc+0x38/0xd0
el0_svc+0x2c/0xc0
el0t_64_sync_handler+0x1ac/0x1b0
el0t_64_sync+0x19c/0x1a0
---[ end trace 0000000000000000 ]---
Fixes: 3fbf7f011f ("coresight: sink: Add TRBE driver")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Yang Shen <shenyang39@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20221122090355.23533-1-shenyang39@huawei.com
etm4x devices cannot be successfully probed when their CPU is offline.
For example, when booting with maxcpus=n, ETM probing will fail on
CPUs >n, and the probing won't be reattempted once the CPUs come
online. This will leave those CPUs unable to make use of ETM.
This change adds a mechanism to delay the probing if the corresponding
CPU is offline, and to try it again when the CPU comes online.
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220705145935.24679-1-tamas.zsoldos@arm.com
cti_enable_hw() and cti_disable_hw() are called from an atomic context
so shouldn't use runtime PM because it can result in a sleep when
communicating with firmware.
Since commit 3c66563378 ("Revert "firmware: arm_scmi: Add clock
management to the SCMI power domain""), this causes a hang on Juno when
running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put()
was called with the wrong device until commit 692c9a499b ("coresight:
cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022
Call trace:
dump_backtrace+0x134/0x140
show_stack+0x20/0x58
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
__might_resched+0x180/0x228
__might_sleep+0x50/0x88
__pm_runtime_resume+0xac/0xb0
cti_enable+0x44/0x120
coresight_control_assoc_ectdev+0xc0/0x150
coresight_enable_path+0xb4/0x288
etm_event_start+0x138/0x170
etm_event_add+0x48/0x70
event_sched_in.isra.122+0xb4/0x280
merge_sched_in+0x1fc/0x3d0
visit_groups_merge.constprop.137+0x16c/0x4b0
ctx_sched_in+0x114/0x1f0
perf_event_sched_in+0x60/0x90
ctx_resched+0x68/0xb0
perf_event_exec+0x138/0x508
begin_new_exec+0x52c/0xd40
load_elf_binary+0x6b8/0x17d0
bprm_execve+0x360/0x7f8
do_execveat_common.isra.47+0x218/0x238
__arm64_sys_execve+0x48/0x60
invoke_syscall+0x4c/0x110
el0_svc_common.constprop.4+0xfc/0x120
do_el0_svc+0x34/0xc0
el0_svc+0x40/0x98
el0t_64_sync_handler+0x98/0xc0
el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not
needed here because it must have already been done when building the
path for a trace.
Fixes: 835d722ba1 ("coresight: cti: Initial CoreSight CTI Driver")
Cc: stable <stable@kernel.org>
Reported-by: Aishwarya TCV <Aishwarya.TCV@arm.com>
Reported-by: Cristian Marussi <Cristian.Marussi@arm.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Tested-by: Mike Leach <mike.leach@linaro.org>
[ Fix build warnings ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20221025131032.1149459-1-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cti_enable_hw() and cti_disable_hw() are called from an atomic context
so shouldn't use runtime PM because it can result in a sleep when
communicating with firmware.
Since commit 3c66563378 ("Revert "firmware: arm_scmi: Add clock
management to the SCMI power domain""), this causes a hang on Juno when
running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put()
was called with the wrong device until commit 692c9a499b ("coresight:
cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022
Call trace:
dump_backtrace+0x134/0x140
show_stack+0x20/0x58
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
__might_resched+0x180/0x228
__might_sleep+0x50/0x88
__pm_runtime_resume+0xac/0xb0
cti_enable+0x44/0x120
coresight_control_assoc_ectdev+0xc0/0x150
coresight_enable_path+0xb4/0x288
etm_event_start+0x138/0x170
etm_event_add+0x48/0x70
event_sched_in.isra.122+0xb4/0x280
merge_sched_in+0x1fc/0x3d0
visit_groups_merge.constprop.137+0x16c/0x4b0
ctx_sched_in+0x114/0x1f0
perf_event_sched_in+0x60/0x90
ctx_resched+0x68/0xb0
perf_event_exec+0x138/0x508
begin_new_exec+0x52c/0xd40
load_elf_binary+0x6b8/0x17d0
bprm_execve+0x360/0x7f8
do_execveat_common.isra.47+0x218/0x238
__arm64_sys_execve+0x48/0x60
invoke_syscall+0x4c/0x110
el0_svc_common.constprop.4+0xfc/0x120
do_el0_svc+0x34/0xc0
el0_svc+0x40/0x98
el0t_64_sync_handler+0x98/0xc0
el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not
needed here because it must have already been done when building the
path for a trace.
Fixes: 835d722ba1 ("coresight: cti: Initial CoreSight CTI Driver")
Reported-by: Aishwarya TCV <Aishwarya.TCV@arm.com>
Reported-by: Cristian Marussi <Cristian.Marussi@arm.com>
Suggested-by: Suzuki Poulose <Suzuki.Poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Tested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20221005131452.1506328-1-james.clark@arm.com
With lockdeps enabled, we get the following warning:
======================================================
WARNING: possible circular locking dependency detected
------------------------------------------------------
kworker/u12:1/53 is trying to acquire lock:
ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
but task is already holding lock:
ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (ect_mutex){+.+.}-{4:4}:
__mutex_lock_common+0xd8/0xe60
mutex_lock_nested+0x44/0x50
cti_add_assoc_to_csdev+0x4c/0x184
coresight_register+0x2f0/0x314
tmc_probe+0x33c/0x414
-> #0 (coresight_mutex){+.+.}-{4:4}:
__lock_acquire+0x1a20/0x32d0
lock_acquire+0x160/0x308
__mutex_lock_common+0xd8/0xe60
mutex_lock_nested+0x44/0x50
coresight_set_assoc_ectdev_mutex+0x3c/0x5c
cti_update_conn_xrefs+0x6c/0xf8
cti_probe+0x33c/0x394
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(ect_mutex);
lock(coresight_mutex);
lock(ect_mutex);
lock(coresight_mutex);
*** DEADLOCK ***
4 locks held by kworker/u12:1/53:
#0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
#1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
#2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
#3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
To fix the same, call cti_add_assoc_to_csdev without the holding
coresight_mutex and confine the locking while setting the associated
ect / cti device using coresight_set_assoc_ectdev_mutex().
Fixes: 177af8285b ("coresight: cti: Enable CTI associated with devices")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220721130329.3787211-1-sudeep.holla@arm.com
Here is the large set of char/misc and other small driver subsystem
changes for 6.1-rc1. Loads of different things in here:
- IIO driver updates, additions, and changes. Probably the largest
part of the diffstat
- habanalabs driver update with support for new hardware and features,
the second largest part of the diff.
- fpga subsystem driver updates and additions
- mhi subsystem updates
- Coresight driver updates
- gnss subsystem updates
- extcon driver updates
- icc subsystem updates
- fsi subsystem updates
- nvmem subsystem and driver updates
- misc driver updates
- speakup driver additions for new features
- lots of tiny driver updates and cleanups
All of these have been in the linux-next tree for a while with no
reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0GQmA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylyVQCeNJjZ3hy+Wz8WkPSY+NkehuIhyCIAnjXMOJP8
5G/JQ+rpcclr7VOXlS66
=zVkU
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the large set of char/misc and other small driver subsystem
changes for 6.1-rc1. Loads of different things in here:
- IIO driver updates, additions, and changes. Probably the largest
part of the diffstat
- habanalabs driver update with support for new hardware and
features, the second largest part of the diff.
- fpga subsystem driver updates and additions
- mhi subsystem updates
- Coresight driver updates
- gnss subsystem updates
- extcon driver updates
- icc subsystem updates
- fsi subsystem updates
- nvmem subsystem and driver updates
- misc driver updates
- speakup driver additions for new features
- lots of tiny driver updates and cleanups
All of these have been in the linux-next tree for a while with no
reported issues"
* tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
w1: Split memcpy() of struct cn_msg flexible array
spmi: pmic-arb: increase SPMI transaction timeout delay
spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
spmi: pmic-arb: check apid against limits before calling irq handler
spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
spmi: pmic-arb: handle spurious interrupt
spmi: pmic-arb: add a print in cleanup_irq
drivers: spmi: Directly use ida_alloc()/free()
MAINTAINERS: add TI ECAP driver info
counter: ti-ecap-capture: capture driver support for ECAP
Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
dt-bindings: counter: add ti,am62-ecap-capture.yaml
counter: Introduce the COUNTER_COMP_ARRAY component type
counter: Consolidate Counter extension sysfs attribute creation
counter: Introduce the Count capture component
counter: 104-quad-8: Add Signal polarity component
counter: Introduce the Signal polarity component
counter: interrupt-cnt: Implement watch_validate callback
...
After the conversion to automatically generating the ID_AA64DFR0_EL1
definition names, the build fails in a few different places because some
of the definitions were not changed to their new names along the way.
Update the names to resolve the build errors.
Fixes: c0357a73fa ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220919160928.3905780-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When building without CONFIG_CORESIGHT_CTI_INTEGRATION_REGS, there is a
warning about coresight_cti_reg_store() being unused in the file:
drivers/hwtracing/coresight/coresight-cti-sysfs.c:184:16: warning: 'coresight_cti_reg_store' defined but not used [-Wunused-function]
184 | static ssize_t coresight_cti_reg_store(struct device *dev,
| ^~~~~~~~~~~~~~~~~~~~~~~
This is expected as coresight_cti_reg_store() is only used in the
coresight_cti_reg_rw macro, which is only used in a block guarded by
CONFIG_CORESIGHT_CTI_INTEGRATION_REGS. Mark coresight_cti_reg_store() as
__maybe_unused to clearly indicate that the function may be unused
depending on the configuration.
Fixes: fbca79e554 ("coresight: cti-sysfs: Re-use same functions for similar sysfs register accessors")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20220901195055.1932340-1-nathan@kernel.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
New csdev_access functions were added as part of the previous
refactor. In order to make them more consistent with the
existing ones, change any signed offset types to be unsigned.
Now that they are unsigned, stop using hi_off = -1 to signify
a single 32bit access. Instead just call the existing 32bit
accessors. This is also applied to other parts of the codebase,
and the coresight_{read,write}_reg_pair() functions can be
deleted.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220830172614.340962-6-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Currently each accessor macro creates an identical function which wastes
space in the text area and pollutes the ftrace function name list.
Change it so that the same function is used, but the register to access
is passed in as parameter rather than baked into each function.
Note that only the single accessor is used here and not
csdev_access_relaxed_read_pair() like in the previous commit, so
so a single unsigned offset value is stored instead.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220830172614.340962-5-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Currently each accessor macro creates an identical function which wastes
space in the text area and pollutes the ftrace function names. Change it
so that the same function is used, but the register to access is passed
in as parameter rather than baked into each function.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220830172614.340962-4-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The coresight_device struct is available in the sysfs accessor, and this
contains a csdev_access struct which can be used to access registers.
Use this instead of passing in the type of each drvdata so that a common
function can be shared between all the cs drivers.
No functional changes.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220830172614.340962-3-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The ability to use a custom function in this sysfs show function isn't
used so remove it.
No functional changes.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220830172614.340962-2-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Add a new sysfs interface in /sys/bus/coresight/devices/etm<N>/ts_source
indicating the configured timestamp source when the ETM device driver
was probed.
The perf tool will use this information to detect if the trace data
timestamp matches the kernel time, enabling correlation of CoreSight
trace with perf events.
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20220823160650.455823-2-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic
version of the MMIO accessors, which is intended to help
understand problems with device drivers and has been part
of Qualcomm's vendor kernels for many years.
- A patch from Sebastian Siewior to rework the handling of
IRQ stacks in softirqs across architectures, which is
needed for enabling PREEMPT_RT.
- The last patch to remove the CONFIG_VIRT_TO_BUS option and
some of the code behind that, after the last users of this
old interface made it in through the netdev, scsi, media and
staging trees.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
=aTR0
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic version
of the MMIO accessors, which is intended to help understand
problems with device drivers and has been part of Qualcomm's vendor
kernels for many years
- A patch from Sebastian Siewior to rework the handling of IRQ stacks
in softirqs across architectures, which is needed for enabling
PREEMPT_RT
- The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
the code behind that, after the last users of this old interface
made it in through the netdev, scsi, media and staging trees"
* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
uapi: asm-generic: fcntl: Fix typo 'the the' in comment
arch/*/: remove CONFIG_VIRT_TO_BUS
soc: qcom: geni: Disable MMIO tracing for GENI SE
serial: qcom_geni_serial: Disable MMIO tracing for geni serial
asm-generic/io: Add logging support for MMIO accessors
KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
lib: Add register read/write tracing support
drm/meson: Fix overflow implicit truncation warnings
irqchip/tegra: Fix overflow implicit truncation warnings
coresight: etm4x: Use asm-generic IO memory barriers
arm64: io: Use asm-generic high level MMIO accessors
arch/*: Disable softirq stacks on PREEMPT_RT.
When the following configs are enabled:
* CORESIGHT
* CORESIGHT_SOURCE_ETM4X
* UBSAN
* UBSAN_TRAP
Clang fails assemble the kernel with the error:
<instantiation>:1:7: error: expected constant expression in '.inst' directive
.inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
^
drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in
macro instantiation
etm4x_relaxed_read32(csa, TRCCNTVRn(i));
^
drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from
macro 'etm4x_relaxed_read32'
read_etm4x_sysreg_offset((offset), false)))
^
drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded
from macro 'read_etm4x_sysreg_offset'
__val = read_etm4x_sysreg_const_offset((offset)); \
^
drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from
macro 'read_etm4x_sysreg_const_offset'
READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
^
drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from
macro 'READ_ETM4x_REG'
read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
^
arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro
'read_sysreg_s'
asm volatile(__mrs_s("%0", r) : "=r" (__val)); \
^
arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
" mrs_s " v ", " __stringify(r) "\n" \
^
Consider the definitions of TRCSSCSRn and TRCCNTVRn:
drivers/hwtracing/coresight/coresight-etm4x.h:56
#define TRCCNTVRn(n) (0x160 + (n * 4))
drivers/hwtracing/coresight/coresight-etm4x.h:81
#define TRCSSCSRn(n) (0x2A0 + (n * 4))
Where the macro parameter is expanded to i; a loop induction variable
from etm4_disable_hw.
When any compiler can determine that loops may be unrolled, then the
__builtin_constant_p check in read_etm4x_sysreg_offset() defined in
drivers/hwtracing/coresight/coresight-etm4x.h may evaluate to true. This
can lead to the expression `(0x160 + (i * 4))` being passed to
read_etm4x_sysreg_const_offset. Via the trace above, this is passed
through READ_ETM4x_REG, read_sysreg_s, and finally to __mrs_s where it
is string-ified and used directly in inline asm.
Regardless of which compiler or compiler options determine whether a
loop can or can't be unrolled, which determines whether
__builtin_constant_p evaluates to true when passed an expression using a
loop induction variable, it is NEVER safe to allow the preprocessor to
construct inline asm like:
asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
^ expected constant expression
Instead of read_etm4x_sysreg_offset() using __builtin_constant_p(), use
__is_constexpr from include/linux/const.h instead to ensure only
expressions that are valid integer constant expressions get passed
through to read_sysreg_s().
This is not a bug in clang; it's a potentially unsafe use of the macro
arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.
Link: https://github.com/ClangBuiltLinux/linux/issues/1310
Reported-by: Arnd Bergmann <arnd@kernel.org>
Reported-by: Tao Zhang <quic_taozha@quicinc.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220708231520.3958391-1-ndesaulniers@google.com
When enabled, all taken branch addresses are output, even if the branch
was because of a direct branch instruction. This enables reconstruction
of the program flow without having access to the memory image of the
code being executed.
Use bit 8 for the config option which would be the correct bit for
programming ETMv3. Although branch broadcast can't be enabled on ETMv3
because it's not in the define ETM3X_SUPPORTED_OPTIONS, using the
correct bit might help prevent future collisions or allow it to be
enabled if needed.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220511144601.2257870-2-james.clark@arm.com
The configfs system is a source of access to the config information in the
configuration and feature lists.
This can result in additional LOCKDEP issues as a result of the mutex
ordering between the config list mutex (cscfg_mutex) and the configfs
system mutexes.
As such we need to adjust how load/unload operations work to ensure correct
operation.
1) Previously the cscfg_mutex was held throughout the load/unload
operation. This is now only held during configuration list manipulations,
resulting in a multi-stage load/unload process.
2) All operations that manipulate the configfs representation of the
configurations and features are now separated out and run without the
cscfg_mutex being held. This avoids circular lock_dep issue with the
built-in configfs mutexes and semaphores
3) As the load and unload is now multi-stage, some parts under the
cscfg_mutex and others not:
i) A flag indicating a load / unload operation in progress is used to
serialise load / unload operations.
ii) activating any configuration not possible when unload is in progress.
iii) Configurations have an "available" flag set only after the last load
stage for the configuration is complete. Activation of the configuration
not possible till flag is set.
4) Following load/unload rules remain:
i) Unload prevented while any configuration is active remains.
ii) Unload in strict reverse order of load.
iii) Existing configurations can be activated while a new load operation
is underway. (by definition there can be no dependencies between an
existing configuration and a new loading one due to ii) above.)
Fixes: eb2ec49606 ("coresight: syscfg: Update load API for config loadable modules")
Reported-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220628173004.30002-3-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Any loaded configurations must be correctly unloaded on coresight module
exit, or issues can arise with nested locking in the configfs directory
code if built with CONFIG_LOCKDEP.
Prior to this patch, the preloaded configuration configfs directory entries
were being unloaded by the recursive code in
configfs_unregister_subsystem().
However, when built with CONFIG_LOCKDEP, this caused a nested lock warning,
which was not mitigated by the LOCKDEP dependent code in fs/configfs/dir.c
designed to prevent this, due to the different directory levels for the
root of the directory being removed.
As the preloaded (and all other) configurations are registered after
configfs_register_subsystem(), we now explicitly unload them before the
call to configfs_unregister_subsystem().
The new routine cscfg_unload_cfgs_on_exit() iterates through the load
owner list to unload any remaining configurations that were not unloaded
by the user before the module exits. This covers both the
CSCFG_OWNER_PRELOAD and CSCFG_OWNER_MODULE owner types, and will be
extended to cover future load owner types for CoreSight configurations.
Fixes: eb2ec49606 ("coresight: syscfg: Update load API for config loadable modules")
Reported-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220628173004.30002-2-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
coresight devices track their connections (output connections) and
hold a reference to the fwnode. When a device goes away, we walk through
the devices on the coresight bus and make sure that the references
are dropped. This happens both ways:
a) For all output connections from the device, drop the reference to
the target device via coresight_release_platform_data()
b) Iterate over all the devices on the coresight bus and drop the
reference to fwnode if *this* device is the target of the output
connection, via coresight_remove_conns()->coresight_remove_match().
However, the coresight_remove_match() doesn't clear the fwnode field,
after dropping the reference, this causes use-after-free and
additional refcount drops on the fwnode.
e.g., if we have two devices, A and B, with a connection, A -> B.
If we remove B first, B would clear the reference on B, from A
via coresight_remove_match(). But when A is removed, it still has
a connection with fwnode still pointing to B. Thus it tries to drops
the reference in coresight_release_platform_data(), raising the bells
like :
[ 91.990153] ------------[ cut here ]------------
[ 91.990163] refcount_t: addition on 0; use-after-free.
[ 91.990212] WARNING: CPU: 0 PID: 461 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x144
[ 91.990260] Modules linked in: coresight_funnel coresight_replicator coresight_etm4x(-)
crct10dif_ce coresight ip_tables x_tables ipv6 [last unloaded: coresight_cpu_debug]
[ 91.990398] CPU: 0 PID: 461 Comm: rmmod Tainted: G W T 5.19.0-rc2+ #53
[ 91.990418] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb 1 2019
[ 91.990434] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 91.990454] pc : refcount_warn_saturate+0xa0/0x144
[ 91.990476] lr : refcount_warn_saturate+0xa0/0x144
[ 91.990496] sp : ffff80000c843640
[ 91.990509] x29: ffff80000c843640 x28: ffff800009957c28 x27: ffff80000c8439a8
[ 91.990560] x26: ffff00097eff1990 x25: ffff8000092b6ad8 x24: ffff00097eff19a8
[ 91.990610] x23: ffff80000c8439a8 x22: 0000000000000000 x21: ffff80000c8439c2
[ 91.990659] x20: 0000000000000000 x19: ffff00097eff1a10 x18: ffff80000ab99c40
[ 91.990708] x17: 0000000000000000 x16: 0000000000000000 x15: ffff80000abf6fa0
[ 91.990756] x14: 000000000000001d x13: 0a2e656572662d72 x12: 657466612d657375
[ 91.990805] x11: 203b30206e6f206e x10: 6f69746964646120 x9 : ffff8000081aba28
[ 91.990854] x8 : 206e6f206e6f6974 x7 : 69646461203a745f x6 : 746e756f63666572
[ 91.990903] x5 : ffff00097648ec58 x4 : 0000000000000000 x3 : 0000000000000027
[ 91.990952] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00080260ba00
[ 91.991000] Call trace:
[ 91.991012] refcount_warn_saturate+0xa0/0x144
[ 91.991034] kobject_get+0xac/0xb0
[ 91.991055] of_node_get+0x2c/0x40
[ 91.991076] of_fwnode_get+0x40/0x60
[ 91.991094] fwnode_handle_get+0x3c/0x60
[ 91.991116] fwnode_get_nth_parent+0xf4/0x110
[ 91.991137] fwnode_full_name_string+0x48/0xc0
[ 91.991158] device_node_string+0x41c/0x530
[ 91.991178] pointer+0x320/0x3ec
[ 91.991198] vsnprintf+0x23c/0x750
[ 91.991217] vprintk_store+0x104/0x4b0
[ 91.991238] vprintk_emit+0x8c/0x360
[ 91.991257] vprintk_default+0x44/0x50
[ 91.991276] vprintk+0xcc/0xf0
[ 91.991295] _printk+0x68/0x90
[ 91.991315] of_node_release+0x13c/0x14c
[ 91.991334] kobject_put+0x98/0x114
[ 91.991354] of_node_put+0x24/0x34
[ 91.991372] of_fwnode_put+0x40/0x5c
[ 91.991390] fwnode_handle_put+0x38/0x50
[ 91.991411] coresight_release_platform_data+0x74/0xb0 [coresight]
[ 91.991472] coresight_unregister+0x64/0xcc [coresight]
[ 91.991525] etm4_remove_dev+0x64/0x78 [coresight_etm4x]
[ 91.991563] etm4_remove_amba+0x1c/0x2c [coresight_etm4x]
[ 91.991598] amba_remove+0x3c/0x19c
Reproducible by: (Build all coresight components as modules):
#!/bin/sh
while true
do
for m in tmc stm cpu_debug etm4x replicator funnel
do
modprobe coresight_${m}
done
for m in tmc stm cpu_debug etm4x replicator funnel
do
rmmode coresight_${m}
done
done
Cc: stable@vger.kernel.org
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Fixes: 37ea1ffddf ("coresight: Use fwnode handle instead of device names")
Link: https://lore.kernel.org/r/20220614214024.3005275-1-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Per discussion in [1], it was decided to move to using architecture
independent/asm-generic IO memory barriers to have just one set of
them and deprecate use of arm64 specific IO memory barriers in driver
code. So replace current usage of __io_rmb()/__iowmb() in drivers to
__io_ar()/__io_bw().
[1] https://lore.kernel.org/lkml/CAK8P3a0L2tLeF1Q0+0ijUxhGNaw+Z0fyPC1oW6_ELQfn0=i4iw@mail.gmail.com/
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The panic notifier infrastructure executes registered callbacks when
a panic event happens - such callbacks are executed in atomic context,
with interrupts and preemption disabled in the running CPU and all other
CPUs disabled. That said, mutexes in such context are not a good idea.
This patch replaces a regular mutex with a mutex_trylock safer approach;
given the nature of the mutex used in the driver, it should be pretty
uncommon being unable to acquire such mutex in the panic path, hence
no functional change should be observed (and if it is, that would be
likely a deadlock with the regular mutex).
Fixes: 2227b7c746 ("coresight: add support for CPU debug module")
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220427224924.592546-10-gpiccoli@igalia.com
It is possibe that probe failure issue happens when the device
and its child_device's probe happens at the same time.
In coresight_make_links, has_conns_grp is true for parent, but
has_conns_grp is false for child device as has_conns_grp is set
to true in coresight_create_conns_sysfs_group. The probe of parent
device will fail at this condition. Add has_conns_grp check for
child device before make the links and make the process from
device_register to connection_create be atomic to avoid this
probe failure issue.
Cc: stable@vger.kernel.org
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Suggested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
Link: https://lore.kernel.org/r/20220309142206.15632-1-quic_jinlmao@quicinc.com
[ Added Cc stable ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-16-james.clark@arm.com
/* Removed extra new lines */
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-15-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-14-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-13-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-12-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. These fields already have macros
to define them so use them instead of magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-11-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-10-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-9-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-8-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-7-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-6-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-5-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-4-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-3-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This is a no-op change for style and consistency and has no effect on
the binary output by the compiler. In sysreg.h fields are defined as
the register name followed by the field name and then _MASK. This
allows for grepping for fields by name rather than using magic numbers.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20220304171913.2292458-2-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Here is the big set of char/misc and other small driver subsystem
updates for 5.18-rc1.
Included in here are merges from driver subsystems which contain:
- iio driver updates and new drivers
- fsi driver updates
- fpga driver updates
- habanalabs driver updates and support for new hardware
- soundwire driver updates and new drivers
- phy driver updates and new drivers
- coresight driver updates
- icc driver updates
Individual changes include:
- mei driver updates
- interconnect driver updates
- new PECI driver subsystem added
- vmci driver updates
- lots of tiny misc/char driver updates
There will be two merge conflicts with your tree, one in MAINTAINERS
which is obvious to fix up, and one in drivers/phy/freescale/Kconfig
which also should be easy to resolve.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYkG3fQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykNEgCfaRG8CRxewDXOO4+GSeA3NGK+AIoAnR89donC
R4bgCjfg8BWIBcVVXg3/
=WWXC
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the big set of char/misc and other small driver subsystem
updates for 5.18-rc1.
Included in here are merges from driver subsystems which contain:
- iio driver updates and new drivers
- fsi driver updates
- fpga driver updates
- habanalabs driver updates and support for new hardware
- soundwire driver updates and new drivers
- phy driver updates and new drivers
- coresight driver updates
- icc driver updates
Individual changes include:
- mei driver updates
- interconnect driver updates
- new PECI driver subsystem added
- vmci driver updates
- lots of tiny misc/char driver updates
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits)
firmware: google: Properly state IOMEM dependency
kgdbts: fix return value of __setup handler
firmware: sysfb: fix platform-device leak in error path
firmware: stratix10-svc: add missing callback parameter on RSU
arm64: dts: qcom: add non-secure domain property to fastrpc nodes
misc: fastrpc: Add dma handle implementation
misc: fastrpc: Add fdlist implementation
misc: fastrpc: Add helper function to get list and page
misc: fastrpc: Add support to secure memory map
dt-bindings: misc: add fastrpc domain vmid property
misc: fastrpc: check before loading process to the DSP
misc: fastrpc: add secure domain support
dt-bindings: misc: add property to support non-secure DSP
misc: fastrpc: Add support to get DSP capabilities
misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP
misc: fastrpc: separate fastrpc device from channel context
dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells
dt-bindings: nvmem: make "reg" property optional
nvmem: brcm_nvram: parse NVRAM content into NVMEM cells
nvmem: dt-bindings: Fix the error of dt-bindings check
...
CORESIGHT_DEV_TYPE_NONE/CORESIGHT_DEV_SUBTYPE_XXXX_NONE values are not used
any where. Actual enumeration can start from 0. Just drop these unused enum
values.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1645005118-10561-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
ETMv3 driver enables PID tracing by directly using perf config from
userspace, this means the tracer will capture PID packets from root
namespace but the profiling session runs in non-root PID namespace.
Finally, the recorded packets can mislead perf reporting with the
mismatched PID values.
This patch changes to only enable PID tracing for root PID namespace.
Note, the hardware supports VMID tracing from ETMv3.5, but the driver
never enables VMID trace, this patch doesn't handle VMID trace (bit 30
in ETMCR register) particularly.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220204152403.71775-5-leo.yan@linaro.org
As commented in the function ctxid_pid_store(), it can cause the PID
values mismatching between context ID tracing and PID allocated in a
non-root namespace.
For this reason, when a process runs in non-root PID namespace, the
driver doesn't allow PID tracing and returns failure when access
contextID related sysfs nodes.
VMID works for virtual contextID when the kernel runs in EL2 mode with
VHE; on the other hand, the driver doesn't prevent users from accessing
it when programs run in the non-root namespace. Thus this can lead
to same issues with contextID described above.
This patch imposes the checking on VMID related sysfs knobs and returns
failure if current process runs in non-root PID namespace.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220204152403.71775-3-leo.yan@linaro.org
Updates to the values and the index are protected via the spinlock.
Ensure we use the same lock to read the value safely.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220204152403.71775-2-leo.yan@linaro.org
Currently with the check present in the module initialisation, it shouts
on all the systems irrespective of presence of coresight trace buffer
extensions.
Similar to Arm SPE perf driver, move the check for kernel page table
isolation from EL0 to the device probe stage instead of the module
initialisation so that it complains only on the systems that support TRBE.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: coresight@lists.linaro.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220203190159.3145272-1-sudeep.holla@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
The spec says this:
P0 tracing support field. The permitted values are:
0b00 Tracing of load and store instructions as P0 elements is not
supported.
0b11 Tracing of load and store instructions as P0 elements is
supported, so TRCCONFIGR.INSTP0 is supported.
All other values are reserved.
The value we are looking for is 0b11 so simplify this. The double read
and && was a bit obfuscated.
Suggested-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20220203115336.119735-2-james.clark@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Replace acpi_bus_get_device() that is going to be dropped with
acpi_fetch_acpi_dev().
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/5790600.lOV4Wx5bFT@kreacher
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
device_register() calls device_initialize(),
according to doc of device_initialize:
Use put_device() to give up your reference instead of freeing
* @dev directly once you have called this function.
To prevent potential memleak, use put_device() for error handling.
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Fixes: 85e2414c51 ("coresight: syscfg: Initial coresight system configuration")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220124124121.8888-1-linmq006@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
It's impossible to program a valid value for TRCCONFIGR.QE
when TRCIDR0.QSUPP==0b10. In that case the following is true:
Q element support is implemented, and only supports Q elements without
instruction counts. TRCCONFIGR.QE can only take the values 0b00 or 0b11.
Currently the low bit of QSUPP is checked to see if the low bit of QE can
be written to, but as you can see when QSUPP==0b10 the low bit is cleared
making it impossible to ever write the only valid value of 0b11 to QE.
0b10 would be written instead, which is a reserved QE value even for all
values of QSUPP.
The fix is to allow writing the low bit of QE for any non zero value of
QSUPP.
This change also ensures that the low bit is always set, even when the
user attempts to only set the high bit.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Fixes: d8c6696208 ("coresight-etm4x: Controls pertaining to the reset, mode, pe and events")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220120113047.2839622-2-james.clark@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
TRBE implementations affected by Arm erratum #1902691 might corrupt trace
data or deadlock, when it's being written into the memory. Workaround this
problem in the driver, by preventing TRBE initialization on affected cpus.
The firmware must have disabled the access to TRBE for the kernel on such
implementations. This will cover the kernel for any firmware that doesn't
do this already. This just updates the TRBE driver as required.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-8-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
TRBE implementations affected by Arm erratum #2038923 might get TRBE into
an inconsistent view on whether trace is prohibited within the CPU. As a
result, the trace buffer or trace buffer state might be corrupted. This
happens after TRBE buffer has been enabled by setting TRBLIMITR_EL1.E,
followed by just a single context synchronization event before execution
changes from a context, in which trace is prohibited to one where it isn't,
or vice versa. In these mentioned conditions, the view of whether trace is
prohibited is inconsistent between parts of the CPU, and the trace buffer
or the trace buffer state might be corrupted.
Work around this problem in the TRBE driver by preventing an inconsistent
view of whether the trace is prohibited or not based on TRBLIMITR_EL1.E by
immediately following a change to TRBLIMITR_EL1.E with at least one ISB
instruction before an ERET, or two ISB instructions if no ERET is to take
place. This just updates the TRBE driver as required.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-7-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
TRBE implementations affected by Arm erratum #2064142 might fail to write
into certain system registers after the TRBE has been disabled. Under some
conditions after TRBE has been disabled, writes into certain TRBE registers
TRBLIMITR_EL1, TRBPTR_EL1, TRBBASER_EL1, TRBSR_EL1 and TRBTRG_EL1 will be
ignored and not be effected.
Work around this problem in the TRBE driver by executing TSB CSYNC and DSB
just after the trace collection has stopped and before performing a system
register write to one of the affected registers. This just updates the TRBE
driver as required.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-6-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].
This code was transformed with the help of Coccinelle:
(next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)
@@
identifier S, member, array;
type T1, T2;
@@
struct S {
...
T1 member;
T2 array[
- 0
];
};
UAPI and wireless changes were intentionally excluded from this patch
and will be sent out separately.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/78
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
The double `the' in the comment in line 732 is repeated. Remove one
of them from the comment.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Link: https://lore.kernel.org/r/20211211090221.241529-1-wangborong@cdjrlc.com
[Fixed capital letter in title]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Adds configfs attributes to allow a configuration to be enabled for use
when sysfs is used to control CoreSight.
perf retains independent enabling of configurations.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20211124200038.28662-6-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
CoreSight configurations and features can be added as kernel loadable
modules. This patch updates the load owner API to ensure that the module
cannot be unloaded either:
1) if the config it supplies is in use
2) if the module is not the last in the load order list.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20211124200038.28662-4-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Expand the configuration API to allow dynamic runtime load and unload of
configurations and features.
On load, configurations and features are tagged with a "load owner" that
is used to determine sets that were loaded together in a given API call.
To unload the API uses the load owner to unload all elements previously
loaded by that owner.
The API also records the order in which different owners loaded
their elements into the system. Later loading configurations can use
previously loaded features, creating load dependencies. Therefore unload
is enforced strictly in the reverse order to load.
A load owner will be an additional loadable module, or a configuration
loaded via configfs.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20211124200038.28662-3-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Update the existing load API to introduce a "load owner" concept.
This allows the tracking of the loaded configurations and features against
the loading owner type, to allow later unload according to owner.
A list of loaded configurations by owner is created.
The load owner infrastructure will be used in following patches
to implement dynanic load and unload, alongside dependency tracking.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20211124200038.28662-2-mike.leach@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
TRBE implementations affected by Arm erratum (2253138 or 2224489), could
write to the next address after the TRBLIMITR.LIMIT, instead of wrapping
to the TRBBASER. This implies that the TRBE could potentially corrupt :
- A page used by the rest of the kernel/user (if the LIMIT = end of
perf ring buffer)
- A page within the ring buffer, but outside the driver's range.
[head, head + size]. This may contain some trace data, may be
consumed by the userspace.
We workaround this erratum by :
- Making sure that there is at least an extra PAGE space left in the
TRBE's range than we normally assign. This will be additional to other
restrictions (e.g, the TRBE alignment for working around
TRBE_WORKAROUND_OVERWRITE_IN_FILL_MODE, where there is a minimum of
PAGE_SIZE. Thus we would have 2 * PAGE_SIZE)
- Adjust the LIMIT to leave the last PAGE_SIZE out of the TRBE's allowed
range (i.e, TRBEBASER...TRBLIMITR.LIMIT), by :
TRBLIMITR.LIMIT -= PAGE_SIZE
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-14-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The TRBE driver makes sure that there is enough space for a meaningful
run, otherwise pads the given space and restarts the offset calculation
once. But there is no guarantee that we may find space or hit "no space".
Make sure that we repeat the step until, either :
- We have the minimum space
OR
- There is NO space at all.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-13-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
For the TRBE to operate, we need a minimum space available to collect
meaningful trace session. This is currently a few bytes, but we may need
to extend this for working around errata. So, abstract this into a helper
function.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-12-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
an erratum, which when triggered, might cause the TRBE to overwrite
the trace data already collected in FILL mode, in the event of a WRAP.
i.e, the TRBE doesn't stop writing the data, instead wraps to the base
and could write upto 3 cache line size worth trace. Thus, this could
corrupt the trace at the "BASE" pointer.
The workaround is to program the write pointer 256bytes from the
base, such that if the erratum is triggered, it doesn't overwrite
the trace data that was captured. This skipped region could be
padded with ignore packets at the end of the session, so that
the decoder sees a continuous buffer with some padding at the
beginning. The trace data written at the base is considered
lost as the limit could have been in the middle of the perf
ring buffer, and jumping to the "base" is not acceptable.
We set the flags already to indicate that some amount of trace
was lost during the FILL event IRQ. So this is fine.
One important change with the work around is, we program the
TRBBASER_EL1 to current page where we are allowed to write.
Otherwise, it could overwrite a region that may be consumed
by the perf. Towards this, we always make sure that the
"handle->head" and thus the trbe_write is PAGE_SIZE aligned,
so that we can set the BASE to the PAGE base and move the
TRBPTR to the 256bytes offset.
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-11-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Add a minimal infrastructure to keep track of the errata
affecting the given TRBE instance. Given that we have
heterogeneous CPUs, we have to manage the list per-TRBE
instance to be able to apply the work around as needed.
Thus we will need to check if individual CPUs are affected
by the erratum.
We rely on the arm64 errata framework for the actual
description and the discovery of a given erratum, to
keep the Erratum work around at a central place and
benefit from the code and the advertisement from the
kernel. Though we could reuse the "this_cpu_has_cap()"
to apply an erratum work around, it is a bit of a heavy
operation, as it must go through the "erratum" detection
check on the CPU every time it is called (e.g, scanning
through a table of affected MIDRs). Since we need
to do this check for every session, may be multiple
times (depending on the wrok around), we could save
the cycles by caching the affected errata per-CPU
instance in the per-CPU struct trbe_cpudata.
Since we are only interested in the errata affecting
the TRBE driver, we only need to track a very few of them
per-CPU. Thus we use a local mapping of the CPUCAP for the
erratum to avoid bloating up a bitmap for trbe_cpudata.
i.e, each arm64 TRBE erratum bit is assigned a "index"
within the driver to track. Each trbe instance updates
the list of affected erratum at probe time on the CPU.
This makes sure that we can easily access the list of
errata on a given TRBE instance without much overhead.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-10-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The TRBE hardware mandates a minimum alignment for the TRBPTR_EL1,
advertised via the TRBIDR_EL1. This is used by the driver to
align the buffer write head. This patch allows the driver to
choose a different alignment from that of the hardware, by
decoupling the alignment tracking. This will be useful for
working around errata.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-9-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
We always set the TRBBASER_EL1 to the base of the virtual ring
buffer. We are about to change this for working around an erratum.
So, in preparation to that, allow the driver to choose a different
base for the TRBBASER_EL1 (which is within the buffer range).
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-8-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Refactor the helper to pad a given AUX buffer area to allow
"filling" ignore packets, without moving any handle pointers.
This will be useful in working around errata, where we may
have to fill the buffer after a session.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-7-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
We collect the trace from the TRBE on FILL event from IRQ context
and via update_buffer(), when the event is stopped. Let us
consolidate how we calculate the trace generated into a helper.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-6-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The TRBE driver wrongly treats the aux private data as the TRBE driver
specific buffer for a given perf handle, while it is the ETM PMU's
event specific data. Fix this by correcting the instance to use
appropriate helper.
Cc: stable <stable@vger.kernel.org>
Fixes: 3fbf7f011f ("coresight: sink: Add TRBE driver")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210921134121.2423546-2-suzuki.poulose@arm.com
[Fixed 13 character SHA down to 12]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Add ETM PID for Kryo-5XX to the list of supported ETMs.
Otherwise, Kryo-5XX ETMs will not be initialized successfully.
e.g.
This change can be verified on qrb5165-rb5 board. ETM4-ETM7 nodes
will not be visible without this change.
Signed-off-by: Tao Zhang <quic_taozha@quicinc.com>
Link: https://lore.kernel.org/r/1632477981-13632-2-git-send-email-quic_taozha@quicinc.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When the TRBE generates an IRQ, we stop the TRBE, collect the trace
and then reprogram the TRBE with the updated buffer pointers, whenever
possible. We might also leave the TRBE disabled, if there is not
enough space left in the buffer. However, we do not touch the ETE at
all during all of this. This means the ETE is only disabled when
the event is disabled later (via irq_work). This is incorrect, as the
ETE trace is still ON without actually being captured and may be routed
to the ATB (even if it is for a short duration).
So, we move the CPU into trace prohibited state always before disabling
the TRBE, upon entering the IRQ handler. The state is restored if the
TRBE is enabled back. Otherwise the trace remains prohibited.
Since, the ETM/ETE driver now controls the TRFCR_EL1 per session, the
tracing can be restored/enabled back when the event is rescheduled
in.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210923143919.2944311-6-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When we detect that there isn't enough space left to start a meaningful
session, we disable the TRBE, marking the buffer as TRUNCATED. But we delay
the notification to the perf layer by perf_aux_output_end() until the event
is scheduled out, triggered from the kernel perf layer. This will cause
significant black outs in the trace. Now that the CoreSight PMU layer can
handle a closed "AUX" handle properly, we can close the handle as soon as
we detect the case, allowing the userspace to collect and re-enable the
event.
Also, while in the IRQ handler, move the irq_work_run() after we have
updated the handle, to make sure the "TRUNCATED" flag causes the event to
be disabled as soon as possible.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210923143919.2944311-5-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The TRBE driver marks the AUX buffer as TRUNCATED when we get an IRQ
on FILL event. This has rather unwanted side-effect of the event
being disabled when there may be more space in the ring buffer.
So, instead of TRUNCATE we need a different flag to indicate
that the trace may have lost a few bytes (i.e from the point of
generating the FILL event until the IRQ is consumed). Anyways, the
userspace must use the size from RECORD_AUX headers to restrict
the "trace" decoding.
Using PARTIAL flag causes the perf tool to generate the
following warning:
Warning:
AUX data had gaps in it XX times out of YY!
Are you running a KVM guest in the background?
which is pointlessly scary for a user. The other remaining options
are :
- COLLISION - Use by SPE to indicate samples collided
- Add a new flag - Specifically for CoreSight, doesn't sound
so good, if we can re-use something.
Given that we don't already use the "COLLISION" flag, the above
behavior can be notified using this flag for CoreSight.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210923143919.2944311-4-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
On a spurious IRQ, right now we disable the TRBE and then re-enable
it back, resetting the "buffer" pointers(i.e BASE, LIMIT and more
importantly WRITE) to the original pointers from the AUX handle.
This implies that we overwrite any trace that was written so far,
(by overwriting TRBPTR) while we should have ignored the IRQ.
On detecting a spurious IRQ after examining the TRBSR we simply
re-enable the TRBE without touching the other parameters.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210923143919.2944311-3-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The IRQ handler of the TRBE driver could race against the update_buffer()
in consuming the IRQ. So, if the update_buffer() gets to processing the
TRBE irq, the TRBSR will be cleared. Thus by the time IRQ handler is
triggered, there is nothing to do there. Handle these cases and do not
disable the TRBE unnecessarily. Since the TRBSR can be read without
stopping the TRBE, we can check that before disabling the TRBE.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210923143919.2944311-2-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Unify the sequence of enabling the TRBE. We do this from
event_start and also from the TRBE IRQ handler. Lets move
this to a common helper. The only minor functional change
is returning an error when we fail to enable the TRBE.
This should be handled already.
Since we now have unique entry point to trying to enable TRBE,
move the format flag setting to the central place.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-9-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
We mark the buffer as TRUNCATED when there is no space left
in the buffer. But we do it at different points.
__trbe_normal_offset()
and also, at all the callers of the above function via
compute_trbe_buffer_limit(), when the limit == base (i.e
offset = 0 as returned by the __trbe_normal_offset()).
So, given that the callers already mark the buffer as TRUNCATED
drop the caller inside the __trbe_normal_offset().
This is in preparation to moving the handling of TRUNCATED
into a central place.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-6-suzuki.poulose@arm.com
[Moved comment as Anshuman requested]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When the TRBE is stopped on truncating an event, we may not
set the FORMAT flag, even though the size of the record is 0.
Let us be consistent and not confuse the user.
To ensure that the format flag is always set on all the
records generated by TRBE, set the flag when we have a
new handle. Rather than deferring to the "end" operation,
which makes it clear. So, we can do this from
- arm_trbe_enable() -> When a new handle is provided by the
CoreSight PMU, triggered via etm_event_start()
- trbe_handle_overflow() -> When we begin a new handle after
closing the previous on overflow.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-5-suzuki.poulose@arm.com
[Fixed inverted words in title]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The ETM perf infrastructure closes out a handle during event_stop
or on an error in starting the event. In either case, it is possible
for a "sink" to update/close the handle, under certain circumstances.
(e.g no space in ring buffer.). So, ensure that we handle this
gracefully in the PMU driver by verifying the handle is still valid.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-4-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The Trace Filtering support (FEAT_TRF) ensures that the ETM
can be prohibited from generating any trace for a given EL.
This is much stricter knob, than the TRCVICTLR exception level
masks, which doesn't prevent the ETM from generating Context
packets for an "excluded" EL. At the moment, we do a onetime
enable trace at user and kernel and leave it untouched for the
kernel life time. This implies that the ETM could potentially
generate trace packets containing the kernel addresses, and
thus leaking the kernel virtual address in the trace.
This patch makes the switch dynamic, by honoring the filters
set by the user and enforcing them in the TRFCR controls.
We also rename the cpu_enable_tracing() appropriately to
cpu_detect_trace_filtering() and the drvdata member
trfc => trfcr to indicate the "value" of the TRFCR_EL1.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-3-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When the CPU enters a low power mode, the TRFCR_EL1 contents could be
reset. Thus we need to save/restore the TRFCR_EL1 along with the ETM4x
registers to allow the tracing.
The TRFCR related helpers are in a new header file, as we need to use
them for TRBE in the later patches.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210914102641.1852544-2-suzuki.poulose@arm.com
[Fixed cosmetic details]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When a traced process runs on a CPU that can't reach the selected sink,
the event will be stopped with PERF_HES_STOPPED. This means that even if
the process migrates to a valid CPU, tracing will not resume.
This can be reproduced (on N1SDP) by using taskset to start the process
on CPU 0, and then switching it to CPU 2 (ETF 1 is only reachable from
CPU 2):
taskset --cpu-list 0 ./perf record -e cs_etm/@tmc_etf1/ --per-thread -- taskset --cpu-list 2 ls
This produces a single 0 length AUX record, and then no more trace:
0x3c8 [0x30]: PERF_RECORD_AUX offset: 0 size: 0 flags: 0x1 [T]
After the fix, the same command produces normal AUX records. The perf
self test "89: Check Arm CoreSight trace data recording and synthesized
samples" no longer fails intermittently. This was because the taskset in
the test is after the fork, so there is a period where the task is
scheduled on a random CPU rather than forced to a valid one.
Specifically selecting an invalid CPU will still result in a failure to
open the event because it will never produce trace:
./perf record -C 2 -e cs_etm/@tmc_etf0/
failed to mmap with 12 (Cannot allocate memory)
The only scenario that has changed is if the CPU mask has a valid CPU
sink combo in it.
Testing
=======
* Coresight self test passes consistently:
./perf test Coresight
* CPU wide mode still produces trace:
./perf record -e cs_etm// -a
* Invalid -C options still fail to open:
./perf record -C 2,3 -e cs_etm/@tmc_etf0/
failed to mmap with 12 (Cannot allocate memory)
* Migrating a task to a valid sink/CPU now produces trace:
taskset --cpu-list 0 ./perf record -e cs_etm/@tmc_etf1/ --per-thread -- taskset --cpu-list 2 ls
* If the task remains on an invalid CPU, no trace is emitted:
taskset --cpu-list 0 ./perf record -e cs_etm/@tmc_etf1/ --per-thread -- ls
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210922125144.133872-2-james.clark@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
The AUX bounce buffer is allocated with API dma_alloc_coherent(), in the
low level's architecture code, e.g. for Arm64, it maps the memory with
the attribution "Normal non-cacheable"; this can be concluded from the
definition for pgprot_dmacoherent() in arch/arm64/include/asm/pgtable.h.
Later when access the AUX bounce buffer, since the memory mapping is
non-cacheable, it's low efficiency due to every load instruction must
reach out DRAM.
This patch changes to allocate pages with dma_alloc_noncoherent(), the
driver can access the memory via cacheable mapping; therefore, load
instructions can fetch data from cache lines rather than always read
data from DRAM, the driver can boost memory performance. After using
the cacheable mapping, the driver uses dma_sync_single_for_cpu() to
invalidate cacheline prior to read bounce buffer so can avoid read stale
trace data.
By measurement the duration for function tmc_update_etr_buffer() with
ftrace function_graph tracer, it shows the performance significant
improvement for copying 4MiB data from bounce buffer:
# echo tmc_etr_get_data_flat_buf > set_graph_notrace // avoid noise
# echo tmc_update_etr_buffer > set_graph_function
# echo function_graph > current_tracer
before:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 8148.320 us | }
after:
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) | tmc_update_etr_buffer() {
...
2) # 2525.420 us | }
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210905032144.966766-1-leo.yan@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Commit 2f01c200d4 ("perf cs-etm: Remove callback cs_etm_find_snapshot()")
has removed the function cs_etm_find_snapshot() from the perf tool in the
user space, now CoreSight trace directly uses the perf common function
__auxtrace_mmap__read() to calcualte the head and size for AUX trace data
in snapshot mode.
This patch updates the comments in drivers to make them generic and not
stick to any specific function from perf tool.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20210912125748.2816606-3-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
When enable the Arm CoreSight PMU event, the context for AUX ring buffer
is prepared in the structure perf_output_handle, and its field "head"
points the head of the AUX ring buffer and it is updated after filling
AUX trace data into buffer.
Current code uses an extra field etr_perf_buffer::head to maintain the
header for the AUX ring buffer which is not necessary; alternatively,
it's better to directly use perf_output_handle::head.
This patch removes the field etr_perf_buffer::head and directly uses
perf_output_handle::head for the head of AUX ring buffer.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210912125748.2816606-2-leo.yan@linaro.org
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>