Add new inference_timeout_ms parameter that allows specifying
maximum allowed duration in milliseconds that inference can take before
triggering a recovery.
Calculate maximum number of heartbeat retries based on ratio between
inference timeout and tdr timeout.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250515093128.252041-1-jacek.lawrynowicz@linux.intel.com
Increase JMS message state dump command timeout to 100 ms. On some
platforms, the FW may take a bit longer than 50 ms to dump its state
to the log buffer and we don't want to miss any debug info during TDR.
Fixes: 5e162f872d ("accel/ivpu: Add FW state dump on TDR")
Cc: stable@vger.kernel.org # v6.13+
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425092822.2194465-1-jacek.lawrynowicz@linux.intel.com
Add power_profile firmware boot param and set it to 0 by default
which is default FW power profile.
Implement IVPU_TEST_MODE_D0I2_DISABLE which is used for setting
power profile boot param value to 1 which prevents NPU from entering
d0i2 power state.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204084622.2422544-7-jacek.lawrynowicz@linux.intel.com
Add IVPU_TEST_MODE_CLK_RELINQ_[DISABLE|ENABLE] that overrides
workaround for disabling clock relinquish for testing purposes.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204084622.2422544-6-jacek.lawrynowicz@linux.intel.com
Add debugfs interface to modify following priority bands properties:
* grace period
* process grace period
* process quantum
This allows for the adjustment of hardware scheduling algorithm parameters
for each existing priority band, facilitating validation and fine-tuning.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204084622.2422544-4-jacek.lawrynowicz@linux.intel.com
Recovery now works on fpga. JSM state dump timeout needs to
be really long for the new fpga model releases.
Enable punit on fpga.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129125636.1047413-6-jacek.lawrynowicz@linux.intel.com
Call pm_runtime_mark_last_busy() in top half of IRQ handler to prevent
device from being runtime suspended before bottom half is executed on
a workqueue.
Reviewed-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129125636.1047413-3-jacek.lawrynowicz@linux.intel.com
Introduces the capability to simulate hardware faults for testing
purposes. The new `fail_hw` fault can be injected in
`ivpu_hw_reg_poll_fld()`, which is used in various parts of the driver
to wait for the hardware to reach a specific state. This allows to test
failures during NPU boot and shutdown, IPC message handling and more.
Fault injection can be enabled using debugfs or a module parameter.
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129125636.1047413-2-jacek.lawrynowicz@linux.intel.com
Use highest buttress VPU_STATUS register bits(15:13) that encode
platform type as follows:
0 - Silicon
2 - Simics
3 - FPGA
4 - Hybrid SLE
Remove old DMI based method.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-14-maciej.falkowski@linux.intel.com
Convert IRQ bottom half from the thread handler into workqueue.
This increases a stability in rare scenarios where driver on
debugging/hardening kernels processes IRQ too slow and misses
some interrupts due to it.
Workqueue handler also gives a very minor performance increase.
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-6-maciej.falkowski@linux.intel.com
Increase DMA address range to:
* 128 GB on 37xx (due to MMU limitations)
* 256 GB on other generations
Merge User and DMA ranges on 40xx and above as it is possible
to access whole 256 GBs from both FW and DMA.
Increase User range on 37xx from 255MB to 511MB
to allow loading very large models.
Do not set global_alias_pio_base/size on other generations than 37xx
as it's only used on 37xx anyway.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017145817.121590-11-jacek.lawrynowicz@linux.intel.com
The NOC firewall interrupt means that the HW prevented
unauthorized access to a protected resource, so there
is no need to trigger device reset in such case.
To facilitate security testing add firewall_irq_counter
debugfs file that tracks firewall interrupts.
Fixes: 8a27ad81f7 ("accel/ivpu: Split IP and buttress code")
Cc: stable@vger.kernel.org # v6.11+
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017144958.79327-1-jacek.lawrynowicz@linux.intel.com
With recent Simics update DVFS flows using cdyn were fixed
and it is possible to enable D0i3/D3 entry flows on autosuspend.
Set autosuspend timeout to 100 ms by default on Simics.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-11-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Send JSM state dump message at the beginning of TDR handler. This allows
FW to collect debug info in the FW log before the state of the NPU is
lost allowing to analyze the cause of a TDR.
Wait a predefined timeout (10 ms) so the FW has a chance to write debug
logs. We cannot wait for JSM response at this point because IRQs are
already disabled before TDR handler is invoked.
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-9-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
The new HW is more power efficient and there is no
need to enter the D0i3/D3 so quickly. Increasing
autosuspend delay reduces latency in certain usage
scenarios.
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-14-jacek.lawrynowicz@linux.intel.com
Add new test mode flag that will disable all timeouts
defined in timeout fields of struct ivpu_device.
Remove also reschedule_suspend field as it is unused.
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-11-jacek.lawrynowicz@linux.intel.com
Send workpoint 0 request during power up on 37xx.
This is needed in rare case where WP0 was not sent
during power down due to device hang.
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-2-jacek.lawrynowicz@linux.intel.com
The NPU device consists of two parts: NPU buttress and NPU IP.
Buttress is a platform specific part that integrates the NPU IP with
the CPU.
NPU IP is the platform agnostic part that does the inference.
This separation enables support for multiple platforms using
a single NPU IP, so for example NPU IP 37XX could be integrated into
MTL and LNL platforms.
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240515113006.457472-3-jacek.lawrynowicz@linux.intel.com