Commit Graph

12 Commits

Author SHA1 Message Date
James Zhu
4057e6ce33 drm/amdkfd: add event_age tracking when receiving interrupt
Add event_age tracking when receiving interrupt.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15 11:37:55 -04:00
Felix Kuehling
34d292d579 drm/amdkfd: Asynchronously free events
The synchronize_rcu call in destroy_events can take several ms, which
noticeably slows down applications destroying many events. Use kfree_rcu
to free the event structure asynchronously and eliminate the
synchronize_rcu call in the user thread.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-12 14:20:13 -04:00
Felix Kuehling
5273e82c5f drm/amdkfd: Improve concurrency of event handling
Use rcu_read_lock to read p->event_idr concurrently with other readers
and writers. Use p->event_mutex only for creating and destroying events
and in kfd_wait_on_events.

Protect the contents of the kfd_event structure with a per-event
spinlock that can be taken inside the rcu_read_lock critical section.

This eliminates contention of p->event_mutex in set_event, which tends
to be on the critical path for dispatch latency even when busy waiting
is used. It also eliminates lock contention in event interrupt handlers.
Since the p->event_mutex is now used much less, the impact of requiring
it in kfd_wait_on_events should also be much smaller.

This should improve event handling latency for processes using multiple
GPUs concurrently.

v2: Reschedule the worker periodically to avoid soft lockup warnings

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Sean Keely <Sean.Keely@amd.com> # v1
Tested-by: Sanjay Tripathi <sanjay.tripathi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-07 16:34:24 -04:00
Rajneesh Bhardwaj
d87f36a063 drm/amdkfd: update SPDX license header
Update the SPDX License header for all the KFD files.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14 15:08:40 -05:00
Fenghua Yu
c7b6bac9c7 drm, iommu: Change type of pasid to u32
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".

No PASID type change in uapi although it defines PASID as __u64 in
some places.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
2020-09-17 19:21:16 +02:00
Shaoyun Liu
e42051d213 drm/amdkfd: Implement GPU reset handlers in KFD
Lock KFD and evict existing queues on reset. Notify user mode by
signaling hw_exception events.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:56 -04:00
Felix Kuehling
482f07775c drm/amdkfd: Simplify event ID and signal slot management
Signal slots are identical to event IDs.

Replace the used_slot_bitmap and events hash table with an IDR to
allocate and lookup event IDs and signal slots more efficiently.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:27 -04:00
Felix Kuehling
50cb7dd94c drm/amdkfd: Simplify events page allocator
The first event page is always big enough to handle all events.
Handling of multiple events pages is not supported by user mode, and
not necessary.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:26 -04:00
Felix Kuehling
74e4071665 drm/amdkfd: Use wait_queue_t to implement event waiting
Use standard wait queues for waiting and waking up waiting threads
instead of inventing our own. We still have our own wait loop
because the HSA event semantics require the ability to have one
thread waiting on multiple wait queues (events) at the same time.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:25 -04:00
Alexey Skidanov
930c5ff439 drm/amdkfd: Add bad opcode exception handling
Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:28 +03:00
Alexey Skidanov
59d3e8be87 drm/amdkfd: Add memory exception handling
This patch adds Peripheral Page Request (PPR) failure processing
and reporting.

Bad address or pointer to a system memory block with inappropriate
read/write permission cause such PPR failure during a user queue
processing. PPR request handling is done by IOMMU driver notifying
AMDKFD module on PPR failure.

The process triggering a PPR failure will be notified by
appropriate event or SIGTERM signal will be sent to it.

v3:
- Change all bool fields in struct kfd_memory_exception_failure to
  uint32_t

Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:27 +03:00
Andrew Lewycky
f3a398183f drm/amdkfd: Add the events module
This patch adds the events module (kfd_events.c) and the interrupt
handle module for Kaveri (cik_event_interrupt.c).

The patch updates the interrupt_is_wanted(), so that it now calls the
interrupt isr function specific for the device that received the
interrupt. That function(implemented in cik_event_interrupt.c)
returns whether this interrupt is of interest to us or not.

The patch also updates the interrupt_wq(), so that it now calls the
device's specific wq function, which checks the interrupt source
and tries to signal relevant events.

v2:

Increase limit of signal events to 4096 per process
Remove bitfields from struct cik_ih_ring_entry
Rename radeon_kfd_event_mmap to kfd_event_mmap
Add debug prints to allocate_free_slot and allocate_signal_page
Make allocate_event_notification_slot return a correct value
Add warning prints to create_signal_event
Remove error print from IOCTL path
Reformatted debug prints in kfd_event_mmap
Map correct size (as received from mmap) in kfd_event_mmap

v3:

Reduce limit of signal events back to 256 per process
Fix allocation of kernel memory for signal events

Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19 13:02:26 +03:00