Commit Graph

342 Commits

Author SHA1 Message Date
Suravee Suthikulpanit
457da57646 iommu/amd: Lock DTE before updating the entry with WRITE_ONCE()
When an update touches only a single 64-bit quadword of a DTE, just lock
the DTE and use WRITE_ONCE(), because the write targets memory that is
read back by HW.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-9-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:42 +01:00
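A minimal sketch of the pattern described above (the helper name and body
are illustrative, not the exact patch; dte_lock and the quadword layout
follow this series):

    /* Update one 64-bit quadword of a DTE that HW reads back. */
    static void write_dte_qword(struct iommu_dev_data *dev_data,
                                struct dev_table_entry *dte,
                                int qword, u64 val)
    {
            unsigned long flags;

            spin_lock_irqsave(&dev_data->dte_lock, flags);
            /* WRITE_ONCE() keeps the store whole; HW may read at any time. */
            WRITE_ONCE(dte->data[qword], val);
            spin_unlock_irqrestore(&dev_data->dte_lock, flags);
    }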
Suravee Suthikulpanit
66ea3f96ae iommu/amd: Modify clear_dte_entry() to avoid in-place update
Avoid the in-place update by reusing make_clear_dte() and update_dte256().

Also, there is no need to set the TV bit for a non-SNP system when
clearing the DTE for the blocked domain, and there is no longer a need to
apply erratum 63 in clear_dte(), since the workaround is already stored
in struct ivhd_dte_flags and applied in set_dte_entry().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-8-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:42 +01:00
Suravee Suthikulpanit
a2ce608a1e iommu/amd: Introduce helper function get_dte256()
Use it in clone_alias() along with update_dte256(), and also use
get_dte256() in dump_dte_entry().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-7-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:41 +01:00
Suravee Suthikulpanit
fd5dff9de4 iommu/amd: Modify set_dte_entry() to use 256-bit DTE helpers
Also, set_dte_entry() is used to program several DTE fields (e.g. stage 1
table, stage 2 table, domain ID, etc.), which is difficult to keep track
of with the current implementation.

Therefore, separate the logic for clearing a DTE into make_clear_dte(),
and move the setup of the GCR3 Table Root Pointer, GIOV, GV, GLX, and
GuestPagingMode fields into a new function, set_dte_gcr3_table().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-6-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:40 +01:00
Suravee Suthikulpanit
8b3f787338 iommu/amd: Introduce helper function to update 256-bit DTE
The current implementation does not follow the 128-bit write requirement
for updating the DTE as specified in the AMD I/O Virtualization
Technology (IOMMU) Specification.

Therefore, modify struct dev_table_entry to contain a union with a u128
data array, and introduce a helper function update_dte256() that updates
the 256-bit DTE using two 128-bit cmpxchg operations on the modified
structure. It takes the DTE[V, GV] bits into account when programming
the DTE, to ensure the proper order of DTE programming and flushing.

In addition, introduce a per-DTE spinlock, struct dev_data.dte_lock, to
serialize DTE updates and prevent cmpxchg128 failures.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-5-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18 09:37:39 +01:00
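A simplified sketch of the scheme this commit describes (the union layout
matches the description above; update_dte256() here omits the DTE[V, GV]
ordering and the flushes between halves that the real code performs):

    struct dev_table_entry {
            union {
                    u64  data[4];      /* four 64-bit quadwords             */
                    u128 data128[2];   /* two 128-bit halves for cmpxchg128 */
            };
    };

    /* Caller holds the per-DTE spinlock, so cmpxchg128 cannot fail
     * due to a concurrent writer. */
    static void update_dte256(struct dev_table_entry *ptr,
                              struct dev_table_entry *new)
    {
            u128 old;

            old = ptr->data128[1];
            cmpxchg128(&ptr->data128[1], old, new->data128[1]);
            old = ptr->data128[0];
            cmpxchg128(&ptr->data128[0], old, new->data128[0]);
    }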
Jason Gunthorpe
91da87d38c iommu/amd: Add lockdep asserts for domain->dev_list
Add an assertion to all the iteration points that don't obviously
have the lock held already. These all take the lock higher in their
call chains.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/2-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10 10:12:06 +01:00
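The assertion pattern, sketched on one illustrative iteration point (the
function body is an assumption, not the exact patch):

    static void update_all_dtes(struct protection_domain *domain)
    {
            struct iommu_dev_data *dev_data;

            /* Callers take domain->lock higher in the call chain. */
            lockdep_assert_held(&domain->lock);

            list_for_each_entry(dev_data, &domain->dev_list, list)
                    dev_update_dte(dev_data, true);
    }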
Jason Gunthorpe
4a552f7890 iommu/amd: Put list_add/del(dev_data) back under the domain->lock
The list domain->dev_list is protected by the domain->lock spinlock.
Any iteration, addition or removal must be under the lock.

Move the list_del() up into the critical section. pdom_is_sva_capable()
and destroy_gcr3_table() do not interact with the list element.

Wrap the list_add() in the lock; it would make more sense if this were
under the same critical section as adjusting the refcounts earlier, but
that requires more complications.

Fixes: d6b47dec36 ("iommu/amd: Reduce domain lock scope in attach device path")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10 10:12:05 +01:00
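The resulting shape of the two call sites, as a sketch (surrounding code
elided):

    /* attach: addition goes under the domain lock */
    spin_lock_irqsave(&domain->lock, flags);
    list_add(&dev_data->list, &domain->dev_list);
    spin_unlock_irqrestore(&domain->lock, flags);

    /* detach: list_del() moves up into the existing critical section */
    spin_lock_irqsave(&domain->lock, flags);
    list_del(&dev_data->list);
    /* ...cache invalidation, DTE update... */
    spin_unlock_irqrestore(&domain->lock, flags);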
Jason Gunthorpe
d53764723e iommu: Rename ops->domain_alloc_user() to domain_alloc_paging_flags()
Now that the main domain allocating path is calling this function it
doesn't make sense to leave it named _user. Change the name to
alloc_paging_flags() to mirror the new iommu_paging_domain_alloc_flags()
function.

A driver should implement only one of ops->domain_alloc_paging() or
ops->domain_alloc_paging_flags(). The former is a simpler interface with
less boilerplate, and is what the majority of drivers use. The latter is
for drivers with a greater feature set (PASID, multiple page table
support, advanced iommufd support, nesting, etc). Additional patches will
be needed to achieve this.

Link: https://patch.msgid.link/r/2-v1-c252ebdeb57b+329-iommu_paging_flags_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-22 14:43:45 -04:00
Vasant Hegde
18f5a6b34b iommu/amd: Improve amd_iommu_release_device()
The previous patch added ops->release_domain support. The core will now
attach devices via release_domain->attach_dev() before calling this
function, so devices are already detached from their current domain and
attached to the blocked domain.

This is mostly a dummy function now; just warn if a device is still
attached to a domain.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:48 +01:00
Vasant Hegde
a0e086b16e iommu/amd: Add ops->release_domain
In the release path, remove the device from its existing domain and
attach it to the blocked domain, so that all DMA from the device is
blocked.

Note that blocked_domain will soon support other ops such as
set_dev_pasid(), whereas release_domain supports only the attach_dev op.
Hence a separate 'release_domain' variable is added.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-12-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:47 +01:00
Vasant Hegde
0b136493d3 iommu/amd: Reorder attach device code
Ideally, the attach device path should take the dev_data lock before
making changes to device data, including IOPF enablement. So far dev_data
was using a spinlock, and it was hitting a lock ordering issue when
trying to enable IOPF. Hence commit 526606b0a1 ("iommu/amd: Fix Invalid
wait context issue") moved IOPF enablement outside dev_data->lock.

The previous patch converted the dev_data lock to a mutex. Now it is safe
to call amd_iommu_iopf_add_device() with dev_data->mutex held. Hence move
the PCI device capability enablement (ATS, PRI, PASID) and the IOPF
enablement code back inside the lock. Also, in attach_device(), update
'dev_data->domain' at the end so that error handling becomes simpler.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:46 +01:00
Vasant Hegde
e843aedbeb iommu/amd: Convert dev_data lock from spinlock to mutex
Currently the attach device path takes dev_data->spinlock. But by design
the attach device path can sleep. Also, if the device is PRI capable, it
is added to the IOMMU fault handler queue, which takes a mutex. Hence PRI
enablement is currently done outside the dev_data lock.

Convert the dev_data lock from a spinlock to a mutex so that it follows
the design and PRI enablement can be done properly; a sketch of the
conversion follows this entry.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:45 +01:00
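The shape of the conversion, as a sketch (field placement is
illustrative):

    struct iommu_dev_data {
            /* Was: spinlock_t lock; sleeping while held is illegal. */
            struct mutex mutex;   /* attach path may now sleep, e.g. for PRI */
            /* ... */
    };

    mutex_lock(&dev_data->mutex);
    /* PRI/IOPF enablement may take other mutexes; legal under a mutex. */
    mutex_unlock(&dev_data->mutex);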
Vasant Hegde
4b18ef8491 iommu/amd: Rearrange attach device code
attach_device() just holds the lock and calls do_attach(). There is
no need to have another function. Just move the do_attach() code into
attach_device(). Similarly, move the do_detach() code into detach_device().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:45 +01:00
Vasant Hegde
d6b47dec36 iommu/amd: Reduce domain lock scope in attach device path
Currently the attach device path takes the protection domain lock
followed by the dev_data lock. Most of the operations in this function
are specific to the device data, except pdom_attach_iommu(), which
updates the protection domain structure. Hence reduce the scope of the
protection domain lock.

Note that this changes the locking order. Now it takes the device lock
before taking the domain lock (group->mutex -> dev_data->lock ->
pdom->lock). dev_data->lock is used only in the device attachment path,
so changing the order is fine and will not create any issue.

Finally, move the NUMA node assignment to pdom_attach_iommu().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:44 +01:00
Vasant Hegde
07bbd660db iommu/amd: Do not detach devices in domain free path
All devices must be detached from a protection domain before the domain
is freed. Hence do not try to detach devices in the domain free path.
Continue to warn if pdom->dev_list is not empty, so that any potential
issues can be fixed.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:43 +01:00
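A sketch of what the free path reduces to (names follow the message
above; the body is illustrative):

    void protection_domain_free(struct protection_domain *pdom)
    {
            /* Callers must have detached every device already. */
            WARN_ON(!list_empty(&pdom->dev_list));

            /* ...free the page table and the domain ID... */
    }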
Vasant Hegde
d16041124d iommu/amd: xarray to track protection_domain->iommu list
Use an xarray to track the IOMMUs attached to a protection domain
instead of a static array of MAX_IOMMUS entries. Also add a lockdep
assertion.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:42 +01:00
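A sketch of the xarray usage (struct and field names are assumptions; the
real patch also handles store failure and refcounting):

    lockdep_assert_held(&pdom->lock);

    /* pdom->iommu_array maps iommu->index to per-IOMMU attach info */
    info = xa_load(&pdom->iommu_array, iommu->index);
    if (!info) {
            info = kzalloc(sizeof(*info), GFP_ATOMIC);
            if (!info)
                    return -ENOMEM;
            xa_store(&pdom->iommu_array, iommu->index, info, GFP_ATOMIC);
    }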
Vasant Hegde
743a4bae9f iommu/amd: Remove protection_domain.dev_cnt variable
protection_domain->dev_list tracks the list of devices attached to the
domain. We can use the list_* functions on dev_list to get the device
count, so remove the 'dev_cnt' variable.

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:41 +01:00
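Counting via the list then looks like this sketch (assuming the stock
list_count_nodes() helper; the wrapper is illustrative):

    /* Device count is derived from dev_list instead of a counter. */
    static size_t pdom_device_count(struct protection_domain *pdom)
    {
            lockdep_assert_held(&pdom->lock);
            return list_count_nodes(&pdom->dev_list);
    }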
Vasant Hegde
2fcab2deeb iommu/amd: Use ida interface to manage protection domain ID
Replace custom domain ID allocator with IDA interface.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30 11:06:40 +01:00
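A sketch of the IDA-based allocator (the reserved-ID range is an
assumption):

    static DEFINE_IDA(pdom_ids);

    static int pdom_id_alloc(void)
    {
            /* ID 0 is reserved; MAX_DOMAIN_ID bounds the hardware space. */
            return ida_alloc_range(&pdom_ids, 1, MAX_DOMAIN_ID - 1, GFP_ATOMIC);
    }

    static void pdom_id_free(int id)
    {
            ida_free(&pdom_ids, id);
    }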
Joerg Roedel
556af583d2 Merge branch 'core' into amd/amd-vi 2024-10-30 11:02:48 +01:00
Vasant Hegde
4402f2627d iommu/amd: Implement global identity domain
Implement a global identity domain. All device groups in identity mode
will share this domain.

In the attach device path, a per-device domain ID and GCR3 table are
allocated based on device capability, so that the domain can support SVA.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241028093810.5901-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:22 +01:00
Vasant Hegde
ce2cd17546 iommu/amd: Enhance amd_iommu_domain_alloc_user()
The previous patch enhanced the core layer to check device PASID
capability and pass the right flags to ops->domain_alloc_user().

Enhance amd_iommu_domain_alloc_user() to allocate a domain with the
appropriate page table based on the flags parameter.
  - If flags is empty, allocate a domain with the default page table type.
    This will eventually replace ops->domain_alloc().
    For an UNMANAGED domain, the core will call this interface with
    flags=0, so the AMD driver will continue to allocate a v1 page table.

  - If the IOMMU_HWPT_ALLOC_PASID flag is passed, allocate a domain with
    a v2 page table.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:22 +01:00
Vasant Hegde
a005ef62f9 iommu/amd: Pass page table type as param to pdom_setup_pgtable()
Current code forces a v1 page table for UNMANAGED domains and the global
page table type (amd_iommu_pgtable) for the rest of the paging domains.

The following patch series adds support for the domain_alloc_paging() op
and enhances domain_alloc_user() to allocate the page table based on
'flags'.

Hence pass the page table type as a parameter to pdom_setup_pgtable(), so
that the caller can decide the right page table type. Also update
dma_max_address() to take pgtable as a parameter.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:21 +01:00
Vasant Hegde
b3c989083d iommu/amd: Separate page table setup from domain allocation
Currently protection_domain_alloc() allocates the domain and also sets
up the page table. Page table setup is required for PAGING domains only;
domain types like SVA don't need a page table. Hence move the page table
setup code into a separate function.

Also, the SVA domain allocation path does not call pdom_setup_pgtable(),
so remove the IOMMU_DOMAIN_SVA type check.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241028093810.5901-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29 10:08:21 +01:00
Uros Bizjak
5ce73c524f iommu/amd: Use atomic64_inc_return() in iommu.c
Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref)
to use the optimized implementation and ease register pressure around
the primitive on targets that implement an optimized variant.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241007084356.47799-1-ubizjak@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-15 10:22:37 +02:00
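The change is mechanical; for example (the variable is illustrative):

    /* Before */
    seq = atomic64_add_return(1, &pdom->flush_seq);

    /* After: same semantics, possibly better codegen */
    seq = atomic64_inc_return(&pdom->flush_seq);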
Joerg Roedel
97162f6093 Merge branches 'fixes', 'arm/smmu', 'intel/vt-d', 'amd/amd-vi' and 'core' into next 2024-09-13 12:53:05 +02:00
Jason Gunthorpe
3ab9d8d1b5 iommu/amd: Test for PAGING domains before freeing a domain
This domain free function can be called for IDENTITY and SVA domains too,
and they don't have page tables. For now protect against this by checking
the type. Eventually the different types should have their own free
functions.

Fixes: 485534bfcc ("iommu/amd: Remove conditions from domain free paths")
Reported-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/0-v1-ad9884ee5f5b+da-amd_iopgtbl_fix_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12 09:21:40 +02:00
Eliav Bar-ilan
8386207f37 iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all()
An incorrect argument order calling amd_iommu_dev_flush_pasid_pages()
causes improper flushing of the IOMMU, leaving the old value of GCR3 from
a previous process attached to the same PASID.

The function has the signature:

void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
				     ioasid_t pasid, u64 address, size_t size)

Correct the argument order.

Cc: stable@vger.kernel.org
Fixes: 474bf01ed9 ("iommu/amd: Add support for device based TLB invalidation")
Signed-off-by: Eliav Bar-ilan <eliavb@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/0-v1-fc6bc37d8208+250b-amd_pasid_flush_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12 09:20:18 +02:00
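Given the signature quoted above, the fix is a reordering of the call
site; a before/after sketch (reconstructed, not the literal patch;
CMD_INV_ADDRESS_SIZE is the AMD driver's flush-everything size):

    /* Buggy: pasid lands in the size argument */
    amd_iommu_dev_flush_pasid_pages(dev_data, 0, CMD_INV_ADDRESS_SIZE, pasid);

    /* Fixed: arguments in declaration order */
    amd_iommu_dev_flush_pasid_pages(dev_data, pasid, 0, CMD_INV_ADDRESS_SIZE);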
Jason Gunthorpe
485534bfcc iommu/amd: Remove conditions from domain free paths
Don't use tlb as some flag to indicate if protection_domain_alloc()
completed. Have protection_domain_alloc() unwind itself in the normal
kernel style and require protection_domain_free() only be called on
successful results of protection_domain_alloc().

Also, the amd_iommu_domain_free() op is never called by the core code with
a NULL argument, so remove all the NULL tests as well.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:39:01 +02:00
Jason Gunthorpe
47f218d108 iommu/amd: Store the nid in io_pgtable_cfg instead of the domain
We already have memory in the union here that is being wasted in AMD's
case; use it to store the nid.

Putting the nid here further isolates the io_pgtable code from the
struct protection_domain.

Fix up protection_domain_alloc() so that the NID from the device is
provided. At this point dev is never NULL for AMD, so this will now
allocate the first table pointer on the correct NUMA node.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:34 +02:00
Jason Gunthorpe
977fc27ca7 iommu/amd: Remove amd_io_pgtable::pgtbl_cfg
This struct is already in iop.cfg; we don't need two.

AMD is using this API sort of wrong: the cfg is supposed to be passed in,
and then the allocation function will allocate ops memory and copy the
passed config into the new memory. Keep it kind of wrong and pass in the
cfg memory that is already part of the pagetable struct.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:33 +02:00
Jason Gunthorpe
670b57796c iommu/amd: Rename struct amd_io_pgtable iopt to pgtbl
There is a struct protection_domain iopt and a struct amd_io_pgtable
iopt. The next patches are going to want to write domain.iopt.iopt.xx,
which is quite unnatural to read.

Give one of them a different name. amd_io_pgtable has fewer references,
so call it pgtbl, to match pgtbl_cfg, instead.

Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:38:32 +02:00
Jason Gunthorpe
322d889ae7 iommu/amd: Remove amd_iommu_domain_update() from page table freeing
It is a serious bug if the domain is still mapped to any DTEs when it is
freed as we immediately start freeing page table memory, so any remaining
HW touch will UAF.

If it is not mapped then dev_list is empty and amd_iommu_domain_update()
does nothing.

Remove it and add a WARN_ON() to catch this class of bug.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:43 +02:00
Jason Gunthorpe
7a41dcb52f iommu/amd: Set the pgsize_bitmap correctly
When using io_pgtable the correct pgsize_bitmap is stored in the cfg, both
v1_alloc_pgtable() and v2_alloc_pgtable() set it correctly.

This fixes a bug where the v2 pgtable had the wrong pgsize, as
protection_domain_init_v2() would set it and then do_iommu_domain_alloc()
would immediately reset it.

Remove the confusing ops.pgsize_bitmap since that is not used if the
driver sets domain.pgsize_bitmap.

Fixes: 134288158a ("iommu/amd: Add domain_alloc_user based domain allocation")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:42 +02:00
Jason Gunthorpe
8d00b77a52 iommu/amd: Move allocation of the top table into v1_alloc_pgtable
All the page table memory should be allocated/freed within the
io_pgtable struct. The v2 path is already doing this; make it consistent.

It is hard to see but the free of the root in protection_domain_free() is
a NOP on the success path because v1_free_pgtable() does
amd_iommu_domain_clr_pt_root().

The root memory is already freed because free_sub_pt() put it on the
freelist. The free path in protection_domain_free() is only used during
error unwind of protection_domain_alloc().

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:37:41 +02:00
Vasant Hegde
89ffb2c3c2 iommu/amd: Make amd_iommu_dev_update_dte() static
As it is only used inside iommu.c. Also rename the function to
dev_update_dte(), since it is now static.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:58 +02:00
Vasant Hegde
a3303762eb iommu/amd: Rework amd_iommu_update_and_flush_device_table()
Remove the separate functions for updating and flushing the device
table, as only amd_iommu_update_and_flush_device_table() calls them.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:57 +02:00
Vasant Hegde
964877dc26 iommu/amd: Make amd_iommu_domain_flush_complete() static
The AMD driver uses the amd_iommu_domain_flush_complete() function to
make sure the IOMMU has processed invalidation commands before
proceeding. Ideally this should be called from the functions that update
the DTE or invalidate caches; there is no need to call it explicitly.
This patch makes the following changes:

- Rename amd_iommu_domain_flush_complete() -> domain_flush_complete()
  and make it a static function.

- Rearrange domain_flush_complete() to avoid a forward declaration.

- Update amd_iommu_update_and_flush_device_table() to call
  domain_flush_complete().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:56 +02:00
Vasant Hegde
845bd6ac43 iommu/amd: Make amd_iommu_dev_flush_pasid_all() static
As it is not used outside iommu.c. Also rename it to dev_flush_pasid_all().

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:55 +02:00
Vasant Hegde
293aa9ec69 iommu/amd: Handle error path in amd_iommu_probe_device()
Do not try to set max_pasids in the error path, as dev_data is not allocated there.

Fixes: a0c47f233e ("iommu/amd: Introduce iommu_dev_data.max_pasids")
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:35:55 +02:00
Vasant Hegde
53f1fb0c46 iommu/amd: Make amd_iommu_is_attach_deferred() static
amd_iommu_is_attach_deferred() is a callback function invoked through
iommu_ops. Make it static.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:34:12 +02:00
Vasant Hegde
fdc39b77db iommu/amd: Update event log pointer as soon as processing is complete
Update the event buffer head pointer as soon as the driver completes
processing an entry, so that the IOMMU can write new log entries without
waiting for the driver to finish processing all of them.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240828111029.5429-2-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04 11:34:11 +02:00
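A sketch of the polling-loop change (register offset per the AMD IOMMU
MMIO layout; the loop body is condensed and illustrative):

    while (head != tail) {
            iommu_print_event(iommu, iommu->evt_buf + head);
            head = (head + EVENT_ENTRY_SIZE) % EVT_BUFFER_SIZE;
            /* Publish progress so HW can reuse the consumed slot now,
             * instead of only after the whole batch. */
            writel(head, iommu->mmio_base + MMIO_EVT_HEAD_OFFSET);
    }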
Jason Gunthorpe
6c17c7d593 iommu: Allow ATS to work on VFs when the PF uses IDENTITY
PCI ATS has a global Smallest Translation Unit field that is located in
the PF but shared by all of the VFs.

The expectation is that the STU will be set to the root port's global STU
capability which is driven by the IO page table configuration of the iommu
HW. Today it becomes set when the iommu driver first enables ATS.

Thus, to enable ATS on the VF, the PF must have already had the correct
STU programmed, even if ATS is off on the PF.

Unfortunately the PF only programs the STU when the PF enables ATS. The
iommu drivers tend to leave ATS disabled when IDENTITY translation is
being used.

Thus we can get into a state where the PF is setup to use IDENTITY with
the DMA API while the VF would like to use VFIO with a PAGING domain and
have ATS turned on. This fails because the PF never loaded a PAGING domain
and so it never setup the STU, and the VF can't do it.

The simplest solution is to have the iommu driver set the ATS STU when it
probes the device. This way the ATS STU is loaded immediately at boot time
to all PFs and there is no issue when a VF comes to use it.

Add a new call pci_prepare_ats() which should be called by iommu drivers
in their probe_device() op for every PCI device if the iommu driver
supports ATS. This will setup the STU based on whatever page size
capability the iommu HW has.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/0-v1-0fb4d2ab6770+7e706-ats_vf_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-30 14:29:30 +02:00
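In a driver's probe_device() op, the call looks roughly like this sketch
(the exact placement and the page-shift argument are assumptions):

    /* In the iommu driver's probe_device() op: */
    if (dev_is_pci(dev))
            pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);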
Vasant Hegde
e5e5cc8f73 iommu/amd: Add blocked domain support
Create a global blocked domain with an attach device op. It will clear
the DTE so that all DMA from the device is aborted.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240722115452.5976-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-13 10:44:35 +02:00
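The blocked domain is a static singleton whose only op clears the DTE; a
sketch (the attach helper body is condensed):

    static int blocked_domain_attach_device(struct iommu_domain *domain,
                                            struct device *dev)
    {
            /* Clear the DTE: no translation, all DMA aborted. */
            /* ...clear_dte_entry() + device table flush... */
            return 0;
    }

    static struct iommu_domain blocked_domain = {
            .type = IOMMU_DOMAIN_BLOCKED,
            .ops = &(const struct iommu_domain_ops) {
                    .attach_dev = blocked_domain_attach_device,
            }
    };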
Vasant Hegde
c362f32a59 iommu/amd: Invalidate cache before removing device from domain list
Commit 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix
potential TLB aliasing issue") introduced a per-device domain ID for
domains configured with a v2 page table. In the invalidation path, it
uses the per-device structure (dev_data->gcr3_info.domid) to get the
domain ID.

In the detach_device() path, the current code tries to invalidate the
IOMMU cache after removing dev_data from the domain's device list. This
means that when the domain is configured with a v2 page table,
amd_iommu_domain_flush_all() will not be able to invalidate the cache, as
the device has already been removed from the domain's device list.

This causes change-domain tests (changing the domain type from identity
to DMA) to fail with IO_PAGE_FAULT errors.

Hence invalidate the cache and update the DTE before updating the data
structures.

Reported-by: FahHean Lee <fahhean.lee@amd.com>
Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Fixes: 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Tested-by: Sairaj Arun Kodilkar <sairaj.arunkodilkar@amd.com>
Tested-by: FahHean Lee <fahhean.lee@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20240620060552.13984-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-06-27 12:13:48 +02:00
Vasant Hegde
526606b0a1 iommu/amd: Fix Invalid wait context issue
With commit c4cb231111 ("iommu/amd: Add support for enable/disable IOPF")
we are hitting the issue below. This happens because the IOPF enablement
path holds a spinlock with IRQs disabled and then tries to take a mutex.

dmesg:
-----
[    0.938739] =============================
[    0.938740] [ BUG: Invalid wait context ]
[    0.938742] 6.10.0-rc1+ #1 Not tainted
[    0.938745] -----------------------------
[    0.938746] swapper/0/1 is trying to lock:
[    0.938748] ffffffff8c9f01d8 (&port_lock_key){....}-{3:3}, at: serial8250_console_write+0x78/0x4a0
[    0.938767] other info that might help us debug this:
[    0.938768] context-{5:5}
[    0.938769] 7 locks held by swapper/0/1:
[    0.938772]  #0: ffff888101a91310 (&group->mutex){+.+.}-{4:4}, at: bus_iommu_probe+0x70/0x160
[    0.938790]  #1: ffff888101d1f1b8 (&domain->lock){....}-{3:3}, at: amd_iommu_attach_device+0xa5/0x700
[    0.938799]  #2: ffff888101cc3d18 (&dev_data->lock){....}-{3:3}, at: amd_iommu_attach_device+0xc5/0x700
[    0.938806]  #3: ffff888100052830 (&iommu->lock){....}-{2:2}, at: amd_iommu_iopf_add_device+0x3f/0xa0
[    0.938813]  #4: ffffffff8945a340 (console_lock){+.+.}-{0:0}, at: _printk+0x48/0x50
[    0.938822]  #5: ffffffff8945a390 (console_srcu){....}-{0:0}, at: console_flush_all+0x58/0x4e0
[    0.938867]  #6: ffffffff82459f80 (console_owner){....}-{0:0}, at: console_flush_all+0x1f0/0x4e0
[    0.938872] stack backtrace:
[    0.938874] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc1+ #1
[    0.938877] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019

Fix the above issue by rearranging code in the attach device path:
  - move device PASID/IOPF enablement outside the lock in the AMD IOMMU
    driver. This is safe, as the core layer holds the group->mutex lock
    before calling iommu_ops->attach_dev.

Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Fixes: c4cb231111 ("iommu/amd: Add support for enable/disable IOPF")
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240530084801.10758-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-06-04 14:00:59 +02:00
Linus Torvalds
f0bae243b2 pci-v6.10-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmZLzNIUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwr/Q//STe2XGKI8bAKqP2wbbkzm+ISnK4A
 Lqf3FEAIXunxDRspszfXKKV2p4vaIkmOFiwIdtp/kWvd0DQn5+ATXJ/iQtp8aFX/
 R+6BQ7EZc2G7fN5fbQuK54+CvmWEpkKEMbXYbd6ivQ14Cijdb3Nbu+w+DYFjS+6C
 k2a9lS1bTW7Xcy0fyiO1w6GQiWqtmOH8U3OlQtIrI0EVkDG9OG1LsLuc92/FgkOo
 REN+sU+hX1K5fHrvm2CtjYDn/9/B6bJ/It22H1dPgUL9nKvKC67fYzosMtUCOX1M
 6XSPjZIuXOmQGeZXHhpSlVwaidxoUjYO98I7nMquxKdCy6yct3geK7ULG/xeQCgD
 ML7MGQB4+sTiSWalXUQaziKqF1FIDEvU3HMGXFWnoBL5l56eRp8KS1EI9Eqk9pU3
 pk9fJaCkcFnkzPtMFzqPOm5q9zUZ6bGbfYb0hs72TUKplmVDhFo2T1YsW2AOyHZ7
 mjuDzUYZX0H7uM1tntA56IgZX+oNOrLvhBt5L5M/BQeCsZFBBUfIcAEaYoL9LwXO
 AYgIG3jdqzHHyAUzutJF+XHKinJLMHm0XVYbFmO6saPhFzrUJSNHqT7NzW1DGGTl
 OnO8e1WNMX1EcnKvnc6fXyGmM3SgVwy45FsbG/zRnhn4uBKqKtjrh6uX/myA22LK
 CSeqSUK9XmXxFNA=
 =xjoS
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Skip E820 checks for MCFG ECAM regions for new (2016+) machines,
     since there's no requirement to describe them in E820 and some
     platforms require ECAM to work (Bjorn Helgaas)

   - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien
     Le Moal)

   - Remove last user and pci_enable_device_io() (Heiner Kallweit)

   - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen)

   - Skip waiting for devices that have been disconnected while
     suspended (Ilpo Järvinen)

   - Clear Secondary Status errors after enumeration since Master Aborts
     and Unsupported Request errors are an expected part of enumeration
     (Vidya Sagar)

  MSI:

   - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas)

  Error handling:

   - Mask Genesys GL975x SD host controller Replay Timer Timeout
     correctable errors caused by a hardware defect; the errors cause
     interrupts that prevent system suspend (Kai-Heng Feng)

   - Fix EDR-related _DSM support, which previously evaluated revision 5
     but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan)

  ASPM:

   - Simplify link state definitions and mask calculation (Ilpo
     Järvinen)

  Power management:

   - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS
     apparently doesn't know how to put them back in D0 (Mario
     Limonciello)

  CXL:

   - Support resetting CXL devices; special handling required because
     CXL Ports mask Secondary Bus Reset by default (Dave Jiang)

  DOE:

   - Support DOE Discovery Version 2 (Alexey Kardashevskiy)

  Endpoint framework:

   - Set endpoint BAR to be 64-bit if the driver says that's all the
     device supports, in addition to doing so if the size is >2GB
     (Niklas Cassel)

   - Simplify endpoint BAR allocation and setting interfaces (Niklas
     Cassel)

  Cadence PCIe controller driver:

   - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof
     Kozlowski)

  Cadence PCIe endpoint driver:

   - Configure endpoint BARs to be 64-bit based on the BAR type, not the
     BAR value (Niklas Cassel)

  Freescale Layerscape PCIe controller driver:

   - Convert DT binding to YAML (Frank Li)

  MediaTek MT7621 PCIe controller driver:

   - Add DT binding missing 'reg' property for child Root Ports
     (Krzysztof Kozlowski)

   - Fix theoretical string truncation in PHY name (Sergio Paracuellos)

  NVIDIA Tegra194 PCIe controller driver:

   - Return success for endpoint probe instead of falling through to the
     failure path (Vidya Sagar)

  Renesas R-Car PCIe controller driver:

   - Add DT binding missing IOMMU properties (Geert Uytterhoeven)

   - Add DT binding R-Car V4H compatible for host and endpoint mode
     (Yoshihiro Shimoda)

  Rockchip PCIe controller driver:

   - Configure endpoint BARs to be 64-bit based on the BAR type, not the
     BAR value (Niklas Cassel)

   - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski)

   - Set the Subsystem Vendor ID, which was previously zero because it
     was masked incorrectly (Rick Wertenbroek)

  Synopsys DesignWare PCIe controller driver:

   - Restructure DBI register access to accommodate devices where this
     requires Refclk to be active (Manivannan Sadhasivam)

   - Remove the deinit() callback, which was only needed by the
     pcie-rcar-gen4, and do it directly in that driver (Manivannan
     Sadhasivam)

   - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean
     up things like eDMA (Manivannan Sadhasivam)

   - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel
     to dw_pcie_ep_init() (Manivannan Sadhasivam)

   - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to
     reflect the actual functionality (Manivannan Sadhasivam)

   - Call dw_pcie_ep_init_registers() directly from all the glue
     drivers, not just those that require active Refclk from the host
     (Manivannan Sadhasivam)

   - Remove the "core_init_notifier" flag, which was an obscure way for
     glue drivers to indicate that they depend on Refclk from the host
     (Manivannan Sadhasivam)

  TI J721E PCIe driver:

   - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli)

   - Add DT binding J722S SoC support (Siddharth Vadapalli)

  TI Keystone PCIe controller driver:

   - Add DT binding missing num-viewport, phys and phy-name properties
     (Jan Kiszka)

  Miscellaneous:

   - Constify and annotate with __ro_after_init (Heiner Kallweit)

   - Convert DT bindings to YAML (Krzysztof Kozlowski)

   - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming
     Zhou)"

* tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits)
  PCI: Do not wait for disconnected devices when resuming
  x86/pci: Skip early E820 check for ECAM region
  PCI: Remove unused pci_enable_device_io()
  ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
  PCI: Update pci_find_capability() stub return types
  PCI: Remove PCI_IRQ_LEGACY
  scsi: vmw_pvscsi: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios
  Revert "genirq/msi: Provide constants for PCI/IMS support"
  Revert "x86/apic/msi: Enable PCI/IMS"
  Revert "iommu/vt-d: Enable PCI/IMS"
  Revert "iommu/amd: Enable PCI/IMS"
  Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support"
  ...
2024-05-21 10:09:28 -07:00
Bjorn Helgaas
72860ff3bb Revert "iommu/amd: Enable PCI/IMS"
This reverts commit fa5745aca1.

IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.

Remove it for now.  We can add it back when a user comes along.

Link: https://lore.kernel.org/r/20240410221307.2162676-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2024-05-15 17:01:58 -05:00
Joerg Roedel
2bd5059c6c Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' into next 2024-05-13 14:06:54 +02:00
Joerg Roedel
5dc72c8a14 Merge branch 'memory-observability' into x86/amd 2024-04-26 12:54:13 +02:00
Joerg Roedel
a4eecd7205 Merge branch 'iommu/fixes' into x86/amd 2024-04-26 12:16:17 +02:00
Vasant Hegde
a5a91e5484 iommu/amd: Add SVA domain support
- Allocate the SVA domain and set up the mmu notifier. In the free path,
  unregister the mmu notifier and free the protection domain.

- Add the mmu notifier callback function. It will retrieve the SVA
  protection domain and invalidate the IO/TLB.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-16-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:08 +02:00
Vasant Hegde
1af95763e0 iommu/amd: Initial SVA support for AMD IOMMU
This includes:
  - Add a data structure to track per protection domain dev/pasid binding
    details. protection_domain->dev_data_list will track the attached
    list of dev_data/PASIDs.

  - Move 'to_pdomain()' to a header file.

  - Add iommu_sva_set_dev_pasid(). It will check whether PASID is
    supported or not, and it adds the PASID to the SVA protection domain
    list as well as to the device GCR3 table.

  - Add iommu_ops.remove_dev_pasid support. It will unbind the PASID from
    the device and remove the pasid data from the protection domain
    device list.

  - Add IOMMU_SVA as a dependency of the AMD_IOMMU driver.

For a given PASID, iommu_set_dev_pasid() will bind all devices to the
same SVA protection domain (1 PASID : 1 SVA protection domain : N
devices). This protection domain is different from the device protection
domain (the one that is mapped in the attach_device() path). The IOMMU
uses the domain ID for caching, invalidation, etc. In SVA mode it will
use the per-device domain ID. Hence, in the invalidation path, we
retrieve the domain ID from the gcr3_info_table structure and use that
for invalidation.

Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-14-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:05 +02:00
Vasant Hegde
c4cb231111 iommu/amd: Add support for enable/disable IOPF
Return success from the enable_feature(IOPF) path, as this interface is
going away. Instead, we will enable/disable IOPF support in the
attach/detach device path.

In the attach device path, if the device is capable of PRI, we add it to
the per-IOMMU IOPF queue and enable PPR support in the IOMMU. The device
is attached to the domain even if enabling PRI or adding the device to
the IOPF queue fails, as the device can continue to work without PRI
support.

The detach device path follows this sequence:
  - Flush the queue for the given device
  - Disable PPR support in DTE[devid]
  - Remove the device from the IOPF queue
  - Disable device PRI

Also add IOMMU_IOPF as a dependency of the AMD_IOMMU driver.

Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:04 +02:00
Suravee Suthikulpanit
405e2f122b iommu/amd: Add support for page response
This generates an AMD IOMMU COMPLETE_PPR_REQUEST for the specified
device with the specified PRI Response Code.

Also update amd_iommu_complete_ppr() to accept a 'struct device' instead
of a pdev, as it just needs a device reference.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:02 +02:00
Vasant Hegde
25efbb0558 iommu/amd: Enable PCI features based on attached domain capability
Commit eda8c2860a ("iommu/amd: Enable device ATS/PASID/PRI capabilities
independently") changed the way device capabilities are enabled while
attaching devices. I missed accounting for the attached domain's
capability. That is, if the domain is not capable of handling PASID/PRI
(e.g. a paging domain with a v1 page table), then enabling the device
feature is not required.

This patch enables PASID/PRI only if the domain is capable of handling
SVA. Also move PCI feature enablement into do_attach() so that the SVA
capability is handled in one place. Finally, make the PRI enable/disable
functions static.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:00 +02:00
Vasant Hegde
c9e8701132 iommu/amd: Setup GCR3 table in advance if domain is SVA capable
SVA can be supported if the domain is in passthrough mode or is a paging
domain with a v2 page table. Current code sets up the GCR3 table only for
domains with a v2 page table. Set up the GCR3 table for all SVA-capable
domains.

  - Move GCR3 init/destroy to separate functions.

  - Change the default GCR3 table to use the MAX supported PASIDs.
    Ideally it should use a 1-level PASID table, as it is using PASID
    zero only, but we don't have support for extending the PASID table
    yet. We will fix this later.

  - When the domain is configured in passthrough mode, allocate the
    default GCR3 table only if the device is SVA capable.

Note that in the attach_device() path it is not known whether the device
will use SVA or not. If the device is attached to a passthrough domain
and doesn't use SVA, then the GCR3 table will never be used and we end up
wasting the memory allocated for it. This is done to avoid a DTE update
when attaching a PASID to the device.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:16:00 +02:00
Vasant Hegde
a0c47f233e iommu/amd: Introduce iommu_dev_data.max_pasids
This variable will track the number of PASIDs supported by the device.
If the IOMMU or the device doesn't support PASID, it will be zero.

It will be used when allocating the GCR3 table to decide the required
number of PASID table levels. Also, the PASID bind path will use this
variable to check whether the device supports PASID or not.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:15:59 +02:00
Suravee Suthikulpanit
e08fcd901c iommu/amd: Move PPR-related functions into ppr.c
This is in preparation for subsequent PPR-related patches. Also remove
the static declaration from certain helper functions so that they can be
reused in other files.

Also rename the functions below:
  alloc_ppr_log        -> amd_iommu_alloc_ppr_log
  iommu_enable_ppr_log -> amd_iommu_enable_ppr_log
  free_ppr_log         -> amd_iommu_free_ppr_log
  iommu_poll_ppr_log   -> amd_iommu_poll_ppr_log

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:15:57 +02:00
Wei Huang
db44bd517f iommu/amd: Add support for enabling/disabling IOMMU features
Add support for struct iommu_ops.dev_{enable/disable}_feat. Please note
that the empty feature switches will be populated by subsequent patches.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:15:57 +02:00
Vasant Hegde
c5ebd09625 iommu/amd: Introduce per device DTE update function
Consolidate the per-device DTE update and flush logic into a separate
function. Also make it a global function, as it will be used in a
subsequent series to update the DTE.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:15:56 +02:00
Vasant Hegde
0f91d07957 iommu/amd: Enhance def_domain_type to handle untrusted device
Previously, the IOMMU core layer forced the IOMMU_DOMAIN_DMA domain for
untrusted devices. This always took precedence over the driver's
def_domain_type(). Commit 59ddce4418 ("iommu: Reorganize
iommu_get_default_domain_type() to respect def_domain_type()") changed
that behaviour: the current code calls def_domain_type(), but if it
doesn't return IOMMU_DOMAIN_DMA for an untrusted device, it returns an
error. This leaves the IOMMU group (and potentially the IOMMU itself) in
an undetermined state.

This patch adds an untrusted-device check to the AMD IOMMU driver, so
that eGPUs behind Thunderbolt work again.

Fine-tuning amd_iommu_def_domain_type() will be done later.

Reported-by: Eric Wagner <ewagner12@gmail.com>
Link: https://lore.kernel.org/linux-iommu/CAHudX3zLH6CsRmLE-yb+gRjhh-v4bU5_1jW_xCcxOo_oUUZKYg@mail.gmail.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3182
Fixes: 59ddce4418 ("iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()")
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: stable@kernel.org # v6.7+
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240423111725.5813-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:09:52 +02:00
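The added check is small; a sketch of its shape (placement inside
amd_iommu_def_domain_type() is an assumption):

    static int amd_iommu_def_domain_type(struct device *dev)
    {
            /* Untrusted devices (e.g. eGPUs behind Thunderbolt) must use
             * a DMA domain, matching what the core used to force. */
            if (dev_is_pci(dev) && to_pci_dev(dev)->untrusted)
                    return IOMMU_DOMAIN_DMA;

            /* ...existing capability-based selection... */
            return 0;
    }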
Robin Murphy
b67483b3c4 iommu/dma: Centralise iommu_setup_dma_ops()
It's somewhat hard to see, but arm64's arch_setup_dma_ops() should only
ever call iommu_setup_dma_ops() after a successful iommu_probe_device(),
which means there should be no harm in achieving the same order of
operations by running it off the back of iommu_probe_device() itself.
This then puts it in line with the x86 and s390 .probe_finalize bodges,
letting us pull it all into the main flow properly. As a bonus this lets
us fold in and de-scope the PCI workaround setup as well.

At this point we can also then pull the call up inside the group mutex,
and avoid having to think about whether iommu_group_store_type() could
theoretically race and free the domain if iommu_setup_dma_ops() ran just
*before* iommu_device_use_default_domain() claims it... Furthermore we
replace one .probe_finalize call completely, since the only remaining
implementations are now one which only needs to run once for the initial
boot-time probe, and two which themselves render that path unreachable.

This leaves us a big step closer to realistically being able to unpick
the variety of different things that iommu_setup_dma_ops() has been
muddling together, and further streamline iommu-dma into core API flows
in future.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> # For Intel IOMMU
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/bebea331c1d688b34d9862eefd5ede47503961b8.1713523152.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26 12:07:26 +02:00
Pasha Tatashin
75114cbaa1 iommu/amd: use page allocation function provided by iommu-pages.h
Convert iommu/amd/* files to use the new page allocation functions
provided in iommu-pages.h.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20240413002522.1101315-4-pasha.tatashin@soleen.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15 14:31:42 +02:00
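The conversion is a before/after at each allocation site; one sketch (the
call site is illustrative, the helpers are from iommu-pages.h):

    /* Before: bare page allocator, invisible to IOMMU accounting */
    pt_root = (void *)get_zeroed_page(GFP_KERNEL);
    free_page((unsigned long)pt_root);

    /* After: accounted IOMMU page allocations */
    pt_root = iommu_alloc_page(GFP_KERNEL);
    iommu_free_page(pt_root);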
Vasant Hegde
84b1cec4fa iommu/amd: Fix possible irq lock inversion dependency issue
LOCKDEP detector reported below warning:
----------------------------------------
[   23.796949] ========================================================
[   23.796950] WARNING: possible irq lock inversion dependency detected
[   23.796952] 6.8.0fix+ #811 Not tainted
[   23.796954] --------------------------------------------------------
[   23.796954] kworker/0:1/8 just changed the state of lock:
[   23.796956] ff365325e084a9b8 (&domain->lock){..-.}-{3:3}, at: amd_iommu_flush_iotlb_all+0x1f/0x50
[   23.796969] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   23.796970]  (pd_bitmap_lock){+.+.}-{3:3}
[   23.796972]

               and interrupts could create inverse lock ordering between them.

[   23.796973]
               other info that might help us debug this:
[   23.796974] Chain exists of:
                 &domain->lock --> &dev_data->lock --> pd_bitmap_lock

[   23.796980]  Possible interrupt unsafe locking scenario:

[   23.796981]        CPU0                    CPU1
[   23.796982]        ----                    ----
[   23.796983]   lock(pd_bitmap_lock);
[   23.796985]                                local_irq_disable();
[   23.796985]                                lock(&domain->lock);
[   23.796988]                                lock(&dev_data->lock);
[   23.796990]   <Interrupt>
[   23.796991]     lock(&domain->lock);

Fix this issue by disabling interrupts when acquiring pd_bitmap_lock
(sketched below).

Note that this is a temporary fix. We have a plan to replace the custom
bitmap allocator with the IDA allocator.

Fixes: 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240404102717.6705-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-12 12:02:16 +02:00
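The fix pattern, sketched against the allocator described above (the
bitmap lookup is condensed):

    unsigned long flags;

    /* _irqsave makes pd_bitmap_lock safe against the softirq path
     * that takes domain->lock. */
    spin_lock_irqsave(&pd_bitmap_lock, flags);
    id = find_first_zero_bit(amd_iommu_pd_alloc_bitmap, MAX_DOMAIN_ID);
    if (id < MAX_DOMAIN_ID)
            __set_bit(id, amd_iommu_pd_alloc_bitmap);
    spin_unlock_irqrestore(&pd_bitmap_lock, flags);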
Vasant Hegde
a0c8bf0a47 iommu/amd: Fix sleeping in atomic context
Commit cf70873e3d ("iommu/amd: Refactor GCR3 table helper functions")
changed the GFP flag we use for the GCR3 table. The original plan was to
move GCR3 table allocation outside the spinlock, but this requires a
complete rework of the attach device path, so we didn't do it as part of
the SVA series. For now, revert the GFP flag to GFP_ATOMIC (same as the
original code).

Fixes: cf70873e3d ("iommu/amd: Refactor GCR3 table helper functions")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240307052738.116035-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-03-08 08:58:24 +01:00
Vasant Hegde
87a6f1f22c iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue
With v1 page table, the AMD IOMMU spec states that the hardware must use
the domain ID to tag its internal translation caches. I/O devices with
different v1 page tables must be given different domain IDs. I/O devices
that share the same v1 page table __may__ be given the same domain ID.
This domain ID management policy is currently implemented by the AMD
IOMMU driver. In this case, only the domain ID is needed when issuing the
INVALIDATE_IOMMU_PAGES command to invalidate the IOMMU translation cache
(TLB).

With v2 page table, the hardware uses the domain ID and PASID as
parameters to tag caches and to issue the INVALIDATE_IOMMU_PAGES command.
Since the GCR3 table is set up per device, there is no guarantee that a
PASID is unique across multiple devices; the same PASID on different
devices could refer to different v2 page tables. In that case, if
multiple devices share the same domain ID, the IOMMU translation cache
for these devices would be polluted due to TLB aliasing.

Hence, avoid the TLB aliasing issue with v2 page table by allocating a
unique domain ID for each device, even when multiple devices share the
same v1 page table. Please note that this fix results in multiple
INVALIDATE_IOMMU_PAGES commands (one per domain ID) when unmapping a
translation.

The domain ID can still be shared until a device starts using PASID. This
code will later be enhanced to allocate a per-device domain ID only when
it is needed.
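
In sketch form, unmap-side invalidation then iterates over the attached
devices and issues one command per domain ID (the field and helper names
are assumptions, not the exact driver code):

  struct iommu_dev_data *dev_data;
  struct iommu_cmd cmd;

  list_for_each_entry(dev_data, &domain->dev_list, list) {
      /* one INVALIDATE_IOMMU_PAGES per per-device domain ID */
      build_inv_iommu_pages(&cmd, address, size,
                            dev_data->gcr3_info.domid,
                            IOMMU_NO_PASID, false);
      iommu_queue_command(get_amd_iommu_from_dev_data(dev_data), &cmd);
  }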

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-18-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:31 +01:00
Suravee Suthikulpanit
c2a6af5e08 iommu/amd: Remove unused GCR3 table parameters from struct protection_domain
They have been moved to struct iommu_dev_data, and the driver has been
ported to use them.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-17-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:30 +01:00
Vasant Hegde
a7b2aff313 iommu/amd: Rearrange device flush code
Consolidate all flush-related code in one place so that it is easy
to maintain.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-16-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:29 +01:00
Vasant Hegde
02b990253d iommu/amd: Remove unused flush pasid functions
The iommu_v2 module has been removed and the v2 page table converted to
use the common flush functions. The GCR3 table has also been moved to
per-device. The PASID-related flush functions are no longer used, hence
remove them.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-15-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:29 +01:00
Suravee Suthikulpanit
cf70873e3d iommu/amd: Refactor GCR3 table helper functions
Refactor the GCR3 table helper functions to use the new per-device
struct gcr3_tbl_info. Use the GFP_KERNEL flag instead of GFP_ATOMIC for
GCR3 table allocation, and modify set_dte_entry() to use the new
per-device GCR3 table.

Also, in the free_gcr3_table() path, replace BUG_ON with WARN_ON_ONCE().
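
The error-path change follows this pattern (the condition and cleanup
helper are illustrative):

  static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info)
  {
      /* was BUG_ON(): warn and bail out instead of killing the machine */
      if (WARN_ON_ONCE(gcr3_info->glx > 2))
          return;

      free_gcr3_tbl_level(gcr3_info); /* illustrative cleanup helper */
  }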

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-14-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:28 +01:00
Suravee Suthikulpanit
fb575d1781 iommu/amd: Refactor protection_domain helper functions
Remove the code that sets up the GCR3 table and only handle domain
create / destroy, since GCR3 is no longer part of a domain.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:28 +01:00
Suravee Suthikulpanit
4ebd4c7f25 iommu/amd: Refactor attaching / detaching device functions
If a domain is configured with a v2 page table, set up the default GCR3
with the domain GCR3 pointer so that all devices in the domain use the
same page table for translation. Also return the page table setup status
from the do_attach() function.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-12-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:27 +01:00
Suravee Suthikulpanit
e8e1aac334 iommu/amd: Refactor helper function for setting / clearing GCR3
Refactor GCR3 helper functions in preparation for using a per-device
GCR3 table.
  * Add a new function update_gcr3 to update the per-device GCR3 table.

  * Remove the per-domain default GCR3 setup during v2 page table
    allocation. A subsequent patch will add support to set up the
    default GCR3 while attaching a device to a domain.

  * Remove amd_iommu_domain_update() from the v2 page table path, as the
    device detach path will take care of updating the domain.

  * Consolidate GCR3 table related code in one place so that it is easy
    to maintain.

  * Rename functions to reflect their usage.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:26 +01:00
Vasant Hegde
b2e8a7f5d2 iommu/amd: Rearrange GCR3 table setup code
Consolidate GCR3 table related code in one place so that it is easy
to maintain.

Note that this patch doesn't move __set_gcr3/__clear_gcr3. The GCR3
table is being moved from per-domain to per-device; a following series
will rework these functions, and they will be moved at that time.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:25 +01:00
Vasant Hegde
474bf01ed9 iommu/amd: Add support for device based TLB invalidation
Add support to invalidate TLB/IOTLB for the given device.

These functions will be used in subsequent patches where we will
introduce per device GCR3 table and SVA support.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:24 +01:00
Vasant Hegde
7b4e5623d8 iommu/amd: Use protection_domain.flags to check page table mode
Page table mode (v1, v2 or pt) is a per-domain property. The recently
added protection_domain.pd_mode tracks the per-domain page table mode.
Use that variable to check the page table mode instead of the global
'amd_iommu_pgtable' in the {map/unmap}_pages path.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:24 +01:00
Suravee Suthikulpanit
fda5108eba iommu/amd: Introduce struct protection_domain.pd_mode
This enum variable is used to track the type of page table used by the
protection domain. It will replace protection_domain.flags in a
subsequent series.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:23 +01:00
Suravee Suthikulpanit
6f35fe5d8a iommu/amd: Introduce get_amd_iommu_from_dev()
Introduce get_amd_iommu_from_dev() and get_amd_iommu_from_dev_data(),
and replace rlookup_amd_iommu() with the new helper functions where
applicable, avoiding an unnecessary loop to look up struct amd_iommu
from struct device.
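
A sketch of the idea: struct amd_iommu embeds a struct iommu_device, so
it can be recovered from the device's already-registered iommu_device
instead of walking the rlookup table (the container_of-based body is an
assumption about the implementation):

  static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
  {
      return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
  }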

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:22 +01:00
Vasant Hegde
a6ffb9b3d7 iommu/amd: Pass struct iommu_dev_data to set_dte_entry()
Pass the iommu_dev_data structure instead of passing individual variables.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-2-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:21 +01:00
Vasant Hegde
2dc9506bfb iommu/amd: Remove redundant error check in amd_iommu_probe_device()
iommu_init_device() is not returning -ENOTSUPP since commit 61289cbaf6
("iommu/amd: Remove old alias handling code").

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 13:16:20 +01:00
Vasant Hegde
a408663766 iommu/amd: Remove unused IOVA_* macro
These macros are not used after commit ac6d704679 ("iommu/dma: Pass
address limit rather than size to iommu_setup_dma_ops()").

No functional change intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-09 11:40:31 +01:00
Linus Torvalds
0dde2bf67b IOMMU Updates for Linux v6.8
Including:
 
 	- Core changes:
 	  - Fix race conditions in device probe path
 	  - Retire IOMMU bus_ops
 	  - Support for passing custom allocators to page table drivers
 	  - Clean up Kconfig around IOMMU_SVA
 	  - Support for sharing SVA domains with all devices bound to
 	    a mm
 	  - Firmware data parsing cleanup
 	  - Tracing improvements for iommu-dma code
 	  - Some smaller fixes and cleanups
 
 	- ARM-SMMU drivers:
 	  - Device-tree binding updates:
 	     - Add additional compatible strings for Qualcomm SoCs
 	     - Document Adreno clocks for Qualcomm's SM8350 SoC
 	  - SMMUv2:
 	    - Implement support for the ->domain_alloc_paging() callback
 	    - Ensure Secure context is restored following suspend of Qualcomm SMMU
 	      implementation
 	  - SMMUv3:
 	    - Disable stalling mode for the "quiet" context descriptor
 	    - Minor refactoring and driver cleanups
 
 	 - Intel VT-d driver:
 	   - Cleanup and refactoring
 
 	 - AMD IOMMU driver:
 	   - Improve IO TLB invalidation logic
 	   - Small cleanups and improvements
 
 	 - Rockchip IOMMU driver:
 	   - DT binding update to add Rockchip RK3588
 
 	 - Apple DART driver:
 	   - Apple M1 USB4/Thunderbolt DART support
 	   - Cleanups
 
 	 - Virtio IOMMU driver:
 	   - Add support for iotlb_sync_map
 	   - Enable deferred IO TLB flushes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmWecQoACgkQK/BELZcB
 GuN5ZxAAzC5QUKAzANx0puk7QhPpKKlbSvj6Q7iRgCLk00KJO1+VQh9v4ouCmXqF
 kn3Ko8gddjhtrgwN0OQ54F39cLUrp1SBemy71K5YOR+vu8VKtwtmawZGeeRZ+k+B
 Eohw58oaXTiR1maYvoLixLYczLrjklqyJOQ1vZ0GxFGxDqrFByAryHDgG/3OCpJx
 C9e6PsLbbfhfqA8Kv97iKcBqniGbXxAMuodqSUG0buQ3oZgfpIP6Bt3EgUzFGPGk
 3BTlYxowS/gkjUWd3fgjQFIFLTA01u9FhpA2Jb0a4v67pUCR64YxHN7rBQ6ZChtG
 kB9laQfU9re79RsHhqQzr0JT9x/eyq7pzGzjp5TV5TPW6IW+sqjMIPhzd9P08Ef7
 BclkCVobx0jSAHOhnnG4QJiKANr2Y2oM3HfsAJccMMY45RRhUKmVqM7jxMPfGn3A
 i+inlee73xTjZXJse1EWG1fmKKMLvX9LDEp4DyOfn9CqVT+7hpZvzPjfbGr937Rm
 JlwXhF3rQXEpOCagEsbt1vOf+V0e9QiCLf1Y2KpkIkDbE5wwSD/2qLm3tFhJG3oF
 fkW+J14Cid0pj+hY0afGe0kOUOIYlimu0nFmSf0pzMH+UktZdKogSfyb1gSDsy+S
 rsZRGPFhMJ832ExqhlDfxqBebqh+jsfKynlskui6Td5C9ZULaHA=
 =q751
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Fix race conditions in device probe path
   - Retire IOMMU bus_ops
   - Support for passing custom allocators to page table drivers
   - Clean up Kconfig around IOMMU_SVA
   - Support for sharing SVA domains with all devices bound to a mm
   - Firmware data parsing cleanup
   - Tracing improvements for iommu-dma code
   - Some smaller fixes and cleanups

  ARM-SMMU drivers:
   - Device-tree binding updates:
      - Add additional compatible strings for Qualcomm SoCs
      - Document Adreno clocks for Qualcomm's SM8350 SoC
   - SMMUv2:
      - Implement support for the ->domain_alloc_paging() callback
      - Ensure Secure context is restored following suspend of Qualcomm
        SMMU implementation
   - SMMUv3:
      - Disable stalling mode for the "quiet" context descriptor
      - Minor refactoring and driver cleanups

  Intel VT-d driver:
   - Cleanup and refactoring

  AMD IOMMU driver:
   - Improve IO TLB invalidation logic
   - Small cleanups and improvements

  Rockchip IOMMU driver:
   - DT binding update to add Rockchip RK3588

  Apple DART driver:
   - Apple M1 USB4/Thunderbolt DART support
   - Cleanups

  Virtio IOMMU driver:
   - Add support for iotlb_sync_map
   - Enable deferred IO TLB flushes"

* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-18 15:16:57 -08:00
Joerg Roedel
75f74f85a4 Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-01-03 09:59:32 +01:00
Jason Gunthorpe
eda1a94caf iommu: Mark dev_iommu_priv_set() with a lockdep
A perfect driver would only call dev_iommu_priv_set() from its probe
callback. We've made it functionally correct to call it from of_xlate()
by adding a lock around that call.

Add a lockdep assertion that iommu_probe_device_lock is held, to
discourage misuse.

Exclude PPC kernels with CONFIG_FSL_PAMU turned on because FSL_PAMU uses a
global static for its priv and abuses priv for its domain.

Remove the pointless stores of NULL, all these are on paths where the core
code will free dev->iommu after the op returns.
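
A sketch of the resulting helper (modulo the exact shape of the FSL_PAMU
carve-out):

  void dev_iommu_priv_set(struct device *dev, void *priv)
  {
      /* FSL_PAMU uses a global static for its priv */
      if (!IS_ENABLED(CONFIG_FSL_PAMU))
          lockdep_assert_held(&iommu_probe_device_lock);

      dev->iommu->priv = priv;
  }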

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-12 10:18:49 +01:00
Vasant Hegde
c7fc12354b iommu/amd/pgtbl_v2: Invalidate updated page ranges only
Enhance __domain_flush_pages() to detect domain page table mode and use
that info to build invalidation commands. So that we can use
amd_iommu_domain_flush_pages() to invalidate v2 page table.

Also pass PASID, gn variable to device_flush_iotlb() so that it can build
IOTLB invalidation command for both v1 and v2 page table.
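
In sketch form, the mode check inside __domain_flush_pages() looks
roughly like this (the predicate and signatures are illustrative):

  static void __domain_flush_pages(struct protection_domain *domain,
                                   u64 address, size_t size)
  {
      struct iommu_cmd cmd;
      ioasid_t pasid = 0;
      bool gn = false;

      /* v2 page table: tag the invalidation with a PASID and set GN */
      if (pdom_is_v2_pgtbl_mode(domain))
          gn = true;

      build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, gn);
      /* ... queue the command, then flush device IOTLBs with (pasid, gn) */
  }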

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:38 +01:00
Vasant Hegde
2c535dd37d iommu/amd: Make domain_flush_pages as global function
- Rename domain_flush_pages() -> amd_iommu_domain_flush_pages() and make
  it a global function.

- Rename amd_iommu_domain_flush_tlb_pde() -> amd_iommu_domain_flush_all()
  and make it static.

- Convert the v1 page table (io_pgtable.c) to use amd_iommu_domain_flush_pages().

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:37 +01:00
Vasant Hegde
8d004ac1c6 iommu/amd: Consolidate amd_iommu_domain_flush_complete() call
Call amd_iommu_domain_flush_complete() from domain_flush_pages().
That way we can remove the explicit calls to
amd_iommu_domain_flush_complete() from various places.
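
Roughly, the consolidation looks like this (sketch):

  static void domain_flush_pages(struct protection_domain *domain,
                                 u64 address, size_t size)
  {
      __domain_flush_pages(domain, address, size);

      /*
       * Completion is now waited for here, so callers no longer need
       * their own amd_iommu_domain_flush_complete() calls.
       */
      amd_iommu_domain_flush_complete(domain);
  }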

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:36 +01:00
Vasant Hegde
bbf85fe10f iommu/amd: Refactor device iotlb invalidation code
build_inv_iotlb_pages() and build_inv_iotlb_pasid() pretty much duplicate
each other. Enhance build_inv_iotlb_pages() to invalidate the guest IOTLB
as well, and remove the build_inv_iotlb_pasid() function.

Suggested-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:36 +01:00
Vasant Hegde
4f0a600799 iommu/amd: Refactor IOMMU tlb invalidation code
build_inv_iommu_pages() and build_inv_iommu_pasid() pretty much
duplicate each other. Hence enhance build_inv_iommu_pages() to
invalidate guest pages as well, and remove build_inv_iommu_pasid().

Suggested-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:35 +01:00
Vasant Hegde
cf62924daf iommu/amd: Add support to invalidate multiple guest pages
The current interface supports invalidating a single page or the entire
guest translation information for a single process address space.

The IOMMU CMD_INV_IOMMU_PAGES and CMD_INV_IOTLB_PAGES commands support
invalidating a range of pages. Add support to invalidate multiple pages.

This is a preparatory patch before consolidating the host and guest
invalidation code into a single function. Following patches will
consolidate the TLB invalidation code.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:35 +01:00
Vasant Hegde
a976da66e8 iommu/amd: Remove redundant passing of PDE bit
The current code always sets the PDE bit in the INVALIDATE_IOMMU_PAGES
command. Hence get rid of the 'pde' variable across functions.

We can re-introduce this bit whenever it is needed.
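
In sketch form, the command builder now sets the bit unconditionally
instead of taking a 'pde' argument (mask name as used by the driver):

  /* was: if (pde) cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK; */
  cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;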

Suggested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:34 +01:00
Vasant Hegde
3f2571fed2 iommu/amd: Remove redundant domain flush from attach_device()
Domain flush was introduced in the attach_device() path to handle the
kdump scenario. The init code was later enhanced to handle kdump, where
it also takes care of flushing everything including the TLB
(see early_enable_iommus()).

Hence remove the redundant flush from the attach_device() function.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:34 +01:00
Vasant Hegde
af3263758b iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches()
Rename the function in line with the driver naming convention.

No functional changes.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231122090215.6191-2-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:25:33 +01:00
Suravee Suthikulpanit
57cdb720ea iommu/amd: Do not flush IRTE when only updating isRun and destination fields
According to the recent update in the AMD IOMMU spec [1], the IsRun and
Destination fields of the Interrupt Remapping Table Entry (IRTE) are not
cached by the IOMMU hardware.

Therefore, do not issue the INVALIDATE_INTERRUPT_TABLE command when
updating IRTE[IsRun] and IRTE[Destination] when IRTE[GuestMode]=1, which
should help improve IOMMU AVIC/x2AVIC performance.

References:
[1] AMD IOMMU Spec Revision (Rev 3.08-PUB)
(Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf)

Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Link: https://lore.kernel.org/r/20231017144236.8287-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-11 15:21:35 +01:00
Kunwu Chan
9abe6c5535 iommu/amd: Set variable amd_dirty_ops to static
Fix the following warning:
drivers/iommu/amd/iommu.c:67:30: warning: symbol
 'amd_dirty_ops' was not declared. Should it be static?

This variable is only used in its defining file, so it should be static.
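
The change itself is one keyword; the member names below are assumptions
based on the dirty-tracking series:

  static const struct iommu_dirty_ops amd_dirty_ops = {
      .set_dirty_tracking   = amd_iommu_set_dirty_tracking,
      .read_and_clear_dirty = amd_iommu_read_and_clear_dirty,
  };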

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231120095342.1102999-1-chentao@kylinos.cn
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27 10:59:23 +01:00
Andrew Cooper
07e8f88568 x86/apic: Drop apic::delivery_mode
This field is set to APIC_DELIVERY_MODE_FIXED in all cases, and is read
exactly once.  Fold the constant in uv_program_mmr() and drop the field.

Searching for the origin of the stale HyperV comment reveals commit
a31e58e129 ("x86/apic: Switch all APICs to Fixed delivery mode") which
notes:

  As a consequence of this change, the apic::irq_delivery_mode field is
  now pointless, but this needs to be cleaned up in a separate patch.

6 years is long enough for this technical debt to have survived.

  [ bp: Fold in
    https://lore.kernel.org/r/20231121123034.1442059-1-andrew.cooper3@citrix.com
  ]

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20231102-x86-apic-v1-1-bf049a2a0ed6@citrix.com
2023-11-21 16:58:54 +01:00
Linus Torvalds
4bbdb725a3 IOMMU Updates for Linux v6.7
Including:
 
 	- Core changes:
 	  - Make default-domains mandatory for all IOMMU drivers
 	  - Remove group refcounting
 	  - Add generic_single_device_group() helper and consolidate
 	    drivers
 	  - Cleanup map/unmap ops
 	  - Scaling improvements for the IOVA rcache depot
 	  - Convert dart & iommufd to the new domain_alloc_paging()
 
 	- ARM-SMMU:
 	  - Device-tree binding update:
 	    - Add qcom,sm7150-smmu-v2 for Adreno on SM7150 SoC
 	  - SMMUv2:
 	    - Support for Qualcomm SDM670 (MDSS) and SM7150 SoCs
 	  - SMMUv3:
 	    - Large refactoring of the context descriptor code to
 	      move the CD table into the master, paving the way
 	      for '->set_dev_pasid()' support on non-SVA domains
 	  - Minor cleanups to the SVA code
 
 	- Intel VT-d:
 	  - Enable debugfs to dump domain attached to a pasid
 	  - Remove an unnecessary inline function.
 
 	- AMD IOMMU:
 	  - Initial patches for SVA support (not complete yet)
 
 	- S390 IOMMU:
 	  - DMA-API conversion and optimized IOTLB flushing
 
 	- Some smaller fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmVJFcEACgkQK/BELZcB
 GuMgDxAAsnYVQjQ7wRkwR0rHARuEaJ+Lz2vkLNH+uYXjBzhFe2bT+ykMcZysAkdK
 A5PMLOFT5Etf+PAqOM0CoIGQFOefAId6uGl7S61Fp9ZWDKhMrOBFWhxGOaufA1Du
 tNvt3i66hwPSDZa82kY3wRCluYtj0aBBzmM6ZTwBwFZdQ7LABMtE8OxisqncVvq0
 H6vhV213fqvhCFSQJ6PnTAEiv70WvWBWygA+Z/gwYf9hypZQae91PNXdK9313a9z
 OvCzGBkL/R5/3KkJd88UhFwyYzyNGxq/DmH1etawYR5gYZ8UT/Z/sYpcx9hlO7qr
 eENPqeQc+YHZXpKqkaq66HBA1FSnXUqRZLl4cVaZahRRMe/yArsBM6R0W1AfkMAR
 rZxwHKoHUWeuHQLMVvmSDNL57h/GJJpTXjRc8HMxLZkVp+ScvnT5XCYHWWzRdCdx
 TcC/pJ1tet0FQ8rw09ovlwpGVA6eojWvcpVbLVLfGN8ZWViSVfvNFoPNb7HsGK6M
 iRi+L41Y7s63cyogC/Gsae2RAvYv29ZpvE91lmon2u+VBlTpMdOFX9EhWS6RqOBF
 cV30bhsw0dyCB7v5jDPtABYEOaR6l1mPLhn1gX3u0Ue/tmPhLX69k4bVWBY6wP3p
 gmmJD9ub8FuPQtFCGPE7/8ZINjGGrfiKO24DNI2Ty3XEeq21hU4=
 =UyWC
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Make default-domains mandatory for all IOMMU drivers
   - Remove group refcounting
   - Add generic_single_device_group() helper and consolidate drivers
   - Cleanup map/unmap ops
   - Scaling improvements for the IOVA rcache depot
   - Convert dart & iommufd to the new domain_alloc_paging()

  ARM-SMMU:
   - Device-tree binding update:
       - Add qcom,sm7150-smmu-v2 for Adreno on SM7150 SoC
   - SMMUv2:
       - Support for Qualcomm SDM670 (MDSS) and SM7150 SoCs
   - SMMUv3:
       - Large refactoring of the context descriptor code to move the CD
         table into the master, paving the way for '->set_dev_pasid()'
         support on non-SVA domains
   - Minor cleanups to the SVA code

  Intel VT-d:
   - Enable debugfs to dump domain attached to a pasid
   - Remove an unnecessary inline function

  AMD IOMMU:
   - Initial patches for SVA support (not complete yet)

  S390 IOMMU:
   - DMA-API conversion and optimized IOTLB flushing

  And some smaller fixes and improvements"

* tag 'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (102 commits)
  iommu/dart: Remove the force_bypass variable
  iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()
  iommu/dart: Convert to domain_alloc_paging()
  iommu/dart: Move the blocked domain support to a global static
  iommu/dart: Use static global identity domains
  iommufd: Convert to alloc_domain_paging()
  iommu/vt-d: Use ops->blocked_domain
  iommu/vt-d: Update the definition of the blocking domain
  iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domain
  Revert "iommu/vt-d: Remove unused function"
  iommu/amd: Remove DMA_FQ type from domain allocation path
  iommu: change iommu_map_sgtable to return signed values
  iommu/virtio: Add __counted_by for struct viommu_request and use struct_size()
  iommu/vt-d: debugfs: Support dumping a specified page table
  iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  iommu/vt-d: debugfs: Dump entry pointing to huge page
  iommu/vt-d: Remove unused function
  iommu/arm-smmu-v3-sva: Remove bond refcount
  iommu/arm-smmu-v3-sva: Remove unused iommu_sva handle
  iommu/arm-smmu-v3: Rename cdcfg to cd_table
  ...
2023-11-09 13:37:28 -08:00
Joerg Roedel
e8cca466a8 Merge branches 'iommu/fixes', 'arm/tegra', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd', 'core' and 's390' into next 2023-10-27 09:13:40 +02:00
Yi Liu
2bdabb8e82 iommu: Pass in parent domain with user_data to domain_alloc_user op
The domain_alloc_user op already accepts user flags for domain
allocation; add a parent domain pointer and driver-specific user data
support as well. The user data is tagged with a type so that iommu
drivers can add their own driver-specific user data per hw_pagetable.

Add a struct iommu_user_data as a bundle of data_ptr/data_len/type from an
iommufd core uAPI structure. Make the user data opaque to the core, since
a userspace driver must match the kernel driver. In the future, if drivers
share some common parameter, there would be a generic parameter as well.

Link: https://lore.kernel.org/r/20231026043938.63898-7-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-26 11:15:57 -03:00
Joao Martins
421a511a29 iommu/amd: Access/Dirty bit support in IOPTEs
IOMMU advertises Access/Dirty bits if the extended feature register reports
it. Relevant AMD IOMMU SDM ref[0] "1.3.8 Enhanced Support for Access and
Dirty Bits"

To enable it, set the DTE flags in bits 7 and 8 to enable access, or
access+dirty. With that, the IOMMU starts marking the A and D flags on
every memory request or ATS translation request. It is up to the VMM to
steer whether to enable dirty tracking, rather than wrongly doing so in
the IOMMU. Relevant AMD IOMMU SDM ref [0], "Table 7. Device Table Entry
(DTE) Field Definitions", particularly the entry "HAD".

Toggling it on and off is relatively simple: set two bits in the DTE and
flush the device's DTE cache.
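
A sketch of the enable path, assuming a DTE_FLAG_HAD mask for DTE[8:7]
(the lookup and flush helpers are named illustratively):

  #define DTE_FLAG_HAD (3ULL << 7) /* Host Access + Dirty, DTE[8:7] */

  static int set_dirty_tracking(struct iommu_dev_data *dev_data, bool enable)
  {
      u64 *dte = get_dte_for_dev(dev_data); /* illustrative lookup */

      if (enable)
          dte[0] |= DTE_FLAG_HAD;
      else
          dte[0] &= ~DTE_FLAG_HAD;

      /* flush the device's DTE cache so HW observes the new bits */
      device_flush_dte(dev_data);
      return 0;
  }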

To get what was dirtied, use the existing AMD io-pgtable support by
walking the pagetables over each IOVA with fetch_pte(). The IOTLB
flushing is left to the caller (much like unmap), and
iommu_dirty_bitmap_record() is the one adding page ranges to invalidate.
This allows the caller to batch the flush over a big span of IOVA space,
without the iommu having to decide when to flush.

Worthwhile sections from AMD IOMMU SDM:

"2.2.3.1 Host Access Support"
"2.2.3.2 Host Dirty Support"

For details on how the IOMMU hardware updates the dirty bit, and what it
expects from its subsequent clearing by the CPU, see:

"2.2.7.4 Updating Accessed and Dirty Bits in the Guest Address Tables"
"2.2.7.5 Clearing Accessed and Dirty Bits"

Quoting the SDM:

"The setting of accessed and dirty status bits in the page tables is
visible to both the CPU and the peripheral when sharing guest page tables.
The IOMMU interlocked operations to update A and D bits must be 64-bit
operations and naturally aligned on a 64-bit boundary"

.. and the IOMMU update sequence for the Dirty bit essentially states:

1. Decodes the read and write intent from the memory access.
2. If P=0 in the page descriptor, fail the access.
3. Compare the A & D bits in the descriptor with the read and write
   intent in the request.
4. If the A or D bits need to be updated in the descriptor:
   * Start atomic operation.
   * Read the descriptor as a 64-bit access.
   * If the descriptor no longer appears to require an update, release
     the atomic lock with no further action and continue to step 5.
   * Calculate the new A & D bits.
   * Write the descriptor as a 64-bit access.
   * End atomic operation.
5. Continue to the next stage of translation or to the memory access.

Access/Dirty bit readout also needs to consider non-default page sizes
(aka replicated PTEs, as mentioned by the manual), as AMD supports all
power-of-two page sizes except 512G.

Select IOMMUFD_DRIVER only if IOMMUFD is enabled, considering that IOMMU
dirty tracking requires IOMMUFD.

Link: https://lore.kernel.org/r/20231024135109.73787-12-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:43 -03:00