Commit Graph

58 Commits

Author SHA1 Message Date
Linus Torvalds
c93529ad4f iommufd 6.17 merge window pull
- IOMMU HW now has features to directly assign HW command queues to a
   guest VM. In this mode the command queue operates on a limited set of
   invalidation commands that are suitable for improving guest invalidation
   performance and easy for the HW to virtualize.
 
   This PR brings the generic infrastructure to allow IOMMU drivers to
   expose such command queues through the iommufd uAPI, mmap the doorbell
   pages, and get the guest physical range for the command queue ring
   itself.
 
 - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built on
   the new iommufd command queue features. It works with the existing SMMU
   driver support for cmdqv in guest VMs.
 
 - Many precursor cleanups and improvements to support the above cleanly,
   changes to the general ioctl and object helpers, driver support for
   VDEVICE, and mmap pgoff cookie infrastructure.
 
 - Sequence VDEVICE destruction to always happen before VFIO device
   destruction. When using the above type features, and also in future
   confidential compute, the internal virtual device representation becomes
   linked to HW or CC TSM configuration and objects. If a VFIO device is
   removed from iommufd those HW objects should also be cleaned up to
   prevent a sort of UAF. This became important now that we have HW backing
   the VDEVICE.
 
 - Fix one syzkaller found error related to math overflows during iova
   allocation
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaIpl9AAKCRCFwuHvBreF
 YS5tAP9MDIRML5a/2IOhzcsc4LiDkWTMKm2m1wcRYd+iU2aFVQEAjdghINLHrUlx
 HVuIDvNvWIUED/oTAp5kCxQ7PBFN4gU=
 =NmCO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This broadly brings the assigned HW command queue support to iommufd.
  This feature is used to improve SVA performance in VMs by avoiding
  paravirtualization traps during SVA invalidations.

  Along the way I think some of the core logic is in a much better state
  to support future driver backed features.

  Summary:

   - IOMMU HW now has features to directly assign HW command queues to a
     guest VM. In this mode the command queue operates on a limited set
     of invalidation commands that are suitable for improving guest
     invalidation performance and easy for the HW to virtualize.

     This brings the generic infrastructure to allow IOMMU drivers to
     expose such command queues through the iommufd uAPI, mmap the
     doorbell pages, and get the guest physical range for the command
     queue ring itself.

   - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built
     on the new iommufd command queue features. It works with the
     existing SMMU driver support for cmdqv in guest VMs.

   - Many precursor cleanups and improvements to support the above
     cleanly, changes to the general ioctl and object helpers, driver
     support for VDEVICE, and mmap pgoff cookie infrastructure.

   - Sequence VDEVICE destruction to always happen before VFIO device
     destruction. When using the above type features, and also in future
     confidential compute, the internal virtual device representation
     becomes linked to HW or CC TSM configuration and objects. If a VFIO
     device is removed from iommufd those HW objects should also be
     cleaned up to prevent a sort of UAF. This became important now that
     we have HW backing the VDEVICE.

   - Fix one syzkaller found error related to math overflows during iova
     allocation"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits)
  iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
  iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
  iommufd: Rename some shortterm-related identifiers
  iommufd/selftest: Add coverage for vdevice tombstone
  iommufd/selftest: Explicitly skip tests for inapplicable variant
  iommufd/vdevice: Remove struct device reference from struct vdevice
  iommufd: Destroy vdevice on idevice destroy
  iommufd: Add a pre_destroy() op for objects
  iommufd: Add iommufd_object_tombstone_user() helper
  iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
  iommufd/selftest: Test reserved regions near ULONG_MAX
  iommufd: Prevent ALIGN() overflow
  iommu/tegra241-cmdqv: import IOMMUFD module namespace
  iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set
  iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support
  iommu/tegra241-cmdqv: Add user-space use support
  iommu/tegra241-cmdqv: Do not statically map LVCMDQs
  iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
  iommu/tegra241-cmdqv: Use request_threaded_irq
  iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
  ...
2025-07-31 12:43:08 -07:00
Xu Yilun
39a369c341 iommufd/selftest: Add coverage for vdevice tombstone
This tests the flow to tombstone vdevice when idevice is to be unbound
before vdevice destruction. The expected results of the tombstone are:

 - The vdevice ID can't be reused anymore (not tested in this patch).
 - Even ioctl(IOMMU_DESTROY) can't free the vdevice ID.
 - iommufd_fops_release() can still free everything.

Link: https://patch.msgid.link/r/20250716070349.1807226-8-yilun.xu@linux.intel.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18 17:33:08 -03:00
Xu Yilun
c4e496d413 iommufd/selftest: Explicitly skip tests for inapplicable variant
no_viommu is not applicable for some viommu/vdevice tests. Explicitly
report the skipping, don't do it silently.

Opportunistically adjust the line wrappings after the indentation
changes using git clang-format.

Only add the prints. No functional change intended.

Link: https://patch.msgid.link/r/20250716070349.1807226-7-yilun.xu@linux.intel.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18 17:33:08 -03:00
Jason Gunthorpe
5d8b1d957d iommufd/selftest: Test reserved regions near ULONG_MAX
This has triggered an overflow inside the ioas iova auto allocation logic,
test it directly. Use the same stimulus syzkaller found.

Link: https://patch.msgid.link/all/2-v1-7b4a16fc390b+10f4-iommufd_alloc_overflow_jgg@nvidia.com/
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18 17:33:07 -03:00
Nicolin Chen
3a35f7d4a4 iommufd/selftest: Update hw_info coverage for an input data_type
Test both IOMMU_HW_INFO_TYPE_DEFAULT and IOMMU_HW_INFO_TYPE_SELFTEST, and
add a negative test for an unsupported type.

Also drop the unused mask in test_cmd_get_hw_capabilities() as checkpatch
is complaining.

Link: https://patch.msgid.link/r/f01a1e50cd7366f217cbf192ad0b2b79e0eb89f0.1752126748.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 14:34:35 -03:00
Nicolin Chen
80478a2b45 iommufd/selftest: Add coverage for the new mmap interface
Extend the loopback test to a new mmap page.

Link: https://patch.msgid.link/r/b02b1220c955c3cf9ea5dd9fe9349ab1b4f8e20b.1752126748.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 14:34:35 -03:00
Nicolin Chen
20896914da iommufd/selftest: Add coverage for IOMMUFD_CMD_HW_QUEUE_ALLOC
Some simple tests for IOMMUFD_CMD_HW_QUEUE_ALLOC infrastructure covering
the new iommufd_hw_queue_depend/undepend() helpers.

Link: https://patch.msgid.link/r/e8a194d187d7ef445f43e4a3c04fb39472050afd.1752126748.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 11:09:26 -03:00
Nicolin Chen
0e3e0b0c08 iommufd/selftest: Add coverage for viommu data
Extend the existing test_cmd/err_viommu_alloc helpers to accept optional
user data. And add a TEST_F for a loopback test.

Link: https://patch.msgid.link/r/8ceb64d30e9953f29270a7d341032ca439317271.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-10 12:38:51 -03:00
Nicolin Chen
9a96876e3c iommufd/selftest: Fix build warnings due to uninitialized mfd
Commit 869c788909 ("selftests: harness: Stop using setjmp()/longjmp()")
changed the harness structure. For some unknown reason, two build warnings
occur to the iommufd selftest:

iommufd.c: In function ‘wrapper_iommufd_mock_domain_all_aligns’:
iommufd.c:1807:17: warning: ‘mfd’ may be used uninitialized in this function
 1807 |                 close(mfd);
      |                 ^~~~~~~~~~
iommufd.c:1767:13: note: ‘mfd’ was declared here
 1767 |         int mfd;
      |             ^~~
iommufd.c: In function ‘wrapper_iommufd_mock_domain_all_aligns_copy’:
iommufd.c:1870:17: warning: ‘mfd’ may be used uninitialized in this function
 1870 |                 close(mfd);
      |                 ^~~~~~~~~~
iommufd.c:1819:13: note: ‘mfd’ was declared here
 1819 |         int mfd;
      |             ^~~

All the mfd have been used in the variant->file path only, so it's likely
a false alarm.

FWIW, the commit mentioned above does not cause this, yet it might affect
gcc in a certain way that resulted in the warnings. It is also found that
ading a dummy setjmp (which doesn't make sense) could mute the warnings:
https://lore.kernel.org/all/aEi8DV+ReF3v3Rlf@nvidia.com/

The job of this selftest is to catch kernel bug, while such warnings will
unlikely disrupt its role. Mute the warning by force initializing the mfd
and add an ASSERT_GT().

Link: https://patch.msgid.link/r/6951d85d5cd34cbf22abab7714542654e63ecc44.1750787928.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-06-24 15:45:13 -03:00
Nicolin Chen
a9bf67ee17 iommufd/selftest: Add asserts testing global mfd
The mfd and mfd_buffer will be used in the tests directly without an extra
check. Test them in setup_sizes() to ensure they are safe to use.

Fixes: 0bcceb1f51 ("iommufd: Selftest coverage for IOMMU_IOAS_MAP_FILE")
Link: https://patch.msgid.link/r/94bdc11d2b6d5db337b1361c5e5fce0ed494bb40.1750787928.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-06-24 15:45:12 -03:00
Nicolin Chen
8186255705 iommufd/selftest: Fix iommufd_dirty_tracking with large hugepage sizes
The hugepage test cases of iommufd_dirty_tracking have the 64MB and 128MB
coverages. Both of them are smaller than the default hugepage size 512MB,
when CONFIG_PAGE_SIZE_64KB=y. However, these test cases have a variant of
using huge pages, which would mmap(MAP_HUGETLB) using these smaller sizes
than the system hugepag size. This results in the kernel aligning up the
smaller size to 512MB. If a memory was located between the upper 64/128MB
size boundary and the hugepage 512MB boundary, it would get wiped out:
https://lore.kernel.org/all/aEoUhPYIAizTLADq@nvidia.com/

Given that this aligning up behavior is well documented, we have no choice
but to allocate a hugepage aligned size to avoid this unintended wipe out.
Instead of relying on the kernel's internal force alignment, pass the same
size to posix_memalign() and map().

Also, fix the FIXTURE_TEARDOWN() misusing munmap() to free the memory from
posix_memalign(), as munmap() doesn't destroy the allocator meta data. So,
call free() instead.

Fixes: a9af47e382 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Link: https://patch.msgid.link/r/1ea8609ae6d523fdd4d8efb179ddee79c8582cb6.1750787928.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-06-24 15:45:12 -03:00
Yi Liu
7be11d34f6 iommufd: Test attach before detaching pasid
Check if the pasid has been attached before going further in the detach
path. This fixes a crash found by syzkaller. Add a selftest as well.

   Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASI
   KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
   CPU: 1 UID: 0 PID: 668 Comm: repro Not tainted 6.14.0-next-20250325-eb4bc4b07f66 #1 PREEMPT(voluntary)
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org4
   RIP: 0010:iommufd_hw_pagetable_detach+0x8a/0x4d0
   Code: 00 00 00 44 89 ee 48 89 c7 48 89 75 c8 48 89 45 c0 e8 ca 55 17 02 48 89 c2 49 89 c4 48 b8 00 00 00b
   RSP: 0018:ffff888021b17b78 EFLAGS: 00010246
   RAX: dffffc0000000000 RBX: ffff888014b5a000 RCX: ffff888021b17a64
   RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801dad07fc
   RBP: ffff888021b17bc8 R08: 0000000000000001 R09: 0000000000000001
   R10: 0000000000000001 R11: ffff88801dad0e58 R12: 0000000000000000
   R13: 0000000000000001 R14: ffff888021b17e18 R15: ffff8880132d3008
   FS:  00007fca52013600(0000) GS:ffff8880e3684000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00000000200006c0 CR3: 00000000112d0005 CR4: 0000000000770ef0
   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
   PKRU: 55555554
   Call Trace:
    <TASK>
    iommufd_device_detach+0x2a/0x2e0
    iommufd_test+0x2f99/0x5cd0
    iommufd_fops_ioctl+0x38e/0x520
    __x64_sys_ioctl+0x1ba/0x220
    x64_sys_call+0x122e/0x2150
    do_syscall_64+0x6d/0x150
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Link: https://patch.msgid.link/r/20250328133448.22052-1-yi.l.liu@intel.com
Reported-by: Lai Yi <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/linux-iommu/Z+X0tzxhiaupJT7b@ly-workstation
Fixes: c0e301b297 ("iommufd/device: Add pasid_attach array to track per-PASID attach")
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28 11:40:41 -03:00
Yi Liu
6d9500bb1f iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO
IOMMU_HW_INFO is extended to report max_pasid_log2, hence add coverage
for it.

Link: https://patch.msgid.link/r/20250321180143.8468-6-yi.l.liu@intel.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-28 10:07:23 -03:00
Yi Liu
d57a1fb342 iommufd/selftest: Add coverage for iommufd pasid attach/detach
This tests iommufd pasid attach/replace/detach.

Link: https://patch.msgid.link/r/20250321171940.7213-19-yi.l.liu@intel.com
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-25 10:18:31 -03:00
Nicolin Chen
97717a1f28 iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage
Trigger vEVENTs by feeding an idev ID and validating the returned output
virt_ids whether they equal to the value that was set to the vDEVICE.

Link: https://patch.msgid.link/r/e829532ec0a3927d61161b7674b20e731ecd495b.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18 14:17:48 -03:00
Nicolin Chen
941d0719aa iommufd/selftest: Require vdev_id when attaching to a nested domain
When attaching a device to a vIOMMU-based nested domain, vdev_id must be
present. Add a piece of code hard-requesting it, preparing for a vEVENTQ
support in the following patch. Then, update the TEST_F.

A HWPT-based nested domain will return a NULL new_viommu, thus no such a
vDEVICE requirement.

Link: https://patch.msgid.link/r/4051ca8a819e51cb30de6b4fe9e4d94d956afe3d.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18 14:17:48 -03:00
Yi Liu
1062d81086 iommufd: Disallow allocating nested parent domain with fault ID
Allocating a domain with a fault ID indicates that the domain is faultable.
However, there is a gap for the nested parent domain to support PRI. Some
hardware lacks the capability to distinguish whether PRI occurs at stage 1
or stage 2. This limitation may require software-based page table walking
to resolve. Since no in-tree IOMMU driver currently supports this
functionality, it is disallowed. For more details, refer to the related
discussion at [1].

[1] https://lore.kernel.org/linux-iommu/bd1655c6-8b2f-4cfa-adb1-badc00d01811@intel.com/

Link: https://patch.msgid.link/r/20250226104012.82079-1-yi.l.liu@intel.com
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-04 09:34:54 -04:00
Steve Sistare
c0dec4b848 iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest
Add selftest cases for IOMMU_IOAS_CHANGE_PROCESS.

Link: https://patch.msgid.link/r/1731527497-16091-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-14 13:42:35 -04:00
Nicolin Chen
49ad127719 iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctl
Add a viommu_cache test function to cover vIOMMU invalidations using the
updated IOMMU_HWPT_INVALIDATE ioctl, which now allows passing in a vIOMMU
via its hwpt_id field.

Link: https://patch.msgid.link/r/f317f902041f3d05deaee4ca3fdd8ef4b8297361.1730836308.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 11:46:19 -04:00
Nicolin Chen
576ad6eb45 iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test command
Similar to IOMMU_TEST_OP_MD_CHECK_IOTLB verifying a mock_domain's iotlb,
IOMMU_TEST_OP_DEV_CHECK_CACHE will be used to verify a mock_dev's cache.

Link: https://patch.msgid.link/r/cd4082079d75427bd67ed90c3c825e15b5720a5f.1730836308.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 11:46:19 -04:00
Nicolin Chen
54ce69e36c iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATE
With a vIOMMU object, use space can flush any IOMMU related cache that can
be directed via a vIOMMU object. It is similar to the IOMMU_HWPT_INVALIDATE
uAPI, but can cover a wider range than IOTLB, e.g. device/desciprtor cache.

Allow hwpt_id of the iommu_hwpt_invalidate structure to carry a viommu_id,
and reuse the IOMMU_HWPT_INVALIDATE uAPI for vIOMMU invalidations. Drivers
can define different structures for vIOMMU invalidations v.s. HWPT ones.

Since both the HWPT-based and vIOMMU-based invalidation pathways check own
cache invalidation op, remove the WARN_ON_ONCE in the allocator.

Update the uAPI, kdoc, and selftest case accordingly.

Link: https://patch.msgid.link/r/b411e2245e303b8a964f39f49453a5dff280968f.1730836308.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 11:46:19 -04:00
Nicolin Chen
5778c75703 iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage
Add a vdevice_alloc op to the viommu mock_viommu_ops for the coverage of
IOMMU_VIOMMU_TYPE_SELFTEST allocations. Then, add a vdevice_alloc TEST_F
to cover the IOMMU_VDEVICE_ALLOC ioctl.

Link: https://patch.msgid.link/r/4b9607e5b86726c8baa7b89bd48123fb44104a23.1730836308.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 11:46:18 -04:00
Nicolin Chen
7156cd9ef2 iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
Add a new iommufd_viommu FIXTURE and setup it up with a vIOMMU object.

Any new vIOMMU feature will be added as a TEST_F under that.

Link: https://patch.msgid.link/r/abe267c9d004b29cb1712ceba2f378209d4b7e01.1730836219.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 11:46:18 -04:00
Steve Sistare
0bcceb1f51 iommufd: Selftest coverage for IOMMU_IOAS_MAP_FILE
Add test cases to exercise IOMMU_IOAS_MAP_FILE.

Link: https://patch.msgid.link/r/1729861919-234514-10-git-send-email-steven.sistare@oracle.com
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-10-28 13:24:24 -03:00
Jason Gunthorpe
996dc53ac2 iommufd: Do not allow creating areas without READ or WRITE
This results in passing 0 or just IOMMU_CACHE to iommu_map(). Most of
the page table formats don't like this:

  amdv1 - -EINVAL
  armv7s - returns 0, doesn't update mapped
  arm-lpae - returns 0 doesn't update mapped
  dart - returns 0, doesn't update mapped
  VT-D - returns -EINVAL

Unfortunately the three formats that return 0 cause serious problems:

 - Returning ret = but not uppdating mapped from domain->map_pages()
   causes an infinite loop in __iommu_map()

 - Not writing ioptes means that VFIO/iommufd have no way to recover them
   and we will have memory leaks and worse during unmap

Since almost nothing can support this, and it is a useless thing to do,
block it early in iommufd.

Cc: stable@kernel.org
Fixes: aad37e71d5 ("iommufd: IOCTLs for the io_pagetable")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/1-v1-1211e1294c27+4b1-iommu_no_prot_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-26 09:16:13 +02:00
Jason Gunthorpe
18dcca2496 Merge branch 'iommufd_pri' into iommufd for-next
Lu Baolu says:

====================
This series implements the functionality of delivering IO page faults to
user space through the IOMMUFD framework. One feasible use case is the
nested translation. Nested translation is a hardware feature that supports
two-stage translation tables for IOMMU. The second-stage translation table
is managed by the host VMM, while the first-stage translation table is
owned by user space. This allows user space to control the IOMMU mappings
for its devices.

When an IO page fault occurs on the first-stage translation table, the
IOMMU hardware can deliver the page fault to user space through the
IOMMUFD framework. User space can then handle the page fault and respond
to the device top-down through the IOMMUFD. This allows user space to
implement its own IO page fault handling policies.

User space application that is capable of handling IO page faults should
allocate a fault object, and bind the fault object to any domain that it
is willing to handle the fault generatd for them. On a successful return
of fault object allocation, the user can retrieve and respond to page
faults by reading or writing to the file descriptor (FD) returned.

The iommu selftest framework has been updated to test the IO page fault
delivery and response functionality.
====================

* iommufd_pri:
  iommufd/selftest: Add coverage for IOPF test
  iommufd/selftest: Add IOPF support for mock device
  iommufd: Associate fault object with iommufd_hw_pgtable
  iommufd: Fault-capable hwpt attach/detach/replace
  iommufd: Add iommufd fault object
  iommufd: Add fault and response message definitions
  iommu: Extend domain attach group with handle support
  iommu: Add attach handle to struct iopf_group
  iommu: Remove sva handle list
  iommu: Introduce domain attachment handle

Link: https://lore.kernel.org/all/20240702063444.105814-1-baolu.lu@linux.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-07-09 13:55:05 -03:00
Lu Baolu
d1211768b6 iommufd/selftest: Add coverage for IOPF test
Extend the selftest tool to add coverage of testing IOPF handling. This
would include the following tests:

- Allocating and destroying an iommufd fault object.
- Allocating and destroying an IOPF-capable HWPT.
- Attaching/detaching/replacing an IOPF-capable HWPT on a device.
- Triggering an IOPF on the mock device.
- Retrieving and responding to the IOPF through the file interface.

Link: https://lore.kernel.org/r/20240702063444.105814-11-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-07-09 13:54:32 -03:00
Joao Martins
ffa3c799ce iommufd/selftest: Fix tests to use MOCK_PAGE_SIZE based buffer sizes
commit a9af47e382 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
added tests covering edge cases in the boundaries of iova bitmap. Although
it used buffer sizes thinking in PAGE_SIZE (4K) as opposed to the
MOCK_PAGE_SIZE (2K) that is used in iommufd mock selftests. This meant that
isn't correctly exercising everything specifically the u32 and 4K bitmap
test cases. Fix selftests buffer sizes to be based on mock page size.

Link: https://lore.kernel.org/r/20240627110105.62325-5-joao.m.martins@oracle.com
Reported-by: Kevin Tian <kevin.tian@intel.com>
Closes: https://lore.kernel.org/linux-iommu/96efb6cf-a41c-420f-9673-2f0b682cac8c@oracle.com/
Fixes: a9af47e382 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-06-28 13:12:22 -03:00
Joao Martins
33335584eb iommufd/selftest: Add tests for <= u8 bitmap sizes
Add more tests for bitmaps smaller than or equal to an u8, though skip the
tests if the IOVA buffer size is smaller than the mock page size.

Link: https://lore.kernel.org/r/20240627110105.62325-4-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-06-28 13:12:22 -03:00
Joao Martins
ec61f820a2 iommufd/selftest: Fix dirty bitmap tests with u8 bitmaps
With 64k base pages, the first 128k iova length test requires less than a
byte for a bitmap, exposing a bug in the tests that assume that bitmaps are
at least a byte.

Rather than dealing with bytes, have _test_mock_dirty_bitmaps() pass the
number of bits. The caller functions are adjusted to also use bits as well,
and converting to bytes when clearing, allocating and freeing the bitmap.

Link: https://lore.kernel.org/r/20240627110105.62325-2-joao.m.martins@oracle.com
Reported-by: Matt Ochs <mochs@nvidia.com>
Fixes: a9af47e382 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-06-28 13:12:22 -03:00
Joao Martins
fe13166f05 iommufd/selftest: Add mock IO hugepages tests
Leverage previously added MOCK_FLAGS_DEVICE_HUGE_IOVA flag to create an
IOMMU domain with more than MOCK_IO_PAGE_SIZE supported.

Plumb the hugetlb backing memory for buffer allocation and change the
expected page size to MOCK_HUGE_PAGE_SIZE (1M) when hugepage variant test
cases are used. These so far are limited to 128M and 256M IOVA range tests
cases which is when 1M hugepages can be used.

Link: https://lore.kernel.org/r/20240202133415.23819-9-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06 11:31:46 -04:00
Joao Martins
407fc184f0 iommufd/selftest: Refactor dirty bitmap tests
Rework the functions that test and set the bitmaps to receive a new
parameter (the pte_page_size) that reflects the expected PTE size in the
page tables. The same scheme is still used i.e. even bits are dirty and
odd page indexes aren't dirty. Here it just refactors to consider the size
of the PTE rather than hardcoded to IOMMU mock base page assumptions.

While at it, refactor dirty bitmap tests to use the idev_id created by the
fixture instead of creating a new one.

This is in preparation for doing tests with IOMMU hugepages where multiple
bits set as part of recording a whole hugepage as dirty and thus the
pte_page_size will vary depending on io hugepages or io base pages.

Link: https://lore.kernel.org/r/20240202133415.23819-6-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06 11:31:45 -04:00
Joao Martins
42af951145 iommufd/selftest: Test u64 unaligned bitmaps
Exercise the dirty tracking bitmaps with byte unaligned addresses in
addition to the PAGE_SIZE unaligned bitmaps, using a address towards the
end of the page boundary.

In doing so, increase the tailroom we allocate for the bitmap from
MOCK_PAGE_SIZE(2K) into PAGE_SIZE(4K), such that we can test end of bitmap
boundary.

Link: https://lore.kernel.org/r/20240202133415.23819-4-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-02-06 11:31:45 -04:00
Nicolin Chen
bf26eb83fd iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Add test cases for the IOMMU_HWPT_INVALIDATE ioctl and verify it by using
the new IOMMU_TEST_OP_MD_CHECK_IOTLB.

Link: https://lore.kernel.org/r/20240111041015.47920-7-yi.l.liu@intel.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-01-11 13:01:25 -04:00
Nicolin Chen
e1fa6640d5 iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
Allow to test whether IOTLB has been invalidated or not.

Link: https://lore.kernel.org/r/20240111041015.47920-6-yi.l.liu@intel.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-01-11 13:01:25 -04:00
Nicolin Chen
55a01657cb iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs
The IOMMU_HWPT_ALLOC ioctl now supports passing user_data to allocate a
user-managed domain for nested HWPTs. Add its coverage for that. Also,
update _test_cmd_hwpt_alloc() and add test_cmd/err_hwpt_alloc_nested().

Link: https://lore.kernel.org/r/20231026043938.63898-11-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-26 11:15:57 -03:00
Joao Martins
0795b305da iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag
Change test_mock_dirty_bitmaps() to pass a flag where it specifies the flag
under test. The test does the same thing as the GET_DIRTY_BITMAP regular
test. Except that it tests whether the dirtied bits are fetched all the
same a second time, as opposed to observing them cleared.

Link: https://lore.kernel.org/r/20231024135109.73787-19-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins
ae36fe70ce iommufd/selftest: Test out_capabilities in IOMMU_GET_HW_INFO
Enumerate the capabilities from the mock device and test whether it
advertises as expected. Include it as part of the iommufd_dirty_tracking
fixture.

Link: https://lore.kernel.org/r/20231024135109.73787-18-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins
a9af47e382 iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP
Add a new test ioctl for simulating the dirty IOVAs in the mock domain, and
implement the mock iommu domain ops that get the dirty tracking supported.

The selftest exercises the usual main workflow of:

1) Setting dirty tracking from the iommu domain
2) Read and clear dirty IOPTEs

Different fixtures will test different IOVA range sizes, that exercise
corner cases of the bitmaps.

Link: https://lore.kernel.org/r/20231024135109.73787-17-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins
7adf267d66 iommufd/selftest: Test IOMMU_HWPT_SET_DIRTY_TRACKING
Change mock_domain to supporting dirty tracking and add tests to exercise
the new SET_DIRTY_TRACKING API in the iommufd_dirty_tracking selftest
fixture.

Link: https://lore.kernel.org/r/20231024135109.73787-16-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Joao Martins
266ce58989 iommufd/selftest: Test IOMMU_HWPT_ALLOC_DIRTY_TRACKING
In order to selftest the iommu domain dirty enforcing implement the
mock_domain necessary support and add a new dev_flags to test that the
hwpt_alloc/attach_device fails as expected.

Expand the existing mock_domain fixture with a enforce_dirty test that
exercises the hwpt_alloc and device attachment.

Link: https://lore.kernel.org/r/20231024135109.73787-15-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-24 11:58:44 -03:00
Nicolin Chen
266dcae34d iommufd/selftest: Rework TEST_LENGTH to test min_size explicitly
TEST_LENGTH passing ".size = sizeof(struct _struct) - 1" expects -EINVAL
from "if (ucmd.user_size < op->min_size)" check in iommufd_fops_ioctl().
This has been working when min_size is exactly the size of the structure.

However, if the size of the structure becomes larger than min_size, i.e.
the passing size above is larger than min_size, that min_size sanity no
longer works.

Since the first test in TEST_LENGTH() was to test that min_size sanity
routine, rework it to support a min_size calculation, rather than using
the full size of the structure.

Link: https://lore.kernel.org/r/20231015074648.24185-1-nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-16 11:05:51 -03:00
Yi Liu
408663619f iommufd/selftest: Add domain_alloc_user() support in iommu mock
Add mock_domain_alloc_user() and a new test case for
IOMMU_HWPT_ALLOC_NEST_PARENT.

Link: https://lore.kernel.org/r/20230928071528.26258-6-yi.l.liu@intel.com
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-10 13:31:24 -03:00
Nicolin Chen
bb812e0069 iommufd/selftest: Iterate idev_ids in mock_domain's alloc_hwpt test
The point in iterating variant->mock_domains is to test the idev_ids[0]
and idev_ids[1]. So use it instead of keeping testing idev_ids[0] only.

Link: https://lore.kernel.org/r/20230919011637.16483-1-nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-09-26 13:07:02 -03:00
GuokaiXu
474d9f3276 iommufd: Fix spelling errors in comments
requres -> requires
dramtically -> dramatically

Link: https://lore.kernel.org/r/31680D47D9533D91+20230904023236.GA12494@xgk8823
Signed-off-by: GuokaiXu <xuguokai@ucas.com.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-09-19 09:09:22 -03:00
Nicolin Chen
af4fde93c3 iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl
Add a mock_domain_hw_info function and an iommu_test_hw_info data
structure. This allows to test the IOMMU_GET_HW_INFO ioctl passing the
test_reg value for the mock_dev.

Link: https://lore.kernel.org/r/20230818101033.4100-5-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-18 12:52:15 -03:00
Nicolin Chen
c154660b6e iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
Add a new IOMMU_TEST_OP_ACCESS_REPLACE_IOAS to allow replacing the
access->ioas, corresponding to the iommufd_access_replace() helper.

Then add replace coverage as a part of user_copy test case, which
basically repeats the copy test after replacing the old ioas with a new
one.

Link: https://lore.kernel.org/r/a4897f93d41c34b972213243b8dbf4c3832842e4.1690523699.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-28 13:31:24 -03:00
Jason Gunthorpe
6583c865de iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
Test the basic flow.

Link: https://lore.kernel.org/r/19-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-26 10:20:41 -03:00
Jason Gunthorpe
7a467e02b3 iommufd/selftest: Return the real idev id from selftest mock_domain
Now that we actually call iommufd_device_bind() we can return the
idev_id from that function to userspace for use in other APIs.

Link: https://lore.kernel.org/r/18-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-26 10:20:36 -03:00
Nicolin Chen
fa1ffdb9e2 iommufd/selftest: Test iommufd_device_replace()
Allow the selftest to call the function on the mock idev, add some tests
to exercise it.

Link: https://lore.kernel.org/r/16-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-26 10:20:26 -03:00