Commit Graph

104 Commits

Linus Torvalds
c93529ad4f iommufd 6.17 merge window pull

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This broadly brings the assigned HW command queue support to iommufd.
  This feature is used to improve SVA performance in VMs by avoiding
  paravirtualization traps during SVA invalidations.

  Along the way I think some of the core logic is in a much better state
  to support future driver backed features.

  Summary:

   - IOMMU HW now has features to directly assign HW command queues to a
     guest VM. In this mode the command queue operates on a limited set
     of invalidation commands that are suitable for improving guest
     invalidation performance and easy for the HW to virtualize.

     This brings the generic infrastructure to allow IOMMU drivers to
     expose such command queues through the iommufd uAPI, mmap the
     doorbell pages, and get the guest physical range for the command
     queue ring itself.

   - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built
     on the new iommufd command queue features. It works with the
     existing SMMU driver support for cmdqv in guest VMs.

   - Many precursor cleanups and improvements to support the above
     cleanly, changes to the general ioctl and object helpers, driver
     support for VDEVICE, and mmap pgoff cookie infrastructure.

   - Sequence VDEVICE destruction to always happen before VFIO device
     destruction. When using the above type features, and also in future
     confidential compute, the internal virtual device representation
     becomes linked to HW or CC TSM configuration and objects. If a VFIO
     device is removed from iommufd those HW objects should also be
     cleaned up to prevent a sort of UAF. This became important now that
     we have HW backing the VDEVICE.

   - Fix one syzkaller found error related to math overflows during iova
     allocation"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits)
  iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
  iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
  iommufd: Rename some shortterm-related identifiers
  iommufd/selftest: Add coverage for vdevice tombstone
  iommufd/selftest: Explicitly skip tests for inapplicable variant
  iommufd/vdevice: Remove struct device reference from struct vdevice
  iommufd: Destroy vdevice on idevice destroy
  iommufd: Add a pre_destroy() op for objects
  iommufd: Add iommufd_object_tombstone_user() helper
  iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
  iommufd/selftest: Test reserved regions near ULONG_MAX
  iommufd: Prevent ALIGN() overflow
  iommu/tegra241-cmdqv: import IOMMUFD module namespace
  iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set
  iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support
  iommu/tegra241-cmdqv: Add user-space use support
  iommu/tegra241-cmdqv: Do not statically map LVCMDQs
  iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
  iommu/tegra241-cmdqv: Use request_threaded_irq
  iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
  ...
2025-07-31 12:43:08 -07:00
Nicolin Chen
2c78e74493 iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
It's more flexible to have a get_viommu_size op. Replace static vsmmu_size
and vsmmu_type with that.
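
A minimal sketch of what the op could look like (the exact signature is an
assumption, not copied from the patch):

    /* Driver reports the size of the vIOMMU object it needs for a given
     * type; returning 0 is assumed to mean the type is unsupported. */
    size_t (*get_viommu_size)(struct device *dev,
                              enum iommu_viommu_type viommu_type);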

Link: https://patch.msgid.link/r/20250724221002.1883034-3-nicolinc@nvidia.com
Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-28 12:07:50 -03:00
Nicolin Chen
4dc0d12474 iommu/tegra241-cmdqv: Add user-space use support
The CMDQV HW supports user-space use for virtualization cases. It allows
the VM to issue guest-level TLBI or ATC_INV commands directly to the queue
and executes them without a VMEXIT, as HW will replace the VMID field in a
TLBI command and the SID field in an ATC_INV command with the preset VMID
and SID.

This is built upon the vIOMMU infrastructure by allowing VMM to allocate a
VINTF (as a vIOMMU object) and assign VCMDQs (HW QUEUE objs) to the VINTF.

So firstly, replace the standard vSMMU model with the VINTF implementation
but reuse the standard cache_invalidate op (for unsupported commands) and
the standard alloc_domain_nested op (for standard nested STE).

Each VINTF has two 64KB MMIO pages (128B per logical VCMDQ):
 - Page0 (directly accessed by guest) has all the control and status bits.
 - Page1 (trapped by VMM) has guest-owned queue memory location/size info.

The VMM should trap the guest VM's accesses to the emulated VINTF0's page1
to get the guest-level VCMDQ location/size info and forward that to the
kernel, which translates it to a physical memory location to program the
VCMDQ HW during an allocation call. Then, the VMM should mmap the assigned
VINTF's page0 to the VINTF0 page0 of the guest VM. This allows the guest
OS to read and write the guest-owned VINTF's page0 for direct control of
the VCMDQ HW.

ATC invalidation commands hold an SID, which requires all devices to
register their virtual SIDs in the SID_MATCH registers and their physical
SIDs in the pairing SID_REPLACE registers, so that HW can use those as a
lookup table to replace the virtual SIDs with the correct physical SIDs.
Thus, implement the driver-allocated vDEVICE op with a tegra241_vintf_sid
structure to allocate SID_REPLACE and to program the SIDs accordingly.

This enables the HW accelerated feature for NVIDIA Grace CPU. Compared to
the standard SMMUv3 operating in the nested translation mode trapping CMDQ
for TLBI and ATC_INV commands, this gives a huge performance improvement:
70% to 90% reductions of invalidation time were measured by various DMA
unmap tests running in a guest OS.
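
A rough sketch of the VMM-side flow described above, using the new iommufd
HW queue uAPI; the field and identifier names here are illustrative
assumptions, not the exact uAPI:

    /* 1) The guest wrote its VCMDQ base/size into the emulated page1; the
     *    VMM traps that and asks the kernel to back a HW queue with it. */
    struct iommu_hw_queue_alloc cmd = {
            .viommu_id = vintf_viommu_id,            /* VINTF vIOMMU object */
            .nesting_parent_iova = guest_vcmdq_base, /* from trapped page1 */
            .length = guest_vcmdq_size,
    };
    ioctl(iommufd, IOMMU_HW_QUEUE_ALLOC, &cmd);

    /* 2) mmap the assigned VINTF's page0 (64KB) and expose it as the
     *    guest's VINTF0 page0 for direct control of the VCMDQ HW. */
    page0 = mmap(NULL, 0x10000, PROT_READ | PROT_WRITE, MAP_SHARED,
                 iommufd, vintf_page0_mmap_offset);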

Link: https://patch.msgid.link/r/fb0eab83f529440b6aa181798912a6f0afa21eb0.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 14:34:36 -03:00
Nicolin Chen
9eb6a666df iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
This will be used by Tegra241 CMDQV implementation to report a non-default
HW info data.

Link: https://patch.msgid.link/r/8a3bf5709358eb21aed2e8434534c30ecf83917c.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 14:34:35 -03:00
Nicolin Chen
61dd912ee0 iommu/arm-smmu-v3-iommufd: Add vsmmu_size/type and vsmmu_init impl ops
An impl driver might want to allocate its own type of vIOMMU object, or
the standard IOMMU_VIOMMU_TYPE_ARM_SMMUV3 one with its own SW/HW bits set
up, as the tegra241-cmdqv driver will do when it adds
IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV.

Add vsmmu_size/type and vsmmu_init to struct arm_smmu_impl_ops. Prioritize
them in arm_smmu_get_viommu_size() and arm_vsmmu_init().
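
Sketched shape of the additions (the exact types are assumed from
context):

    struct arm_smmu_impl_ops {
            ...
            const size_t vsmmu_size;          /* impl's vSMMU struct size */
            const enum iommu_viommu_type vsmmu_type;
            int (*vsmmu_init)(struct arm_vsmmu *vsmmu,
                              const struct iommu_user_data *user_data);
    };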

Link: https://patch.msgid.link/r/375ac2b056764534bb7c10ecc4f34a0bae82b108.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 14:34:35 -03:00
Nicolin Chen
c3436d42f8 iommu: Pass in a driver-level user data structure to viommu_init op
The new type of vIOMMU for tegra241-cmdqv allows a user-space VM to use
one of its virtual command queue HW resources exclusively. This requires
user space to mmap the corresponding MMIO page from kernel space for
direct HW control.

To forward the mmap info (offset and length), iommufd should add a
driver-specific data structure to the IOMMUFD_CMD_VIOMMU_ALLOC ioctl, for
the driver to output the info back to user space during the vIOMMU
initialization.

Similar to the existing ioctls and their IOMMU handlers, add a user_data
to viommu_init op to bridge between iommufd and drivers.
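
The resulting op shape might look like this (a sketch; the argument list
is assumed):

    int (*viommu_init)(struct iommufd_viommu *viommu,
                       struct iommu_domain *parent_domain,
                       const struct iommu_user_data *user_data);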

Link: https://patch.msgid.link/r/90bd5637dab7f5507c7a64d2c4826e70431e45a4.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-10 12:38:51 -03:00
Nicolin Chen
4b57c057f9 iommu: Use enum iommu_hw_info_type for type in hw_info op
Replace the raw u32 to make the type clear. No functional changes.

Also simplify the kdoc since the type itself is clear enough.

Link: https://patch.msgid.link/r/651c50dee8ab900f691202ef0204cd5a43fdd6a2.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-10 12:38:50 -03:00
Mikołaj Lenczewski
212c439bdd iommu/arm: Add BBM Level 2 smmu feature
To support BBM Level 2 for userspace mappings, we want to ensure that the
SMMU also supports its own version of BBM Level 2. Luckily, the SMMU spec
(IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI 0487K.a
D8.16.2), so it already guarantees that no aborts are raised when BBM
Level 2 is claimed.

Add the feature and testing for it under arm_smmu_sva_supported().
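
A hypothetical shape for the probe, assuming the BBML level is read out of
an ID register like the other features (the field and macro names are
illustrative):

    reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
    if (FIELD_GET(IDR3_BBML, reg) == IDR3_BBML2)
            smmu->features |= ARM_SMMU_FEAT_BBML2; /* tested in
                                                      arm_smmu_sva_supported() */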

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20250625113435.26849-4-miko.lenczewski@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-06-30 18:09:05 +01:00
Nicolin Chen
3961f2f5da iommu/arm-smmu-v3: Replace arm_vsmmu_alloc with arm_vsmmu_init
To ease the for-driver iommufd APIs, get_viommu_size and viommu_init ops
are introduced.

Sanitize the inputs and report the size of struct arm_vsmmu on success, in
arm_smmu_get_viommu_size().

Place the type sanity check last, because there will soon be an impl-level
get_viommu_size op that requires the same sanity tests to run first; it
can then simply insert its code in front of the
IOMMU_VIOMMU_TYPE_ARM_SMMUV3 check.

The core will ensure the viommu_type is set on the core vIOMMU object and
pass in the same dev pointer, so arm_vsmmu_init() won't need to repeat the
same sanity tests and can simply init the arm_vsmmu struct.

Remove the arm_vsmmu_alloc, completing the replacement.
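
Conceptually, the core-side flow becomes (a simplified sketch, not the
literal core code):

    size = ops->get_viommu_size(dev, viommu_type); /* sanitize + size */
    viommu = alloc_viommu_object(size);            /* illustrative helper:
                                                      core owns allocation */
    viommu->type = viommu_type;                    /* set before init */
    ret = ops->viommu_init(viommu, parent_domain, user_data);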

Link: https://patch.msgid.link/r/64e4b4c33acd26e1bd676e077be80e00fb63f17c.1749882255.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-06-19 15:43:29 -03:00
Jason Gunthorpe
cfea71aea9 iommu/arm-smmu-v3: Put iopf enablement in the domain attach path
SMMUv3 co-mingles FEAT_IOPF and FEAT_SVA behaviors so that fault reporting
doesn't work unless both are enabled. This is not correct and causes
problems for iommufd, which does not enable FEAT_SVA for its fault-capable
domains.

These APIs are both obsolete; update SMMUv3 to use the new method, like
the AMD driver does.

A driver should enable iopf support when a domain with an iopf_handler is
attached, and disable iopf support when the domain is removed.

Move the fault support logic to sva domain allocation and to domain
attach, refusing to create or attach fault capable domains if the HW
doesn't support it.

Move all the logic for controlling the iopf queue under
arm_smmu_attach_prepare(). Keep track of the number of domains on the
master (over all the SSIDs) that require iopf. When the first domain
requiring iopf is attached create the iopf queue, when the last domain is
detached destroy it.

Turn FEAT_IOPF and FEAT_SVA into no ops.

Remove the sva_lock, this is all protected by the group mutex.
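
A sketch of the counting scheme described above (the field and helper
names are illustrative):

    /* attach path */
    if (domain->iopf_handler && master->iopf_refcount++ == 0)
            arm_smmu_enable_iopf(master);  /* first iopf domain: set up queue */

    /* detach path */
    if (old_domain->iopf_handler && --master->iopf_refcount == 0)
            arm_smmu_disable_iopf(master); /* last iopf domain: tear it down */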

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418080130.1844424-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28 13:04:28 +02:00
Nicolin Chen
da0c56520e iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations
There is a DoS concern with the hardware event queue shared among devices
passed through to VMs: too many translation failures belonging to the VMs
could overflow the shared hardware event queue if those VMs or their VMMs
don't handle/recover the devices properly.

The MEV bit in the STE allows the SMMU HW to be configured to merge
similar event records, though there is no guarantee. Set it in a nested
STE for DoS mitigation.

In the future, we might want to enable the MEV for non-nested cases too
such as domain->type == IOMMU_DOMAIN_UNMANAGED or even IOMMU_DOMAIN_DMA.
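
In the nested STE construction this amounts to a single bit (sketch; the
macro name is assumed to follow the existing STRTAB_STE_1_* scheme):

    target->data[1] |= cpu_to_le64(STRTAB_STE_1_MEV); /* merge similar
                                                         event records */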

Link: https://patch.msgid.link/r/8ed12feef67fc65273d0f5925f401a81f56acebe.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18 14:17:48 -03:00
Nicolin Chen
e7d3fa3d29 iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU
Aside from the IOPF framework, iommufd provides an additional pathway to
report hardware events, via the vEVENTQ of vIOMMU infrastructure.

Define an iommu_vevent_arm_smmuv3 uAPI structure, and report stage-1 events
in the threaded IRQ handler. Also, add another four event record types that
can be forwarded to a VM.
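
The uAPI record can simply carry the four 64-bit words of a native event
(a sketch consistent with the text; the exact layout is assumed):

    struct iommu_vevent_arm_smmuv3 {
            __aligned_le64 evt[4];  /* one 256-bit event record, native
                                       SMMUv3 format */
    };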

Link: https://patch.msgid.link/r/5cf6719682fdfdabffdb08374cdf31ad2466d75a.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18 14:17:48 -03:00
Nicolin Chen
f0ea207ed7 iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
Use it to store all vSMMU-related data. The vsid (Virtual Stream ID) will
be the first use case. Since the vsid reader will be the eventq handler
that already holds a streams_mutex, reuse that to fence the vmaster too.

Also add a pair of arm_smmu_attach_prepare/commit_vmaster helpers to set
or unset the master->vmaster pointer. Put the helpers inside the existing
arm_smmu_attach_prepare/commit().

For identity/blocked ops that don't call arm_smmu_attach_prepare/commit(),
add a simpler arm_smmu_master_clear_vmaster helper to unset the vmaster.
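
A minimal sketch of the new structure (fields inferred from the text):

    struct arm_smmu_vmaster {
            struct arm_vsmmu *vsmmu;
            u32 vsid;       /* Virtual Stream ID; read by the eventq handler
                               under streams_mutex */
    };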

Link: https://patch.msgid.link/r/a7f282e1a531279e25f06c651e95d56f6b120886.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18 14:17:48 -03:00
Jason Gunthorpe
48e7b8e284 iommu/arm-smmu-v3: Remove arm_smmu_domain_finalise() during attach
Domains are now always finalized during allocation because the core code
no longer permits a NULL dev argument to domain_alloc_paging/_flags().

Remove the late finalize during attach that supported domains that were
not fully initialized.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v1-0bb8d5313a27+27b-smmuv3_paging_flags_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 23:08:06 +00:00
Pranjal Shrivastava
d814b70b9b iommu/arm-smmu-v3: Log better event records
Currently, the driver dumps the raw hex for a received event record.
Improve this by leveraging `struct arm_smmu_event` for event fields
and log human-readable event records with meaningful information.

Signed-off-by: Pranjal Shrivastava <praan@google.com>
Link: https://lore.kernel.org/r/20241203184906.2264528-3-praan@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 21:43:43 +00:00
Pranjal Shrivastava
43ca55f555 iommu/arm-smmu-v3: Introduce struct arm_smmu_event
Introduce `struct arm_smmu_event` to represent event records.
Parse the relevant fields out of raw event records for ease of use,
and use the new `struct arm_smmu_event` instead.

Signed-off-by: Pranjal Shrivastava <praan@google.com>
Link: https://lore.kernel.org/r/20241203184906.2264528-2-praan@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 21:43:43 +00:00
Linus Torvalds
ceba6f6f33 IOMMU Updates for Linux v6.13

Merge tag 'iommu-updates-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu updates from Joerg Roedel:
 "Core Updates:
   - Convert call-sites using iommu_domain_alloc() to more specific
     versions and remove function
   - Introduce iommu_paging_domain_alloc_flags()
   - Extend support for allocating PASID-capable domains to more drivers
   - Remove iommu_present()
   - Some smaller improvements

  New IOMMU driver for RISC-V

  Intel VT-d Updates:
   - Add domain_alloc_paging support
   - Enable user space IOPFs in non-PASID and non-svm cases
   - Small code refactoring and cleanups
   - Add domain replacement support for pasid

  AMD-Vi Updates:
   - Adapt to iommu_paging_domain_alloc_flags() interface and alloc V2
     page-tables by default
   - Replace custom domain ID allocator with IDA allocator
   - Add ops->release_domain() support
   - Other improvements to device attach and domain allocation code
     paths

  ARM-SMMU Updates:
   - SMMUv2:
      - Return -EPROBE_DEFER for client devices probing before their
        SMMU
      - Devicetree binding updates for Qualcomm MMU-500 implementations
   - SMMUv3:
      - Minor fixes and cleanup for NVIDIA's virtual command queue
        driver
   - IO-PGTable:
      - Fix indexing of concatenated PGDs and extend selftest coverage
      - Remove unused block-splitting support

  S390 IOMMU:
   - Implement support for blocking domain

  Mediatek IOMMU:
   - Enable 35-bit physical address support for mt8186

  OMAP IOMMU driver:
   - Adapt to recent IOMMU core changes and unbreak driver"

* tag 'iommu-updates-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (92 commits)
  iommu/tegra241-cmdqv: Fix alignment failure at max_n_shift
  iommu: Make set_dev_pasid op support domain replacement
  iommu/arm-smmu-v3: Make set_dev_pasid() op support replace
  iommu/vt-d: Add set_dev_pasid callback for nested domain
  iommu/vt-d: Make identity_domain_set_dev_pasid() to handle domain replacement
  iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement
  iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain
  iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement
  iommu/vt-d: Add iommu_domain_did() to get did
  iommu/vt-d: Consolidate the struct dev_pasid_info add/remove
  iommu/vt-d: Add pasid replace helpers
  iommu/vt-d: Refactor the pasid setup helpers
  iommu/vt-d: Add a helper to flush cache for updating present pasid entry
  iommu: Pass old domain to set_dev_pasid op
  iommu/iova: Fix typo 'adderss'
  iommu: Add a kdoc to iommu_unmap()
  iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  iommu/io-pgtable-arm: Remove split on unmap behavior
  iommu/vt-d: Drain PRQs when domain removed from RID
  iommu/vt-d: Drop pasid requirement for prq initialization
  ...
2024-11-22 19:55:10 -08:00
Joerg Roedel
42f0cbb2a2 Merge branches 'intel/vt-d', 'amd/amd-vi' and 'iommufd/arm-smmuv3-nested' into next 2024-11-15 09:27:43 +01:00
Nicolin Chen
d68beb276b iommu/arm-smmu-v3: Support IOMMU_HWPT_INVALIDATE using a VIOMMU object
Implement the vIOMMU's cache_invalidate op for user space to invalidate
the IOTLB entries, Device ATS and CD entries that are cached by hardware.

Add struct iommu_viommu_arm_smmuv3_invalidate defining invalidation
entries that are simply in the native format of a 128-bit TLBI
command. Scan those commands against the permitted command list and fix
their VMID/SID fields to match what is stored in the vIOMMU.
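
So each invalidation entry is just the raw command (a sketch; the array
size matches the 128-bit command named above):

    struct iommu_viommu_arm_smmuv3_invalidate {
            __aligned_le64 cmd[2];  /* native TLBI/ATC_INV command; the
                                       VMID/SID fields get rewritten */
    };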

Link: https://patch.msgid.link/r/12-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Co-developed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 14:11:03 -04:00
Jason Gunthorpe
f27298a82b iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED
The EATS flag needs to flow through the vSTE and into the pSTE, and ensure
physical ATS is enabled on the PCI device.

The physical ATS state must match the VM's idea of EATS as we rely on the
VM to issue the ATS invalidation commands. Thus ATS must remain off at the
device until EATS on a nesting domain turns it on. Attaching a nesting
domain is the point where the invalidation responsibility transfers to
userspace.

Update the ATS logic to track EATS for nesting domains and flush the
ATC whenever the S2 nesting parent changes.

Link: https://patch.msgid.link/r/11-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 14:11:03 -04:00
Jason Gunthorpe
67e4fe3985 iommu/arm-smmu-v3: Use S2FWB for NESTED domains
Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field
works. When S2FWB is supported and enabled the IOPTE will force cachable
access to IOMMU_CACHE memory when nesting with an S1 and deny cachable
access when !IOMMU_CACHE.

When using a single stage of translation, a simple S2 domain, it doesn't
change things for PCI devices as it is just a different encoding for the
existing mapping of the IOMMU protection flags to cachability attributes.
For non-PCI it also changes the combining rules when incoming transactions
have inconsistent attributes.

However, when used with a nested S1, FWB has the effect of preventing the
guest from choosing a MemAttr in its S1 that would cause ordinary DMA to
bypass the cache. Consistent with KVM we wish to deny the guest the
ability to become incoherent with cached memory the hypervisor believes is
cachable so we don't have to flush it.

Allow NESTED domains to be created if the SMMU has S2FWB support and use
S2FWB for NESTING_PARENTS. This is an additional option to CANWBS.

Link: https://patch.msgid.link/r/10-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 14:11:03 -04:00
Jason Gunthorpe
1e8be08d1c iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED
For SMMUv3 an IOMMU_DOMAIN_NESTED is composed of an S2 iommu_domain acting
as the parent and a user provided STE fragment that defines the CD table
and related data with addresses translated by the S2 iommu_domain.

The kernel only permits userspace to control certain allowed bits of the
STE that are safe for user/guest control.

IOTLB maintenance is a bit subtle here: the S1 implicitly includes the S2
translation, but there is no way of knowing which S1 entries refer to a
range of S2.

For the IOTLB we follow ARM's guidance and issue a CMDQ_OP_TLBI_NH_ALL to
flush all ASIDs from the VMID after flushing the S2 on any change to the
S2.

The IOMMU_DOMAIN_NESTED can only be created from inside a VIOMMU as the
invalidation path relies on the VIOMMU to translate virtual stream ID used
in the invalidation commands for the CD table and ATS.

Link: https://patch.msgid.link/r/9-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 14:11:03 -04:00
Nicolin Chen
69d9b312f3 iommu/arm-smmu-v3: Support IOMMU_VIOMMU_ALLOC
Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type. Implement
an arm_vsmmu_alloc().

As an initial step, copy the VMID from s2_parent. A followup series is
required to give the VIOMMU object its own VMID that will be used in all
nesting configurations.

Link: https://patch.msgid.link/r/8-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12 14:09:44 -04:00
Jason Gunthorpe
e9f1f727e6 iommu/arm-smmu-v3: Make set_dev_pasid() op support replace
set_dev_pasid() op is going to be enhanced to support domain replacement
of a pasid. This prepares for this op definition.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20241107122234.7424-13-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-08 14:04:58 +01:00
Jason Gunthorpe
f6681abd41 iommu/arm-smmu-v3: Expose the arm_smmu_attach interface
The arm-smmuv3-iommufd.c file will need to call these functions too.
Remove statics and put them in the header file. Remove the kunit
visibility protections from arm_smmu_make_abort_ste() and
arm_smmu_make_s2_domain_ste().

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-11-05 10:24:17 +00:00
Nicolin Chen
6912ec9182 iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info
For virtualization cases the IDR/IIDR/AIDR values of the actual SMMU
instance need to be available to the VMM so it can construct an
appropriate vSMMUv3 that reflects the correct HW capabilities.

For userspace page tables these values are required to constrain the valid
values within the CD table and the IOPTEs.

The kernel does not sanitize these values. If building a VMM then
userspace is required to only forward bits into a VM that it knows it can
implement. Some bits will also require a VMM to detect if appropriate
kernel support is available such as for ATS and BTM.

Start a new file and kconfig for the advanced iommufd support. This lets
it be compiled out for kernels that are not intended to support
virtualization, and allows distros to leave it disabled until they are
shipping a matching qemu too.
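
The reported data could look like this (a sketch matching the registers
named above; the exact layout is assumed):

    struct iommu_hw_info_arm_smmuv3 {
            __u32 flags;
            __u32 impl;
            __u32 idr[6];   /* SMMU_IDR0..IDR5, unsanitized */
            __u32 iidr;
            __u32 aidr;
    };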

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-11-05 10:24:17 +00:00
Jason Gunthorpe
e89573cf4a iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS
HW with CANWBS is always cache coherent and ignores PCI No Snoop requests
as well. This meets the requirement for IOMMU_CAP_ENFORCE_CACHE_COHERENCY,
so let's return it.

Implement the enforce_cache_coherency() op to reject attaching devices
that don't have CANWBS.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-11-05 10:24:17 +00:00
Jason Gunthorpe
e3b1be2e73 iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg
The members here are being used for both the linear and the 2 level case,
with the meaning of each item slightly different in the two cases.

Split it into a clean union where both cases have their own struct with
their own logical names and correct types.

Adjust all the users to detect linear/2lvl and use the right sub structure
and types consistently.

Remove CTXDESC_CD_DWORDS by changing the last places to use
sizeof(struct arm_smmu_cd).
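
The union could be shaped roughly like this (a sketch of the idea, not the
literal layout):

    struct arm_smmu_ctx_desc_cfg {
            union {
                    struct {
                            struct arm_smmu_cd *table;       /* all CDs, flat */
                    } linear;
                    struct {
                            struct arm_smmu_cdtab_l1 *l1tab; /* 2-level top */
                            struct arm_smmu_cdtab_l2 **l2ptrs;
                    } l2;
            };
            ...
    };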

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:15 +01:00
Jason Gunthorpe
7c567eb1e1 iommu/arm-smmu-v3: Add types for each level of the CD table
As well as indexing helpers arm_smmu_cdtab_l1/2_idx().

Remove CTXDESC_L1_DESC_DWORDS and CTXDESC_CD_DWORDS replacing them all
with type specific calculations.
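
The helpers plausibly reduce to the usual split math (a sketch; the
entries-per-L2-table constant is assumed):

    static inline unsigned int arm_smmu_cdtab_l1_idx(unsigned int ssid)
    {
            return ssid / CTXDESC_L2_ENTRIES;  /* which L1 descriptor */
    }

    static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
    {
            return ssid % CTXDESC_L2_ENTRIES;  /* CD within that L2 table */
    }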

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:15 +01:00
Jason Gunthorpe
c0a25a96de iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
The top of the 2 level CD table is (at most) 1024 entries big, and two
high order allocations are required. One of __le64 which is programmed
into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the
CPU pointer (16k).

There are two copies of the l2ptr_dma, one is stored in the struct
arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to
use. Instead of storing two copies just decode the value from the __le64.
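
Decoding instead of caching is a single mask (sketch; the mask name is an
assumption):

    /* recover the CPU-usable DMA address from the HW descriptor */
    l2ptr_dma = le64_to_cpu(l1_desc->l2ptr) & CTXDESC_L1_DESC_L2PTR_MASK;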

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:15 +01:00
Jason Gunthorpe
8c153ef956 iommu/arm-smmu-v3: Remove strtab_base/cfg
These values can be computed from the other values already stored in the
config. Move the calculation to arm_smmu_write_strtab() and do it directly
before writing the registers.

This moves all the logic to calculate the two registers into one function
from three and saves an unimportant 16 bytes from the arm_smmu_device.

Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:14 +01:00
Jason Gunthorpe
85196f5474 iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
The members here are being used for both the linear and the 2 level case,
with the meaning of each item slightly different in the two cases.

Split it into a clean union where both cases have their own struct with
their own logical names and correct types.

Adjust all the users to detect linear/2lvl and use the right sub structure
and types consistently.

Remove STRTAB_STE_DWORDS by changing the last places to use
sizeof(struct arm_smmu_ste).

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:14 +01:00
Jason Gunthorpe
abb4f9d323 iommu/arm-smmu-v3: Add types for each level of the 2 level stream table
Add types struct arm_smmu_strtab_l1 and l2 to represent the HW layout of
the descriptors, and use them in most places; following patches will get
the remaining places. The sizes of the l1 and l2 HW allocations are
sizeof(struct arm_smmu_strtab_l1/2).

This provides some more clarity than having raw __le64 *'s and sizes
computed via macros.

Remove STRTAB_L1_DESC_DWORDS.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:14 +01:00
Jason Gunthorpe
ce410410f1 iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
Don't open code the index calculations for each level; provide two
functions to do that math and call them in all the places computing
indexes.

Calculate the L1 table size directly based on the max required index from
the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES.

Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT.
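
Analogous to the CD-table side, the math is a simple split (sketch):

    static inline u32 arm_smmu_strtab_l1_idx(u32 sid)
    {
            return sid / STRTAB_NUM_L2_STES;  /* which L1 descriptor */
    }

    static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
    {
            return sid % STRTAB_NUM_L2_STES;  /* STE within the L2 table */
    }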

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-6416877274e1+1af-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:47:14 +01:00
Nicolin Chen
483e0bd888 iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent
It's observed that, when the first 4GB of system memory was reserved, all
VCMDQ allocations failed (even with the smallest qsz in the last attempt):
    arm-smmu-v3: found companion CMDQV device: NVDA200C:00
    arm-smmu-v3: option mask 0x10
    arm-smmu-v3: failed to allocate queue (0x8000 bytes) for vcmdq0
    acpi NVDA200C:00: tegra241_cmdqv: Falling back to standard SMMU CMDQ
    arm-smmu-v3: ias 48-bit, oas 48-bit (features 0x001e1fbf)
    arm-smmu-v3: allocated 524288 entries for cmdq
    arm-smmu-v3: allocated 524288 entries for evtq
    arm-smmu-v3: allocated 524288 entries for priq

This is because the 4GB reserved memory shifted the entire DMA zone from a
lower 32-bit range (on a system without the 4GB carveout) to a higher range,
while the dev->coherent_dma_mask was set to DMA_BIT_MASK(32) by default.

The dma_set_mask_and_coherent() call is done in arm_smmu_device_hw_probe()
of the SMMU driver. So any DMA allocation from tegra241_cmdqv_probe() must
wait until the coherent_dma_mask is correctly set.

Move the vintf/vcmdq structure initialization routine into a different op,
"init_structures". Call it at the end of arm_smmu_init_structures(), where
standard SMMU queues get allocated.

Most of the impl_ops aren't ready until the vintf/vcmdq structures are
initialized, so replace the full impl_ops with an init_ops in
__tegra241_cmdqv_probe().

And switch to tegra241_cmdqv_impl_ops later in arm_smmu_init_structures().
Note that tegra241_cmdqv_impl_ops does not link to the new init_structures
op after this switch, since there is no point in having it once it's done.

Fixes: 918eb5c856 ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV")
Reported-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/530993c3aafa1b0fc3d879b8119e13c629d12e2b.1725503154.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-05 16:07:38 +01:00
Mostafa Saleh
ce7cb08e22 iommu/arm-smmu-v3: Match Stall behaviour for S2
According to the spec (ARM IHI 0070 F.b), in
"5.5 Fault configuration (A, R, S bits)":
    A STE with stage 2 translation enabled and STE.S2S == 0 is
    considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.

Also described in the pseudocode “SteIllegal()”
    if STE.Config == '11x' then
        [..]
        if eff_idr0_stall_model == '10' && STE.S2S == '0' then
            // stall_model forcing stall, but S2S == 0
            return TRUE;

This means S2S must be set when the stall model is
"ARM_SMMU_FEAT_STALL_FORCE", but currently the driver ignores that.

Although the driver could do the minimum and only set S2S for
"ARM_SMMU_FEAT_STALL_FORCE", it is more consistent to match the S1
behaviour, which also sets it for "ARM_SMMU_FEAT_STALL" if the master has
requested stalls.

Also, since S2 stalls are enabled now, report them to the IOMMU layer; for
VFIO devices it will fail anyway, as VFIO doesn't register an iopf
handler.

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Link: https://lore.kernel.org/r/20240830110349.797399-2-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 16:02:27 +01:00
Nicolin Chen
a9d40285bd iommu/tegra241-cmdqv: Limit CMDs for VCMDQs of a guest owned VINTF
When VCMDQs are assigned to a VINTF owned by a guest (HYP_OWN bit unset),
only TLB and ATC invalidation commands are supported by the VCMDQ HW. So,
implement the new cmdq->supports_cmd op to scan the input cmd in order to
make sure that it is supported by the selected queue.

Note that the guest VM shouldn't have the HYP_OWN bit set regardless of
whether the guest kernel driver writes it or not, i.e. the hypervisor
running in the host OS should wire this bit to zero when trapping a write
access to this VINTF_CONFIG register from a guest kernel.
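
The scan plausibly reduces to an opcode allow-list (a sketch; the opcodes
are the driver's existing CMDQ_OP_* values):

    static bool tegra241_guest_vcmdq_supports_cmd(struct arm_smmu_cmdq_ent *ent)
    {
            switch (ent->opcode) {
            case CMDQ_OP_TLBI_NH_ASID:
            case CMDQ_OP_TLBI_NH_VA:
            case CMDQ_OP_ATC_INV:
                    return true;   /* HW fixes up the VMID/SID fields */
            default:
                    return false;  /* route via the standard SMMU CMDQ */
            }
    }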

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/8160292337059b91271045800e5c62f7295e2c24.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:28:25 +01:00
Nicolin Chen
f59e854907 iommu/arm-smmu-v3: Start a new batch if new command is not supported
The VCMDQ in the tegra241-cmdqv driver has a guest mode that supports only
a few invalidation commands. A batch is initialized with a cmdq, so it has
to confirm whether a new command is supported or not.

Add a supports_cmd function pointer to the cmdq structure, where the vcmdq
driver should hook a command scan function. Add an inline helper too so it
can be used by both sides.

If a new command is not supported, simply issue the existing batch and re-
init it as a new batch.
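
The inline helper could be as small as (sketch):

    static inline bool arm_smmu_cmdq_supports_cmd(struct arm_smmu_cmdq *cmdq,
                                                  struct arm_smmu_cmdq_ent *ent)
    {
            return cmdq->supports_cmd ? cmdq->supports_cmd(ent) : true;
    }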

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/aafb24b881504f18c5d0c7c15f2134e40ad2c486.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:28:25 +01:00
Nate Watterson
918eb5c856 iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV
NVIDIA's Tegra241 SoC has CMDQ-Virtualization (CMDQV) hardware, extending
the standard ARM SMMU v3 IP to support multiple VCMDQs with virtualization
capabilities. In terms of command queue, they are very similar to a
standard SMMU CMDQ (or ECMDQs), but only support CS_NONE in the CS field
of CMD_SYNC.

Add a new tegra241-cmdqv driver, and insert its structure pointer into the
existing arm_smmu_device, and then add related function calls in the SMMUv3
driver to interact with the CMDQV driver.

In the CMDQV driver, add a minimal part for the in-kernel support: reserve
VINTF0 for in-kernel use, assign some of the VCMDQs to VINTF0, and select
one VCMDQ based on the current CPU ID to execute supported commands.
This multi-queue design for in-kernel use gives some limited improvements:
up to 20% reduction of invalidation time was measured by a multi-threaded
DMA unmap benchmark, compared to a single queue.

The other part of the CMDQV driver will be user-space support that allows
a hypervisor running on the host OS to talk to the driver for
virtualization use cases, letting VMs use VCMDQs without trapping, i.e. no
VM Exits.
This is designed based on IOMMUFD, and its RFC series is also under review.
It will provide a guest OS a bigger improvement: 70% to 90% reductions of
TLB invalidation time were measured by DMA unmap tests running in a guest,
compared to nested SMMU CMDQ (with trappings).

As the initial version, the CMDQV driver only supports ACPI configurations.
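
A plausible shape for the per-CPU queue selection (names illustrative):

    /* spread invalidations across the VCMDQs assigned to VINTF0 */
    static struct arm_smmu_cmdq *
    tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
    {
            struct tegra241_cmdqv *cmdqv = smmu->tegra241_cmdqv;
            struct tegra241_vintf *vintf = cmdqv->vintfs[0];
            u32 idx = raw_smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;

            return &vintf->vcmdqs[idx]->cmdq;  /* caller falls back to
                                                  smmu->cmdq if unusable */
    }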

Signed-off-by: Nate Watterson <nwatterson@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/dce50490b2c10b7254fb36aa73ed7ffd812b283a.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:28:03 +01:00
Jason Gunthorpe
6de80d6192 iommu/arm-smmu-v3: Add struct arm_smmu_impl_ops
Mimicking the arm-smmu (v2) driver, introduce a struct arm_smmu_impl_ops to
accommodate impl routines.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/8fe9f3805568aabf771fc6706c116459016bf62d.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:23:59 +01:00
Nicolin Chen
b935a5b1c6 iommu/arm-smmu-v3: Add ARM_SMMU_OPT_TEGRA241_CMDQV
The CMDQV extension in NVIDIA Tegra241 SoC only supports CS_NONE in the
CS field of CMD_SYNC. Add a new SMMU option to accommodate that.

Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/a3cb9bb2429fbae4a59f7ef517614d226763d717.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:18:42 +01:00
Nicolin Chen
a7a08b857a iommu/arm-smmu-v3: Make symbols public for CONFIG_TEGRA241_CMDQV
The symbols __arm_smmu_cmdq_skip_err(), arm_smmu_init_one_queue(), and
arm_smmu_cmdq_init() need to be used by the tegra241-cmdqv compilation
unit in a following patch.

Remove the static and put prototypes in the header.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/c4f2aa5f5f40a2e7c68b132c6d3171d6403de57a.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:18:42 +01:00
Nicolin Chen
56ae8866f3 iommu/arm-smmu-v3: Issue a batch of commands to the same cmdq
The driver calls in different places the arm_smmu_get_cmdq() helper, and
it's fine to do so since the helper always returns the single SMMU CMDQ.
However, with the NVIDIA CMDQV extension or SMMU ECMDQ, there can be
multiple cmdqs in the system to select from, and either case requires a
batch of commands to be issued to the same cmdq. Thus, the cmdq has to be
chosen by the higher-level callers.

Add a cmdq pointer in arm_smmu_cmdq_batch structure, and decide the cmdq
when initializing the batch. Pass its pointer down to the bottom function.
Update __arm_smmu_cmdq_issue_cmd() accordingly for single command issuers.
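
A sketch of the extended batch structure (the cmds array is the existing
layout):

    struct arm_smmu_cmdq_batch {
            u64 cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
            struct arm_smmu_cmdq *cmdq;  /* decided once at batch init */
            int num;
    };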

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/2cbf5ddefb6ea611e48d67c642271bd24421eb21.1724970714.git.nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 15:18:41 +01:00
Zhang Zekun
df49881956 iommu/arm-smmu-v3: Remove the unused empty definition
arm_smmu_sva_remove_dev_pasid() has been removed since commit d38c28dbef
("iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain"), but
the empty definition, used when CONFIG_ARM_SMMU_V3_SVA is not set, remains
untouched in the header file. So, let's remove the unused definition.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240815111504.48810-1-zhangzekun11@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 15:32:21 +01:00
Kunkun Jiang
25c776dd03 iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
If io-pgtable quirk flag indicates support for hardware update of
dirty state, enable HA/HD bits in the SMMU CD and also set the DBM
bit in the page descriptor.

Now report the dirty page tracking capability of SMMUv3 and
select IOMMUFD_DRIVER for ARM_SMMU_V3 if IOMMUFD is enabled.
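
In the CD this plausibly comes down to two TCR bits (sketch; the macro
names are assumed to follow the CTXDESC_CD_0_* scheme):

    /* when the io-pgtable was created with the dirty-tracking quirk */
    val |= CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD; /* HW access + dirty */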

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20240703101604.2576-6-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 15:45:47 +01:00
Jean-Philippe Brucker
2f8d6178b4 iommu/arm-smmu-v3: Add feature detection for HTTU
If the SMMU supports it and the kernel was built with HTTU support, probe
for Hardware Translation Table Update (HTTU), which essentially enables
hardware update of the access and dirty flags.

Probe and set the smmu::features bits for Hardware Dirty and Hardware
Access. This is in preparation for enabling it on the stage 1 format
context descriptors.
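
The probe could decode the 2-bit IDR0.HTTU field along these lines
(sketch):

    switch (FIELD_GET(IDR0_HTTU, reg)) {
    case IDR0_HTTU_ACCESS_DIRTY:
            smmu->features |= ARM_SMMU_FEAT_HD;  /* HW dirty flag update */
            fallthrough;
    case IDR0_HTTU_ACCESS:
            smmu->features |= ARM_SMMU_FEAT_HA;  /* HW access flag update */
    }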

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20240703101604.2576-3-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 15:45:47 +01:00
Jason Gunthorpe
a4d75360f7 iommu/arm-smmu-v3: Shrink the strtab l1_desc array
The top of the 2 level stream table is (at most) 128k entries big, and two
high order allocations are required. One of __le64 which is programmed
into the HW (1M), and one of struct arm_smmu_strtab_l1_desc which holds
the CPU pointer (3M).

There is no reason to store the l2ptr_dma as nothing reads it. devm stores
a copy of it and the DMA memory will be freed via devm mechanisms. span is
a constant of 8+1. Remove both.

This removes 16 bytes from each arm_smmu_l1_ctx_desc and saves up to 2M of
memory per iommu instance.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/2-v2-318ed5f6983b+198f-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-02 16:34:16 +01:00
Jason Gunthorpe
f3b273b7c7 iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
The SVA cleanup made the SSID logic entirely general so all we need to do
is call it with the correct cd table entry for a S1 domain.

This is slightly tricky because of the ASID and how the locking works, the
simple fix is to just update the ASID once we get the right locks.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/14-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-02 15:39:48 +01:00
Jason Gunthorpe
8ee9175c25 iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
If the STE doesn't point to the CD table we can upgrade it by
reprogramming the STE with the appropriate S1DSS. We may also need to turn
on ATS at the same time.

Keep track if the installed STE is pointing at the cd_table and the ATS
state to trigger this path.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/13-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-02 15:39:48 +01:00
Jason Gunthorpe
ce26ea9e6e iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
The HW supports this; use the S1DSS bits to configure the behavior
of SSID=0, which is the RID's translation.

If SSID's are currently being used in the CD table then just update the
S1DSS bits in the STE, remove the master_domain and leave ATS alone.

For iommufd the driver design has a small problem: all the unused CD
table entries are set with V=0, which will generate an event if VFIO
userspace tries to use the CD entry. This patch extends this problem to
include the RID as well if PASID is being used.

For BLOCKED with used PASIDs the
F_STREAM_DISABLED (STRTAB_STE_1_S1DSS_TERMINATE) event is generated on
untagged traffic and a substream CD table entry with V=0 (removed pasid)
will generate C_BAD_CD. Arguably there is no advantage to using S1DSS over
the CD entry 0 with V=0.

As we don't yet support PASID in iommufd this is a problem to resolve
later, possibly by using EPD0 for unused CD table entries instead of V=0,
and not using S1DSS for BLOCKED.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-02 15:39:48 +01:00