We currently call arm64_mm_context_put() without holding a reference to
the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
ensure the mm only gets freed after we unpinned the ASID.
Fixes: 32784a9562 ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/r/20220426130444.300556-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
The arm_smmu_mm_invalidate_range function is designed to be called
by mm core for Shared Virtual Addressing purpose between IOMMU and
CPU MMU. However, the ways of two subsystems defining their "end"
addresses are slightly different. IOMMU defines its "end" address
using the last address of an address range, while mm core defines
that using the following address of an address range:
include/linux/mm_types.h:
unsigned long vm_end;
/* The first byte after our end address ...
This mismatch resulted in an incorrect calculation for size so it
failed to be page-size aligned. Further, it caused a dead loop at
"while (iova < end)" check in __arm_smmu_tlb_inv_range function.
This patch fixes the issue by doing the calculation correctly.
Fixes: 2f7e8c553e ("iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20220419210158.21320-1-nicolinc@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
Including:
- IOMMU Core changes:
- Removal of aux domain related code as it is basically dead
and will be replaced by iommu-fd framework
- Split of iommu_ops to carry domain-specific call-backs
separatly
- Cleanup to remove useless ops->capable implementations
- Improve 32-bit free space estimate in iova allocator
- Intel VT-d updates:
- Various cleanups of the driver
- Support for ATS of SoC-integrated devices listed in
ACPI/SATC table
- ARM SMMU updates:
- Fix SMMUv3 soft lockup during continuous stream of events
- Fix error path for Qualcomm SMMU probe()
- Rework SMMU IRQ setup to prepare the ground for PMU support
- Minor cleanups and refactoring
- AMD IOMMU driver:
- Some minor cleanups and error-handling fixes
- Rockchip IOMMU driver:
- Use standard driver registration
- MSM IOMMU driver:
- Minor cleanup and change to standard driver registration
- Mediatek IOMMU driver:
- Fixes for IOTLB flushing logic
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmI8OUkACgkQK/BELZcB
GuNz9xAAvlgEg3byMx1y6LY9IctVGyLsegweVGM4+m6XR7qvT5Llc1E2Yw4Gooe4
EAceOihDKW2T9VnMlz9g/cG7Modrx60chcB22KKfxDXPl6yF3R89EMd7DE43T6n/
KPrP9+EsBnI8QSXyYu9ZowioX4CYwWhWD0dKHKAwDvw0BWHHUJ4hTaoHqEoIqLdP
vubeHziIok/g1sylSpJjTzV7r/Na8Q3TGcb/Mi5qC8uiyiyx40vtaduMGNW+/ToN
EqOKszxPmHfHv/xf0IHo0eUZ2L/JAe0mAlZzOb09f5F2sXJrbC05jlmRaDmSjT+u
iEc1r2By/0xo6iOuQC3wD6kTvwwO/ecpNYGhXYXdTbtLquYfL5PSXjRHEU9gf2BO
i/llPDsnytPvm/hnmbi26ChNR6yrDPz5bkoCUl5mnB1jZcaZtIURN7cRlEPPZUWo
62VDNdqWDB6AvALc1/SwYdJX/i5eaBf+niS7/BJ/IkLp2oJxFzrGsU8SRJFHNYsa
zdFIUUoTw647Ul6derSpGzHow169/RwVKYPiXMsaA8/viPNjpBOtfg56abn1WLW6
N4CtwNu6tt+sPfftFdFx2cDEMW2zpWg5doMddBfEx9FAk0HJ4WLZiTpaO2PxcLyd
kCAsGHj+ViAZHINVKFV4nQN/V9yQtcIc4UPmSGJBtKCK3KUYujw=
=bcqr
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- IOMMU Core changes:
- Removal of aux domain related code as it is basically dead and
will be replaced by iommu-fd framework
- Split of iommu_ops to carry domain-specific call-backs separatly
- Cleanup to remove useless ops->capable implementations
- Improve 32-bit free space estimate in iova allocator
- Intel VT-d updates:
- Various cleanups of the driver
- Support for ATS of SoC-integrated devices listed in ACPI/SATC
table
- ARM SMMU updates:
- Fix SMMUv3 soft lockup during continuous stream of events
- Fix error path for Qualcomm SMMU probe()
- Rework SMMU IRQ setup to prepare the ground for PMU support
- Minor cleanups and refactoring
- AMD IOMMU driver:
- Some minor cleanups and error-handling fixes
- Rockchip IOMMU driver:
- Use standard driver registration
- MSM IOMMU driver:
- Minor cleanup and change to standard driver registration
- Mediatek IOMMU driver:
- Fixes for IOTLB flushing logic
* tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits)
iommu/amd: Improve amd_iommu_v2_exit()
iommu/amd: Remove unused struct fault.devid
iommu/amd: Clean up function declarations
iommu/amd: Call memunmap in error path
iommu/arm-smmu: Account for PMU interrupts
iommu/vt-d: Enable ATS for the devices in SATC table
iommu/vt-d: Remove unused function intel_svm_capable()
iommu/vt-d: Add missing "__init" for rmrr_sanity_check()
iommu/vt-d: Move intel_iommu_ops to header file
iommu/vt-d: Fix indentation of goto labels
iommu/vt-d: Remove unnecessary prototypes
iommu/vt-d: Remove unnecessary includes
iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
iommu/vt-d: Remove domain and devinfo mempool
iommu/vt-d: Remove iova_cache_get/put()
iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
iommu/vt-d: Remove intel_iommu::domains
iommu/mediatek: Always tlb_flush_all when each PM resume
iommu/mediatek: Add tlb_lock in tlb_flush_all
iommu/mediatek: Remove the power status checking in tlb flush all
...
Move the domain specific operations out of struct iommu_ops into a new
structure that only has domain specific operations. This solves the
problem of needing to know if the method vector for a given operation
needs to be retrieved from the device or the domain. Logically the domain
ops are the ones that make sense for external subsystems and endpoint
drivers to use, while device ops, with the sole exception of domain_alloc,
are IOMMU API internals.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
PASIDs are process-wide. It was attempted to use refcounted PASIDs to
free them when the last thread drops the refcount. This turned out to
be complex and error prone. Given the fact that the PASID space is 20
bits, which allows up to 1M processes to have a PASID associated
concurrently, PASID resource exhaustion is not a realistic concern.
Therefore, it was decided to simplify the approach and stick with lazy
on demand PASID allocation, but drop the eager free approach and make an
allocated PASID's lifetime bound to the lifetime of the process.
Get rid of the refcounting mechanisms and replace/rename the interfaces
to reflect this new approach.
[ bp: Massage commit message. ]
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com
During event processing, events are read from the event queue one
by one until the queue is empty.If the master device continuously
requests address access at the same time and the SMMU generates
events, the cyclic processing of the event takes a long time and
softlockup warnings may be reported.
arm-smmu-v3 arm-smmu-v3.34.auto: event 0x0a received:
arm-smmu-v3 arm-smmu-v3.34.auto: 0x00007f220000280a
arm-smmu-v3 arm-smmu-v3.34.auto: 0x000010000000007e
arm-smmu-v3 arm-smmu-v3.34.auto: 0x00000000034e8670
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [irq/268-arm-smm:247]
Call trace:
_dev_info+0x7c/0xa0
arm_smmu_evtq_thread+0x1c0/0x230
irq_thread_fn+0x30/0x80
irq_thread+0x128/0x210
kthread+0x134/0x138
ret_from_fork+0x10/0x1c
Kernel panic - not syncing: softlockup: hung tasks
Fix this by calling cond_resched() after the event information is
printed.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
Link: https://lore.kernel.org/r/20220119070754.26528-1-zhouguanghui1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
kmalloc_array()/kcalloc() should be used to avoid potential overflow when
a multiplication is needed to compute the size of the requested memory.
So turn a devm_kzalloc()+explicit size computation into an equivalent
devm_kcalloc().
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/3f7b9b202c6b6f5edc234ab7af5f208fbf8bc944.1644274051.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Will Deacon <will@kernel.org>
Treewide cleanup and consolidation of MSI interrupt handling in
preparation for further changes in this area which are necessary to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmHf+SETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYobzGD/wNEFl5qQo5mNZ9thP6JSJFOItm7zMc
2QgzCYOqNwAv4jL6Dqo+EHtbShYqDyWzKdKccgqNjmdIqgW8q7/fubN1OPzRsClV
CZG997AsXDGXYlQcE3tXZjkeCWnWEE2AGLnygSkFV1K/r9ALAtFfTBJAWB+UD+Zc
1P8Kxo0q0Jg+DQAMAA5bWfSSjo/Pmpr/1AFjY7+GA8BBeJJgWOyW7H1S+GYEWVOE
RaQP81Sbd6x1JkopxkNqSJ/lbNJfnPJxi2higB56Y0OYn5CuSarYbZUM7oQ2V61t
jN7pcEEvTpjLd6SJ93ry8WOcJVMTbccCklVfD0AfEwwGUGw2VM6fSyNrZfnrosUN
tGBEO8eflBJzGTAwSkz1EhiGKna4o1NBDWpr0sH2iUiZC5G6V2hUDbM+0PQJhDa8
bICwguZElcUUPOprwjS0HXhymnxghTmNHyoEP1yxGoKLTrwIqkH/9KGustWkcBmM
hNtOCwQNqxcOHg/r3MN0KxttTASgoXgNnmFliAWA7XwseRpLWc95XPQFa5sptRhc
EzwumEz17EW1iI5/NyZQcY+jcZ9BdgCqgZ9ECjZkyN4U+9G6iACUkxVaHUUs77jl
a0ISSEHEvJisFOsOMYyFfeWkpIKGIKP/bpLOJEJ6kAdrUWFvlRGF3qlav3JldXQl
ypFjPapDeB5guw==
=vKzd
-----END PGP SIGNATURE-----
Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI irq updates from Thomas Gleixner:
"Rework of the MSI interrupt infrastructure.
This is a treewide cleanup and consolidation of MSI interrupt handling
in preparation for further changes in this area which are necessary
to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space"
* tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
genirq/msi: Populate sysfs entry only once
PCI/MSI: Unbreak pci_irq_get_affinity()
genirq/msi: Convert storage to xarray
genirq/msi: Simplify sysfs handling
genirq/msi: Add abuse prevention comment to msi header
genirq/msi: Mop up old interfaces
genirq/msi: Convert to new functions
genirq/msi: Make interrupt allocation less convoluted
platform-msi: Simplify platform device MSI code
platform-msi: Let core code handle MSI descriptors
bus: fsl-mc-msi: Simplify MSI descriptor handling
soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
NTB/msi: Convert to msi_on_each_desc()
PCI: hv: Rework MSI handling
powerpc/mpic_u3msi: Use msi_for_each-desc()
powerpc/fsl_msi: Use msi_for_each_desc()
powerpc/pasemi/msi: Convert to msi_on_each_dec()
powerpc/cell/axon_msi: Convert to msi_on_each_desc()
powerpc/4xx/hsta: Rework MSI handling
...
Let the core code fiddle with the MSI descriptor retrieval.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221815.089008198@linutronix.de
Use the common msi_index member and get rid of the pointless wrapper struct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221814.413638645@linutronix.de
The only unconditional part of MSI data in struct device is the irqdomain
pointer. Everything else can be allocated on demand. Create a data
structure and move the irqdomain pointer into it. The other MSI specific
parts are going to be removed from struct device in later steps.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211210221813.617178827@linutronix.de
The commit f115f3c0d5 ("iommu/arm-smmu-v3: Decrease the queue size of
evtq and priq") decreases evtq and priq, which may lead evtq/priq to be
full with fault events, e.g HiSilicon ZIP/SEC/HPRE have maximum 1024 queues
in one device, every queue could be binded with one process and trigger a
fault event. So let's revert f115f3c0d5.
In fact, if an implementation of SMMU really does not need so long evtq
and priq, value of IDR1_EVTQS and IDR1_PRIQS can be set to proper ones.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Acked-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/1638858768-9971-1-git-send-email-wangzhou1@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
The only usage of arm_smmu_mmu_notifier_ops is to assign its address to
the ops field in the mmu_notifier struct, which is a pointer to const
struct mmu_notifier_ops. Make it const to allow the compiler to put it
in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20211204223301.100649-1-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
1. Build command CMD_SYNC cannot fail. So the return value can be ignored.
2. The arm_smmu_cmdq_build_cmd() almost never fails, the addition of
"unlikely()" can optimize the instruction pipeline.
3. Check the return value in arm_smmu_cmdq_batch_add().
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210818080452.2079-2-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Pre-zeroing the batched commands structure is inefficient, as individual
commands are zeroed later in arm_smmu_cmdq_build_cmd(). Therefore, only
the member 'num' needs to be initialized to 0.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20210817113411.1962-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
IO_PGTABLE_QUIRK_NON_STRICT was never a very comfortable fit, since it's
not a quirk of the pagetable format itself. Now that we have a more
appropriate way to convey non-strict unmaps, though, this last of the
non-quirk quirks can also go, and with the flush queue code also now
enforcing its own ordering we can have a lovely cleanup all round.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/155b5c621cd8936472e273a8b07a182f62c6c20d.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pre-zeroing the batched commands structure is inefficient, as individual
commands are zeroed later in arm_smmu_cmdq_build_cmd(). The size is quite
large and commonly most commands won't even be used:
struct arm_smmu_cmdq_batch cmds = {};
345c: 52800001 mov w1, #0x0 // #0
3460: d2808102 mov x2, #0x408 // #1032
3464: 910143a0 add x0, x29, #0x50
3468: 94000000 bl 0 <memset>
Stop pre-zeroing the complete structure and only zero the num member.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1628696966-88386-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When SMMU_GERROR.CMDQP_ERR is different to SMMU_GERRORN.CMDQP_ERR, it
indicates that one or more errors have been encountered on a command queue
control page interface. We need to traverse all ECMDQs in that control
page to find all errors. For each ECMDQ error handling, it is much the
same as the CMDQ error handling. This common processing part is extracted
as a new function __arm_smmu_cmdq_skip_err().
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-5-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
One SMMU has only one normal CMDQ. Therefore, this CMDQ is used regardless
of the core on which the command is inserted. It can be referenced
directly through "smmu->cmdq". However, one SMMU has multiple ECMDQs, and
the ECMDQ used by the core on which the command insertion is executed may
be different. So the helper function arm_smmu_get_cmdq() is added, which
returns the CMDQ/ECMDQ that the current core should use. Currently, the
code that supports ECMDQ is not added. just simply returns "&smmu->cmdq".
Many subfunctions of arm_smmu_cmdq_issue_cmdlist() use "&smmu->cmdq" or
"&smmu->cmdq.q" directly. To support ECMDQ, they need to call the newly
added function arm_smmu_get_cmdq() instead.
Note that normal CMDQ is still required until ECMDQ is available.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-4-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The obvious key to the performance optimization of commit 587e6c10a7
("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
to allow multiple cores to insert commands in parallel after a brief mutex
contention.
Obviously, inserting as many commands at a time as possible can reduce the
number of times the mutex contention participates, thereby improving the
overall performance. At least it reduces the number of calls to function
arm_smmu_cmdq_issue_cmdlist().
Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to insert
the 'cmd+sync' commands at a time.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-3-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The obvious key to the performance optimization of commit 587e6c10a7
("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
to allow multiple cores to insert commands in parallel after a brief mutex
contention.
Obviously, inserting as many commands at a time as possible can reduce the
number of times the mutex contention participates, thereby improving the
overall performance. At least it reduces the number of calls to function
arm_smmu_cmdq_issue_cmdlist().
Therefore, use command queue batching helpers to insert multiple commands
at a time.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210811114852.2429-2-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Implement the map_pages() callback for ARM SMMUV3 driver to allow calls
from iommu_map to map multiple pages of the same size in one call.
Also remove the map() callback for the ARM SMMUV3 driver as it will no
longer be used.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1627697831-158822-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement the unmap_pages() callback for ARM SMMUV3 driver to allow calls
from iommu_unmap to unmap multiple pages of the same size in one call.
Also remove the unmap() callback for the ARM SMMUV3 driver as it will
no longer be used.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1627697831-158822-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Members of struct "llq" will be zero-inited, apart from member max_n_shift.
But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.
In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.
Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1624293394-202509-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If people are going to insist on calling iommu_iova_to_phys()
pointlessly and expecting it to work, we can at least do ourselves a
favour by handling those cases in the core code, rather than repeatedly
across an inconsistent handful of drivers.
Since all the existing drivers implement the internal callback, and any
future ones are likely to want to work with iommu-dma which relies on
iova_to_phys a fair bit, we may as well remove that currently-redundant
check as well and consider it mandatory.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/f564f3f6ff731b898ff7a898919bf871c2c7745a.1626354264.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:382:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message
Remove it can help us save a bit of memory.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210609125438.14369-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If device registration fails, remove sysfs attribute
and if setting bus callbacks fails, unregister the device
and cleanup the sysfs attribute.
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Link: https://lore.kernel.org/r/20210608164559.204023-1-ameynarkhede03@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
The only place of_iommu.h is needed is in drivers/of/device.c. Remove it
from everywhere else.
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20210527193710.1281746-2-robh@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit d25f6ead16 ("iommu/arm-smmu-v3: Increase maximum size of queues")
expands the cmdq queue size to improve the success rate of concurrent
command queue space allocation by multiple cores. However, this extension
does not apply to evtq and priq, because for both of them, the SMMU driver
is the consumer. Instead, memory resources are wasted. Therefore, the
queue size of evtq and priq is restored to the original setting, one page.
Fixes: d25f6ead16 ("iommu/arm-smmu-v3: Increase maximum size of queues")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210531123553.9602-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When a device or driver misbehaves, it is possible to receive DMA fault
events much faster than we can print them out, causing a lock up of the
system and inability to cancel the source of the problem. Ratelimit
printing of events to help recovery.
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210531095648.118282-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
The SMMU provides a Stall model for handling page faults in platform
devices. It is similar to PCIe PRI, but doesn't require devices to have
their own translation cache. Instead, faulting transactions are parked
and the OS is given a chance to fix the page tables and retry the
transaction.
Enable stall for devices that support it (opt-in by firmware). When an
event corresponds to a translation error, call the IOMMU fault handler.
If the fault is recoverable, it will call us back to terminate or
continue the stall.
To use stall device drivers need to enable IOMMU_DEV_FEAT_IOPF, which
initializes the fault queue for the device.
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210526161927.24268-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Rather than have separate opaque setter functions that are easy to
overlook and lead to repetitive boilerplate in drivers, let's pass the
relevant initialisation parameters directly to iommu_device_register().
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It happens that the 3 drivers which first supported being modular are
also ones which play games with their pgsize_bitmap, so have non-const
iommu_ops where dynamically setting the owner manages to work out OK.
However, it's less than ideal to force that upon all drivers which want
to be modular - like the new sprd-iommu driver which now has a potential
bug in that regard - so let's just statically set the module owner and
let ops remain const wherever possible.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/31423b99ff609c3d4b291c701a7a7a810d9ce8dc.1617285386.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Per SMMUv3 spec, there is no Size and Addr field in the PREFETCH_CONFIG
command and they're not used by the driver. Remove them.
We can add them back if we're going to use PREFETCH_ADDR in the future.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20210407084448.1838-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Instead make the global iommu_dma_strict paramete in iommu.c canonical by
exporting helpers to get and set it and use those directly in the drivers.
This make sure that the iommu.strict parameter also works for the AMD and
Intel IOMMU drivers on x86. As those default to lazy flushing a new
IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
represent the default if not overriden by an explicit parameter.
[ported on top of the other iommu_attr changes and added a few small
missing bits]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210401155256.298656-19-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use an explicit enable_nesting method instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Link: https://lore.kernel.org/r/20210401155256.298656-17-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When handling faults from the event or PRI queue, we need to find the
struct device associated with a SID. Add a rb_tree to keep track of
SIDs.
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210401154718.307519-8-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The pasid-num-bits property shouldn't need a dedicated fwspec field,
it's a job for device properties. Add properties for IORT, and access
the number of PASID bits using device_property_read_u32().
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20210401154718.307519-3-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In arm_smmu_gerror_handler(), the value of the SMMU_GERROR register is
filtered by GERROR_ERR_MASK. However, the GERROR_ERR_MASK does not contain
the SFM bit. As a result, the subsequent error processing is not performed
when only the SFM error occurs.
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Reported-by: Rui Zhu <zhurui3@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210324081603.1074-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Merge in Mediatek support from Yong Wu which introduces significant
changes to the TLB invalidation and Arm short-descriptor code in the
io-pgtable layer.
* for-joerg/mtk: (40 commits)
MAINTAINERS: Add entry for MediaTek IOMMU
iommu/mediatek: Add mt8192 support
iommu/mediatek: Remove unnecessary check in attach_device
iommu/mediatek: Support master use iova over 32bit
iommu/mediatek: Add iova reserved function
iommu/mediatek: Support for multi domains
iommu/mediatek: Add get_domain_id from dev->dma_range_map
iommu/mediatek: Add iova_region structure
iommu/mediatek: Move geometry.aperture updating into domain_finalise
iommu/mediatek: Move domain_finalise into attach_device
iommu/mediatek: Adjust the structure
iommu/mediatek: Support report iova 34bit translation fault in ISR
iommu/mediatek: Support up to 34bit iova in tlb flush
iommu/mediatek: Add power-domain operation
iommu/mediatek: Add pm runtime callback
iommu/mediatek: Add device link for smi-common and m4u
iommu/mediatek: Add error handle for mtk_iommu_probe
iommu/mediatek: Move hw_init into attach_device
iommu/mediatek: Update oas for v7s
iommu/mediatek: Add a flag for iova 34bits case
...
Currently gather->end is "unsigned long" which may be overflow in
arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
Although it doesn't affect the size(end - start), it affects the checking
"gather->end < end"
This patch changes this "end" to the real end address
(end = start + size - 1). Correspondingly, update the length to
"end - start + 1".
Fixes: a7d20dc19d ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210107122909.16317-5-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
ARMv8.1 extensions added Virtualization Host Extensions (VHE), which allow
to run a host kernel at EL2. When using normal DMA, Device and CPU address
spaces are dissociated, and do not need to implement the same
capabilities, so VHE hasn't been used in the SMMU until now.
With shared address spaces however, ASIDs are shared between MMU and SMMU,
and broadcast TLB invalidations issued by a CPU are taken into account by
the SMMU. TLB entries on both sides need to have identical exception level
in order to be cleared with a single invalidation.
When the CPU is using VHE, enable VHE in the SMMU for all STEs. Normal DMA
mappings will need to use TLBI_EL2 commands instead of TLBI_NH, but
shouldn't be otherwise affected by this change.
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210122151054.2833521-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
When BTM isn't supported by the SMMU, send invalidations on the
command queue.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210122151054.2833521-3-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Extract some of the cmd initialization and the ATC invalidation from
arm_smmu_tlb_inv_range(), to allow an MMU notifier to invalidate a VA
range by ASID.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210122151054.2833521-2-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Since we now keep track of page 1 via a separate pointer that already
encapsulates aliasing to page 0 as necessary, we can remove the clunky
fixup routine and simply use the relevant bases directly. The current
architecture spec (IHI0070D.a) defines SMMU_{EVENTQ,PRIQ}_{PROD,CONS} as
offsets relative to page 1, so the cleanup represents a little bit of
convergence as well as just lines of code saved.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/08d9bda570bb5681f11a2f250a31be9ef763b8c5.1611238182.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The only user of tlb_flush_leaf is a particularly hairy corner of the
Arm short-descriptor code, which wants a synchronous invalidation to
minimise the races inherent in trying to split a large page mapping.
This is already far enough into "here be dragons" territory that no
sensible caller should ever hit it, and thus it really doesn't need
optimising. Although using tlb_flush_walk there may technically be
more heavyweight than needed, it does the job and saves everyone else
having to carry around useless baggage.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/9844ab0c5cb3da8b2f89c6c2da16941910702b41.1606324115.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
More steps along the way to Shared Virtual {Addressing, Memory} support
for Arm's SMMUv3, including the addition of a helper library that can be
shared amongst other IOMMU implementations wishing to support this
feature.
* for-next/iommu/svm:
iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
iommu/sva: Add PASID helpers
iommu/ioasid: Add ioasid references
The invalidate_range() notifier is called for any change to the address
space. Perform the required ATC invalidations.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20201106155048.997886-5-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
The sva_bind() function allows devices to access process address spaces
using a PASID (aka SSID).
(1) bind() allocates or gets an existing MMU notifier tied to the
(domain, mm) pair. Each mm gets one PASID.
(2) Any change to the address space calls invalidate_range() which sends
ATC invalidations (in a subsequent patch).
(3) When the process address space dies, the release() notifier disables
the CD to allow reclaiming the page tables. Since release() has to
be light we do not instruct device drivers to stop DMA here, we just
ignore incoming page faults from this point onwards.
To avoid any event 0x0a print (C_BAD_CD) we disable translation
without clearing CD.V. PCIe Translation Requests and Page Requests
are silently denied. Don't clear the R bit because the S bit can't
be cleared when STALL_MODEL==0b10 (forced), and clearing R without
clearing S is useless. Faulting transactions will stall and will be
aborted by the IOPF handler.
(4) After stopping DMA, the device driver releases the bond by calling
unbind(). We release the MMU notifier, free the PASID and the bond.
Three structures keep track of bonds:
* arm_smmu_bond: one per {device, mm} pair, the handle returned to the
device driver for a bind() request.
* arm_smmu_mmu_notifier: one per {domain, mm} pair, deals with ATS/TLB
invalidations and clearing the context descriptor on mm exit.
* arm_smmu_ctx_desc: one per mm, holds the pinned ASID and pgd.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20201106155048.997886-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Fix the following coccinelle warnings:
./drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:36:12-26: WARNING: Assignment of 0/1 to bool variable
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604744439-6846-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Will Deacon <will@kernel.org>
Implement the IOMMU device feature callbacks to support the SVA feature.
At the moment dev_has_feat() returns false since I/O Page Faults and BTM
aren't yet implemented.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-12-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Aggregate all sanity-checks for sharing CPU page tables with the SMMU
under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to
check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check
FEAT_STALLS.
Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't
enable it at the moment. Since the entire VMID space is shared with the
CPU, enabling DVM (by clearing SMMU_CR2.PTM) could result in
over-invalidation and affect performance of stage-2 mappings.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200918101852.582559-11-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
The SMMU has a single ASID space, the union of shared and private ASID
sets. This means that the SMMU driver competes with the arch allocator
for ASIDs. Shared ASIDs are those of Linux processes, allocated by the
arch, and contribute in broadcast TLB maintenance. Private ASIDs are
allocated by the SMMU driver and used for "classic" map/unmap DMA. They
require command-queue TLB invalidations.
When we pin down an mm_context and get an ASID that is already in use by
the SMMU, it belongs to a private context. We used to simply abort the
bind, but this is unfair to users that would be unable to bind a few
seemingly random processes. Try to allocate a new private ASID for the
context, and make the old ASID shared.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-10-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
into two sets, shared and private. Shared ASIDs correspond to those
obtained from the arch ASID allocator, and private ASIDs are used for
"classic" map/unmap DMA.
A possible conflict happens when trying to use a shared ASID that has
already been allocated for private use by the SMMU driver. This will be
addressed in a later patch by replacing the private ASID. At the
moment we return -EBUSY.
Each mm_struct shared with the SMMU will have a single context
descriptor. Add a refcount to keep track of this. It will be protected
by the global SVA lock.
Introduce a new arm-smmu-v3-sva.c file and the CONFIG_ARM_SMMU_V3_SVA
option to let users opt in SVA support.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-9-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Allow sharing structure definitions with the upcoming SVA support for
Arm SMMUv3, by moving them to a separate header. We could surgically
extract only what is needed but keeping all definitions in one place
looks nicer.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-8-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Reading the 'prod' MMIO register in order to determine whether or not
there is valid data beyond 'cons' for a given queue does not provide
sufficient dependency ordering, as the resulting access is address
dependent only on 'cons' and can therefore be speculated ahead of time,
potentially allowing stale data to be read by the CPU.
Use readl() instead of readl_relaxed() when updating the shadow copy of
the 'prod' pointer, so that all speculated memory reads from the
corresponding queue can occur only from valid slots.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
[will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
Signed-off-by: Will Deacon <will@kernel.org>
When building with C=1, sparse reports some issues regarding endianness
annotations:
arm-smmu-v3.c:221:26: warning: cast to restricted __le64
arm-smmu-v3.c:221:24: warning: incorrect type in assignment (different base types)
arm-smmu-v3.c:221:24: expected restricted __le64 [usertype]
arm-smmu-v3.c:221:24: got unsigned long long [usertype]
arm-smmu-v3.c:229:20: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:229:20: expected restricted __le64 [usertype] *[assigned] dst
arm-smmu-v3.c:229:20: got unsigned long long [usertype] *ent
arm-smmu-v3.c:229:25: warning: incorrect type in argument 2 (different base types)
arm-smmu-v3.c:229:25: expected unsigned long long [usertype] *[assigned] src
arm-smmu-v3.c:229:25: got restricted __le64 [usertype] *
arm-smmu-v3.c:396:20: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:396:20: expected restricted __le64 [usertype] *[assigned] dst
arm-smmu-v3.c:396:20: got unsigned long long *
arm-smmu-v3.c:396:25: warning: incorrect type in argument 2 (different base types)
arm-smmu-v3.c:396:25: expected unsigned long long [usertype] *[assigned] src
arm-smmu-v3.c:396:25: got restricted __le64 [usertype] *
arm-smmu-v3.c:1349:32: warning: invalid assignment: |=
arm-smmu-v3.c:1349:32: left side has type restricted __le64
arm-smmu-v3.c:1349:32: right side has type unsigned long
arm-smmu-v3.c:1396:53: warning: incorrect type in argument 3 (different base types)
arm-smmu-v3.c:1396:53: expected restricted __le64 [usertype] *dst
arm-smmu-v3.c:1396:53: got unsigned long long [usertype] *strtab
arm-smmu-v3.c:1424:39: warning: incorrect type in argument 1 (different base types)
arm-smmu-v3.c:1424:39: expected unsigned long long [usertype] *[assigned] strtab
arm-smmu-v3.c:1424:39: got restricted __le64 [usertype] *l2ptr
While harmless, they are incorrect and could hide actual errors during
development. Fix them.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200918141856.629722-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for
single thread.
The reason for the throughput improvement is that the latency to poll
the completion of CMD_SYNC becomes smaller. After sending a CMD_SYNC
in an empty cmd queue, typically we need to wait for 280ns using MSI
polling. But we only need around 190ns after disabling MSI polling.
This patch provides a command line option so that users can decide to
use MSI polling or not based on their tests.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-4-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Just use module_param() - going out of the way to specify a "different"
name that's identical to the variable name is silly.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-3-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
This fixed the below checkpatch issue:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using
octal permissions '0444'.
417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417:
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-2-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
The actual size of level-1 stream table is l1size. This looks like an
oversight on commit d2e88e7c08 ("iommu/arm-smmu: Fix LOG2SIZE setting
for 2-level stream tables") which forgot to update the @size in error
message as well.
As memory allocation failure is already bad enough, nothing worse would
happen. But let's be careful.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200826141758.341-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
- Move Arm SMMU driver files into their own subdirectory
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl8ewXgQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNGfuB/wMYzwgDotb0nggknPfm/TEcucjdN6ClXqN
9YUCslL1ggSLNHIZYSXkLQ5wwPnIg/vFIoU5ishOLT4YumW0Abck0ARfuJOV87GV
CO7h1iTS4fAxQarvtrjQmxYgNwlMI3+fKzNoQ/F9YP+v/2gUeuiPZibjyEtnc5jK
S6yLvrvQRBCmDVXtnI0/RdXbBqF4SU1RgI16bq2TbKvj8NmiEtIZjY7VxDNHKOMW
BpOw1r8Cl1K3uvgnkZzkOQPTvolgNpdXF3ZzyK219oN2FufD85w/g3LS00NuNwRc
c72TP4SksXTvRz/XiFvfAuskXQI2j2hg1k3XMPeB7pbJnzO67DYf
=chZm
-----END PGP SIGNATURE-----
Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into next
More Arm SMMU updates for 5.9
- Move Arm SMMU driver files into their own subdirectory
The Arm SMMU drivers are getting fat on vendor value-add, so move them
to their own subdirectory out of the way of the other IOMMU drivers.
Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Will Deacon <will@kernel.org>