Commit Graph

303 Commits

Author SHA1 Message Date
Zhen Lei
5e41c99894 iommu/vt-d: Remove unnecessary oom message
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message

Remove it can help us save a bit of memory.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210609124937.14260-1-thunder.leizhen@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210818134852.1847070-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Lu Baolu
4d99efb229 iommu/vt-d: Update the virtual command related registers
The VT-d spec Revision 3.3 updated the virtual command registers, virtual
command opcode B register, virtual command response register and virtual
command capability register (Section 10.4.43, 10.4.44, 10.4.45, 10.4.46).
This updates the virtual command interface implementation in the Intel
IOMMU driver accordingly.

Fixes: 24f27d32ab ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210713042649.3547403-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19 10:41:08 +02:00
Robin Murphy
78ca078459 iommu/vt-d: Prepare for multiple DMA domain types
In preparation for the strict vs. non-strict decision for DMA domains
to be expressed in the domain type, make sure we expose our flush queue
awareness by accepting the new domain type, and test the specific
feature flag where we want to identify DMA domains in general. The DMA
ops reset/setup can simply be made unconditional, since iommu-dma
already knows only to touch DMA domains.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/31a8ef868d593a2f3826a6a120edee81815375a7.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:27:49 +02:00
Robin Murphy
f297e27f83 iommu/vt-d: Drop IOVA cookie management
The core code bakes its own cookies now.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e9dbe3b6108f8538e17e0c5f59f8feeb714f51a4.1628682048.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:25:31 +02:00
Liu Yi L
8798d36411 iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry()
This fixes improper iotlb invalidation in intel_pasid_tear_down_entry().
When a PASID was used as nested mode, released and reused, the following
error message will appear:

[  180.187556] Unexpected page request in Privilege Mode
[  180.187565] Unexpected page request in Privilege Mode
[  180.279933] Unexpected page request in Privilege Mode
[  180.279937] Unexpected page request in Privilege Mode

Per chapter 6.5.3.3 of VT-d spec 3.3, when tear down a pasid entry, the
software should use Domain selective IOTLB flush if the PGTT of the pasid
entry is SL only or Nested, while for the pasid entries whose PGTT is FL
only or PT using PASID-based IOTLB flush is enough.

Fixes: 2cd1311a26 ("iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr")
Signed-off-by: Kumar Sanjay K <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Yi Sun <yi.y.sun@intel.com>
Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:15:58 +02:00
Fenghua Yu
62ef907a04 iommu/vt-d: Fix PASID reference leak
A PASID reference is increased whenever a device is bound to an mm (and
its PASID) successfully (i.e. the device's sdev user count is increased).
But the reference is not dropped every time the device is unbound
successfully from the mm (i.e. the device's sdev user count is decreased).
The reference is dropped only once by calling intel_svm_free_pasid() when
there isn't any device bound to the mm. intel_svm_free_pasid() drops the
reference and only frees the PASID on zero reference.

Fix the issue by dropping the PASID reference and freeing the PASID when
no reference on successful unbinding the device by calling
intel_svm_free_pasid() .

Fixes: 4048377414 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:15:58 +02:00
Lu Baolu
75cc1018a9 iommu/vt-d: Move clflush'es from iotlb_sync_map() to map_pages()
As the Intel VT-d driver has switched to use the iommu_ops.map_pages()
callback, multiple pages of the same size will be mapped in a call.
There's no need to put the clflush'es in iotlb_sync_map() callback.
Move them back into __domain_mapping() to simplify the code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720020615.4144323-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:56:25 +02:00
Lu Baolu
3f34f12597 iommu/vt-d: Implement map/unmap_pages() iommu_ops callback
Implement the map_pages() and unmap_pages() callback for the Intel IOMMU
driver to allow calls from iommu core to map and unmap multiple pages of
the same size in one call. With map/unmap_pages() implemented, the prior
map/unmap callbacks are deprecated.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720020615.4144323-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:56:25 +02:00
Lu Baolu
a886d5a7e6 iommu/vt-d: Report real pgsize bitmap to iommu core
The pgsize bitmap is used to advertise the page sizes our hardware supports
to the IOMMU core, which will then use this information to split physically
contiguous memory regions it is mapping into page sizes that we support.

Traditionally the IOMMU core just handed us the mappings directly, after
making sure the size is an order of a 4KiB page and that the mapping has
natural alignment. To retain this behavior, we currently advertise that we
support all page sizes that are an order of 4KiB.

We are about to utilize the new IOMMU map/unmap_pages APIs. We could change
this to advertise the real page sizes we support.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720020615.4144323-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:56:25 +02:00
John Garry
308723e358 iommu: Remove mode argument from iommu_set_dma_strict()
We only ever now set strict mode enabled in iommu_set_dma_strict(), so
just remove the argument.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1626088340-5838-7-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:27:38 +02:00
Zhen Lei
d0e108b8e9 iommu/vt-d: Add support for IOMMU default DMA mode build options
Make IOMMU_DEFAULT_LAZY default for when INTEL_IOMMU config is set,
as is current behaviour.

Also delete global flag intel_iommu_strict:
- In intel_iommu_setup(), call iommu_set_dma_strict(true) directly. Also
  remove the print, as iommu_subsys_init() prints the mode and we have
  already marked this param as deprecated.

- For cap_caching_mode() check in intel_iommu_setup(), call
  iommu_set_dma_strict(true) directly; also reword the accompanying print
  with a level downgrade and also add the missing '\n'.

- For Ironlake GPU, again call iommu_set_dma_strict(true) directly and
  keep the accompanying print.

[jpg: Remove intel_iommu_strict]

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1626088340-5838-5-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:27:38 +02:00
John Garry
1d479f160c iommu: Deprecate Intel and AMD cmdline methods to enable strict mode
Now that the x86 drivers support iommu.strict, deprecate the custom
methods.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1626088340-5838-2-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:27:38 +02:00
Lu Baolu
474dd1c650 iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries
The commit 2b0140c696 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
fixes an issue of "sub-device is removed where the context entry is cleared
for all aliases". But this commit didn't consider the PASID entry and PASID
table in VT-d scalable mode. This fix increases the coverage of scalable
mode.

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: 8038bdb855 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c696 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable@vger.kernel.org # v5.6+
Cc: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-14 12:58:07 +02:00
Sanjay Kumar
37764b952e iommu/vt-d: Global devTLB flush when present context entry changed
This fixes a bug in context cache clear operation. The code was not
following the correct invalidation flow. A global device TLB invalidation
should be added after the IOTLB invalidation. At the same time, it
uses the domain ID from the context entry. But in scalable mode, the
domain ID is in PASID table entry, not context entry.

Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071315.3416543-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-14 12:57:39 +02:00
Joerg Roedel
2b9d8e3e9a Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/smmu', 'x86/vt-d', 'x86/amd', 'virtio' and 'core' into next 2021-06-25 15:23:25 +02:00
Jean-Philippe Brucker
ac6d704679 iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()
Passing a 64-bit address width to iommu_setup_dma_ops() is valid on
virtual platforms, but isn't currently possible. The overflow check in
iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass
a limit address instead of a size, so callers don't have to fake a size
to work around the check.

The base and limit parameters are being phased out, because:
* they are redundant for x86 callers. dma-iommu already reserves the
  first page, and the upper limit is already in domain->geometry.
* they can now be obtained from dev->dma_range_map on Arm.
But removing them on Arm isn't completely straightforward so is left for
future work. As an intermediate step, simplify the x86 callers by
passing dummy limits.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-25 15:02:43 +02:00
Colin Ian King
934ed4580c iommu/vt-d: Fix dereference of pointer info before it is null checked
The assignment of iommu from info->iommu occurs before info is null checked
hence leading to a potential null pointer dereference issue. Fix this by
assigning iommu and checking if iommu is null after null checking info.

Addresses-Coverity: ("Dereference before null check")
Fixes: 4c82b88696 ("iommu/vt-d: Allocate/register iopf queue for sva devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210611135024.32781-1-colin.king@canonical.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-18 15:19:50 +02:00
Joerg Roedel
d6a9642bd6 iommu/vt-d: Fix linker error on 32-bit
A recent commit broke the build on 32-bit x86. The linker throws these
messages:

	ld: drivers/iommu/intel/perf.o: in function `dmar_latency_snapshot':
	perf.c:(.text+0x40c): undefined reference to `__udivdi3'
	ld: perf.c:(.text+0x458): undefined reference to `__udivdi3'

The reason are the 64-bit divides in dmar_latency_snapshot(). Use the
div_u64() helper function for those.

Fixes: 55ee5e67a5 ("iommu/vt-d: Add common code for dmar latency performance monitors")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210610083120.29224-1-joro@8bytes.org
2021-06-10 10:33:14 +02:00
Parav Pandit
7a0f06c197 iommu/vt-d: No need to typecast
Page directory assignment by alloc_pgtable_page() or phys_to_virt()
doesn't need typecasting as both routines return void*. Hence, remove
typecasting from both the calls.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-24-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit
cee57d4fe7 iommu/vt-d: Remove unnecessary braces
No need for braces for single line statement under if() block.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-22-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit
74f6d776ae iommu/vt-d: Removed unused iommu_count in dmar domain
DMAR domain uses per DMAR refcount. It is indexed by iommu seq_id.
Older iommu_count is only incremented and decremented but no decisions
are taken based on this refcount. This is not of much use.

Hence, remove iommu_count and further simplify domain_detach_iommu()
by returning void.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-21-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit
1f106ff0ea iommu/vt-d: Use bitfields for DMAR capabilities
IOTLB device presence, iommu coherency and snooping are boolean
capabilities. Use them as bits and keep them adjacent.

Structure layout before the reorg.
$ pahole -C dmar_domain drivers/iommu/intel/dmar.o
struct dmar_domain {
        int                        nid;                  /*     0     4 */
        unsigned int               iommu_refcnt[128];    /*     4   512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16                        iommu_did[128];       /*   516   256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        bool                       has_iotlb_device;     /*   772     1 */

        /* XXX 3 bytes hole, try to pack */

        struct list_head           devices;              /*   776    16 */
        struct list_head           subdevices;           /*   792    16 */
        struct iova_domain         iovad __attribute__((__aligned__(8)));
							 /*   808  2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte *           pgd;                  /*  3128     8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int                        gaw;                  /*  3136     4 */
        int                        agaw;                 /*  3140     4 */
        int                        flags;                /*  3144     4 */
        int                        iommu_coherency;      /*  3148     4 */
        int                        iommu_snooping;       /*  3152     4 */
        int                        iommu_count;          /*  3156     4 */
        int                        iommu_superpage;      /*  3160     4 */

        /* XXX 4 bytes hole, try to pack */

        u64                        max_addr;             /*  3168     8 */
        u32                        default_pasid;        /*  3176     4 */

        /* XXX 4 bytes hole, try to pack */

        struct iommu_domain        domain;               /*  3184    72 */

        /* size: 3256, cachelines: 51, members: 18 */
        /* sum members: 3245, holes: 3, sum holes: 11 */
        /* forced alignments: 1 */
        /* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));

After arranging it for natural padding and to make flags as u8 bits, it
saves 8 bytes for the struct.

struct dmar_domain {
        int                        nid;                  /*     0     4 */
        unsigned int               iommu_refcnt[128];    /*     4   512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16                        iommu_did[128];       /*   516   256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        u8                         has_iotlb_device:1;   /*   772: 0  1 */
        u8                         iommu_coherency:1;    /*   772: 1  1 */
        u8                         iommu_snooping:1;     /*   772: 2  1 */

        /* XXX 5 bits hole, try to pack */
        /* XXX 3 bytes hole, try to pack */

        struct list_head           devices;              /*   776    16 */
        struct list_head           subdevices;           /*   792    16 */
        struct iova_domain         iovad __attribute__((__aligned__(8)));
							 /*   808  2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte *           pgd;                  /*  3128     8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int                        gaw;                  /*  3136     4 */
        int                        agaw;                 /*  3140     4 */
        int                        flags;                /*  3144     4 */
        int                        iommu_count;          /*  3148     4 */
        int                        iommu_superpage;      /*  3152     4 */

        /* XXX 4 bytes hole, try to pack */

        u64                        max_addr;             /*  3160     8 */
        u32                        default_pasid;        /*  3168     4 */

        /* XXX 4 bytes hole, try to pack */

        struct iommu_domain        domain;               /*  3176    72 */

        /* size: 3248, cachelines: 51, members: 18 */
        /* sum members: 3236, holes: 3, sum holes: 11 */
        /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits */
        /* forced alignments: 1 */
        /* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-20-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
YueHaibing
3bc770b0e9 iommu/vt-d: Use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210528130229.22108-1-yuehaibing@huawei.com
Link: https://lore.kernel.org/r/20210610020115.1637656-19-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Gustavo A. R. Silva
606636dcbd iommu/vt-d: Fix out-bounds-warning in intel/svm.c
Replace a couple of calls to memcpy() with simple assignments in order
to fix the following out-of-bounds warning:

drivers/iommu/intel/svm.c:1198:4: warning: 'memcpy' offset [25, 32] from
    the object at 'desc' is out of the bounds of referenced subobject
    'qw2' with type 'long long unsigned int' at offset 16 [-Warray-bounds]

The problem is that the original code is trying to copy data into a
couple of struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &desc.qw2 and &resp.qw2, respectively.

This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().

Link: https://github.com/KSPP/linux/issues/109
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210414201403.GA392764@embeddedor
Link: https://lore.kernel.org/r/20210610020115.1637656-18-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
0f4834ab25 iommu/vt-d: Add PRQ handling latency sampling
The execution time for page fault request handling is performance critical
and needs to be monitored. This adds code to sample the execution time of
page fault request handling.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
74eb87a0f9 iommu/vt-d: Add cache invalidation latency sampling
Queued invalidation execution time is performance critical and needs
to be monitored. This adds code to sample the execution time of IOTLB/
devTLB/ICE cache invalidation.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-16-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
456bb0b97f iommu/vt-d: Expose latency monitor data through debugfs
A debugfs interface /sys/kernel/debug/iommu/intel/dmar_perf_latency is
created to control and show counts of execution time ranges for various
types per DMAR. The interface may help debug any potential performance
issue.

By default, the interface is disabled.

Possible write value of /sys/kernel/debug/iommu/intel/dmar_perf_latency
  0 - disable sampling all latency data
  1 - enable sampling IOTLB invalidation latency data
  2 - enable sampling devTLB invalidation latency data
  3 - enable sampling intr entry cache invalidation latency data
  4 - enable sampling prq handling latency data

Read /sys/kernel/debug/iommu/intel/dmar_perf_latency gives a snapshot
of sampling result of all enabled monitors.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-15-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
55ee5e67a5 iommu/vt-d: Add common code for dmar latency performance monitors
The execution time of some operations is very performance critical, such
as cache invalidation and PRQ processing time. This adds some common code
to monitor the execution time range of those operations. The interfaces
include enabling/disabling, checking status, updating sampling data and
providing a common string format for users.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-14-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
e93a67f5a0 iommu/vt-d: Add prq_report trace event
This adds a new trace event to track the page fault request report.
This event will provide almost all information defined in a page
request descriptor.

A sample output:
| prq_report: dmar0/0000:00:0a.0 seq# 1: rid=0x50 addr=0x559ef6f97 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 2: rid=0x50 addr=0x559ef6f9c rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 3: rid=0x50 addr=0x559ef6f98 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 4: rid=0x50 addr=0x559ef6f9d rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 5: rid=0x50 addr=0x559ef6f99 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 6: rid=0x50 addr=0x559ef6f9e rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 7: rid=0x50 addr=0x559ef6f9a r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 8: rid=0x50 addr=0x559ef6f9f rw--l pasid=0x2 index=0x1

This will be helpful for I/O page fault related debugging.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
d5b9e4bfe0 iommu/vt-d: Report prq to io-pgfault framework
Let the IO page fault requests get handled through the io-pgfault
framework.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
4c82b88696 iommu/vt-d: Allocate/register iopf queue for sva devices
This allocates and registers the iopf queue infrastructure for devices
which want to support IO page fault for SVA.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
ae7f09b14b iommu/vt-d: Refactor prq_event_thread()
Refactor prq_event_thread() by moving handling single prq event out of
the main loop.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
9e52cc0fed iommu/vt-d: Use common helper to lookup svm devices
It's common to iterate the svm device list and find a matched device. Add
common helpers to do this and consolidate the code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu
4048377414 iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers
Align the pasid alloc/free code with the generic helpers defined in the
iommu core. This also refactored the SVA binding code to improve the
readability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu
100b8a14a3 iommu/vt-d: Add pasid private data helpers
We are about to use iommu_sva_alloc/free_pasid() helpers in iommu core.
That means the pasid life cycle will be managed by iommu core. Use a
local array to save the per pasid private data instead of attaching it
the real pasid.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu
521f546b4e iommu/vt-d: Support asynchronous IOMMU nested capabilities
Current VT-d implementation supports nested translation only if all
underlying IOMMUs support the nested capability. This is unnecessary
as the upper layer is allowed to create different containers and set
them with different type of iommu backend. The IOMMU driver needs to
guarantee that devices attached to a nested mode iommu_domain should
support nested capabilility.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210517065701.5078-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu
879fcc6bda iommu/vt-d: Select PCI_ATS explicitly
The Intel VT-d implementation supports device TLB management. Select
PCI_ATS explicitly so that the pci_ats helpers are always available.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210512065313.3441309-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu
719a193356 iommu/vt-d: Tweak the description of a DMA fault
The Intel IOMMU driver reports the DMA fault reason in a decimal number
while the VT-d specification uses a hexadecimal one. It's inconvenient
that users need to covert them everytime before consulting the spec.
Let's use hexadecimal number for a DMA fault reason.

The fault message uses 0xffffffff as PASID for DMA requests w/o PASID.
This is confusing. Tweak this by adding "NO_PASID" explicitly.

Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210517065425.4953-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Aditya Srivastava
367f82de5a iommu/vt-d: Fix kernel-doc syntax in file header
The opening comment mark '/**' is used for highlighting the beginning of
kernel-doc comments.
The header for drivers/iommu/intel/pasid.c follows this syntax, but
the content inside does not comply with kernel-doc.

This line was probably not meant for kernel-doc parsing, but is parsed
due to the presence of kernel-doc like comment syntax(i.e, '/**'), which
causes unexpected warnings from kernel-doc:
warning: Function parameter or member 'fmt' not described in 'pr_fmt'

Provide a simple fix by replacing this occurrence with general comment
format, i.e. '/*', to prevent kernel-doc from parsing it.

Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210523143245.19040-1-yashsri421@gmail.com
Link: https://lore.kernel.org/r/20210610020115.1637656-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Colin Ian King
05d2cbf969 iommu/vt-d: Remove redundant assignment to variable agaw
The variable agaw is initialized with a value that is never read and it
is being updated later with a new value as a counter in a for-loop. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210416171826.64091-1-colin.king@canonical.com
Link: https://lore.kernel.org/r/20210610020115.1637656-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Rolf Eike Beer
0ee74d5a48 iommu/vt-d: Fix sysfs leak in alloc_iommu()
iommu_device_sysfs_add() is called before, so is has to be cleaned on subsequent
errors.

Fixes: 39ab9555c2 ("iommu: Add sysfs bindings for struct iommu_device")
Cc: stable@vger.kernel.org # 4.11.x
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/17411490.HIIP88n32C@mobilepool36.emlix.com
Link: https://lore.kernel.org/r/20210525070802.361755-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-27 16:07:08 +02:00
Lu Baolu
54c80d9074 iommu/vt-d: Use user privilege for RID2PASID translation
When first-level page tables are used for IOVA translation, we use user
privilege by setting U/S bit in the page table entry. This is to make it
consistent with the second level translation, where the U/S enforcement
is not available. Clear the SRE (Supervisor Request Enable) field in the
pasid table entry of RID2PASID so that requests requesting the supervisor
privilege are blocked and treated as DMA remapping faults.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210512064426.3440915-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210519015027.108468-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-19 08:51:02 +02:00
Dan Carpenter
1a590a1c8b iommu/vt-d: Check for allocation failure in aux_detach_device()
In current kernels small allocations never fail, but checking for
allocation failure is the correct thing to do.

Fixes: 18abda7a2d ("iommu/vt-d: Fix general protection fault in aux_detach_device()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/YJuobKuSn81dOPLd@mwanda
Link: https://lore.kernel.org/r/20210519015027.108468-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-19 08:51:02 +02:00
Linus Torvalds
57151b502c pci-v5.13-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmCRp48UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwsVRAAsIYueNKzZczpkeQwHigYzf4HLdKm
 yyT2c/Zlj9REAUOe7ApkowVAJWiMGDJP0J361KIluAGvAxnkMP1V6WlVdByorYd0
 CrXc/UhD//cs+3QDo4SmJRHyL8q5QQTDa8Z/8seVJUYTR/t5OhSpMOuEJPhpeQ1s
 nqUk0yWNJRoN6wn6T/7KqgYEvPhARXo9epuWy5MNPZ5f8E7SRi/QG/6hP8/YOLpK
 A+8beIOX5LAvUJaXxEovwv5UQnSUkeZTGDyRietQYE6xXNeHPKCvZ7vDjjSE7NOW
 mIodD6JcG3n/riYV3sMA5PKDZgsPI3P/qJU6Y6vWBBYOaO/kQX/c7CZ+M2bcZay4
 mh1dW0vOqoTy/pAVwQB2aq08Rrg2SAskpNdeyzduXllmuTyuwCMPXzG4RKmbQ8I1
 qMFb8qOyNulRAWcTKgSMKByEQYASQsFA5yShtaba6h0+vqrseuP6hchBKKOEan8F
 9THTI3ZflKwRvGjkI0MDbp0z0+wPYmNhrcZDpAJ3bEltw58E8TL/9aBtuhajmo8+
 wJ64mZclFuMmSyhsfkAXOvjeKXMlEBaw7vinZGbcACmv4ZGI0MV7r4vVYQbQltcy
 myzB6xJxcWB8N07UpKpUbsGMb9JjTUPlaT36eZNvUZQDntrE1ljt8RSq3nphDrcD
 KmBRU8ru74I2RE0=
 =WvTD
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:
   - Release OF node when pci_scan_device() fails (Dmitry Baryshkov)
   - Add pci_disable_parity() (Bjorn Helgaas)
   - Disable Mellanox Tavor parity reporting (Heiner Kallweit)
   - Disable N2100 r8169 parity reporting (Heiner Kallweit)
   - Fix RCiEP device to RCEC association (Qiuxu Zhuo)
   - Convert sysfs "config", "rom", "reset", "label", "index",
     "acpi_index" to static attributes to help fix races in device
     enumeration (Krzysztof Wilczyński)
   - Convert sysfs "vpd" to static attribute (Heiner Kallweit, Krzysztof
     Wilczyński)
   - Use sysfs_emit() in "show" functions (Krzysztof Wilczyński)
   - Remove unused alloc_pci_root_info() return value (Krzysztof
     Wilczyński)

  PCI device hotplug:
   - Fix acpiphp reference count leak (Feilong Lin)

  Power management:
   - Fix acpi_pci_set_power_state() debug message (Rafael J. Wysocki)
   - Fix runtime PM imbalance (Dinghao Liu)

  Virtualization:
   - Increase delay after FLR to work around Intel DC P4510 NVMe erratum
     (Raphael Norwitz)

  MSI:
   - Convert rcar, tegra, xilinx to MSI domains (Marc Zyngier)
   - For rcar, xilinx, use controller address as MSI doorbell (Marc
     Zyngier)
   - Remove unused hv msi_controller struct (Marc Zyngier)
   - Remove unused PCI core msi_controller support (Marc Zyngier)
   - Remove struct msi_controller altogether (Marc Zyngier)
   - Remove unused default_teardown_msi_irqs() (Marc Zyngier)
   - Let host bridges declare their reliance on MSI domains (Marc
     Zyngier)
   - Make pci_host_common_probe() declare its reliance on MSI domains
     (Marc Zyngier)
   - Advertise mediatek lack of built-in MSI handling (Thomas Gleixner)
   - Document ways of ending up with NO_MSI (Marc Zyngier)
   - Refactor HT advertising of NO_MSI flag (Marc Zyngier)

  VPD:
   - Remove obsolete Broadcom NIC VPD length-limiting quirk (Heiner
     Kallweit)
   - Remove sysfs VPD size checking dead code (Heiner Kallweit)
   - Convert VPF sysfs file to static attribute (Heiner Kallweit)
   - Remove unnecessary pci_set_vpd_size() (Heiner Kallweit)
   - Tone down "missing VPD" message (Heiner Kallweit)

  Endpoint framework:
   - Fix NULL pointer dereference when epc_features not implemented
     (Shradha Todi)
   - Add missing destroy_workqueue() in endpoint test (Yang Yingliang)

  Amazon Annapurna Labs PCIe controller driver:
   - Fix compile testing without CONFIG_PCI_ECAM (Arnd Bergmann)
   - Fix "no symbols" warnings when compile testing with
     CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann)

  APM X-Gene PCIe controller driver:
   - Fix cfg resource mapping regression (Dejin Zheng)

  Broadcom iProc PCIe controller driver:
   - Return zero for success of iproc_msi_irq_domain_alloc() (Pali
     Rohár)

  Broadcom STB PCIe controller driver:
   - Add reset_control_rearm() stub for !CONFIG_RESET_CONTROLLER (Jim
     Quinlan)
   - Fix use of BCM7216 reset controller (Jim Quinlan)
   - Use reset/rearm for Broadcom STB pulse reset instead of
     deassert/assert (Jim Quinlan)
   - Fix brcm_pcie_probe() error return for unsupported revision (Wei
     Yongjun)

  Cavium ThunderX PCIe controller driver:
   - Fix compile testing (Arnd Bergmann)
   - Fix "no symbols" warnings when compile testing with
     CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann)

  Freescale Layerscape PCIe controller driver:
   - Fix ls_pcie_ep_probe() syntax error (comma for semicolon)
     (Krzysztof Wilczyński)
   - Remove layerscape-gen4 dependencies on OF and ARM64, add dependency
     on ARCH_LAYERSCAPE (Geert Uytterhoeven)

  HiSilicon HIP PCIe controller driver:
   - Remove obsolete HiSilicon PCIe DT description (Dongdong Liu)

  Intel Gateway PCIe controller driver:
   - Remove unused pcie_app_rd() (Jiapeng Chong)

  Intel VMD host bridge driver:
   - Program IRTE with Requester ID of VMD endpoint, not child device
     (Jon Derrick)
   - Disable VMD MSI-X remapping when possible so children can use more
     MSI-X vectors (Jon Derrick)

  MediaTek PCIe controller driver:
   - Configure FC and FTS for functions other than 0 (Ryder Lee)
   - Add YAML schema for MediaTek (Jianjun Wang)
   - Export pci_pio_to_address() for module use (Jianjun Wang)
   - Add MediaTek MT8192 PCIe controller driver (Jianjun Wang)
   - Add MediaTek MT8192 INTx support (Jianjun Wang)
   - Add MediaTek MT8192 MSI support (Jianjun Wang)
   - Add MediaTek MT8192 system power management support (Jianjun Wang)
   - Add missing MODULE_DEVICE_TABLE (Qiheng Lin)

  Microchip PolarFlare PCIe controller driver:
   - Make several symbols static (Wei Yongjun)

  NVIDIA Tegra PCIe controller driver:
   - Add MCFG quirks for Tegra194 ECAM errata (Vidya Sagar)
   - Make several symbols const (Rikard Falkeborn)
   - Fix Kconfig host/endpoint typo (Wesley Sheng)

  SiFive FU740 PCIe controller driver:
   - Add pcie_aux clock to prci driver (Greentime Hu)
   - Use reset-simple in prci driver for PCIe (Greentime Hu)
   - Add SiFive FU740 PCIe host controller driver and DT binding (Paul
     Walmsley, Greentime Hu)

  Synopsys DesignWare PCIe controller driver:
   - Move MSI Receiver init to dw_pcie_host_init() so it is
     re-initialized along with the RC in resume (Jisheng Zhang)
   - Move iATU detection earlier to fix regression (Hou Zhiqiang)

  TI J721E PCIe driver:
   - Add DT binding and TI j721e support for refclk to PCIe connector
     (Kishon Vijay Abraham I)
   - Add host mode and endpoint mode DT bindings for TI AM64 SoC (Kishon
     Vijay Abraham I)

  TI Keystone PCIe controller driver:
   - Use generic config accessors for TI AM65x (K3) to fix regression
     (Kishon Vijay Abraham I)

  Xilinx NWL PCIe controller driver:
   - Add support for coherent PCIe DMA traffic using CCI (Bharat Kumar
     Gogada)
   - Add optional "dma-coherent" DT property (Bharat Kumar Gogada)

  Miscellaneous:
   - Fix kernel-doc warnings (Krzysztof Wilczyński)
   - Remove unused MicroGate SyncLink device IDs (Jiri Slaby)
   - Remove redundant dev_err() for devm_ioremap_resource() failure
     (Chen Hui)
   - Remove redundant initialization (Colin Ian King)
   - Drop redundant dev_err() for platform_get_irq() errors (Krzysztof
     Wilczyński)"

* tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (98 commits)
  riscv: dts: Add PCIe support for the SiFive FU740-C000 SoC
  PCI: fu740: Add SiFive FU740 PCIe host controller driver
  dt-bindings: PCI: Add SiFive FU740 PCIe host controller
  MAINTAINERS: Add maintainers for SiFive FU740 PCIe driver
  clk: sifive: Use reset-simple in prci driver for PCIe driver
  clk: sifive: Add pcie_aux clock in prci driver for PCIe driver
  PCI: brcmstb: Use reset/rearm instead of deassert/assert
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  reset: add missing empty function reset_control_rearm()
  PCI: Allow VPD access for QLogic ISP2722
  PCI/VPD: Add helper pci_get_func0_dev()
  PCI/VPD: Remove pci_vpd_find_tag() SRDT handling
  PCI/VPD: Remove pci_vpd_find_tag() 'offset' argument
  PCI/VPD: Change pci_vpd_init() return type to void
  PCI/VPD: Make missing VPD message less alarming
  PCI/VPD: Remove pci_set_vpd_size()
  x86/PCI: Remove unused alloc_pci_root_info() return value
  MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer
  PCI: mediatek-gen3: Add system PM support
  PCI: mediatek-gen3: Add MSI support
  ...
2021-05-05 13:24:11 -07:00
Robin Murphy
2d471b20c5 iommu: Streamline registration interface
Rather than have separate opaque setter functions that are easy to
overlook and lead to repetitive boilerplate in drivers, let's pass the
relevant initialisation parameters directly to iommu_device_register().

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-16 17:20:45 +02:00
Joerg Roedel
49d11527e5 Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', 'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next 2021-04-16 17:16:03 +02:00
Longpeng(Mike)
38c527aeb4 iommu/vt-d: Force to flush iotlb before creating superpage
The translation caches may preserve obsolete data when the
mapping size is changed, suppose the following sequence which
can reveal the problem with high probability.

1.mmap(4GB,MAP_HUGETLB)
2.
  while (1) {
   (a)    DMA MAP   0,0xa0000
   (b)    DMA UNMAP 0,0xa0000
   (c)    DMA MAP   0,0xc0000000
             * DMA read IOVA 0 may failure here (Not present)
             * if the problem occurs.
   (d)    DMA UNMAP 0,0xc0000000
  }

The page table(only focus on IOVA 0) after (a) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x21d200803  entry:0xffff89b3b0a72000

The page table after (b) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x0          entry:0xffff89b3b0a72000

The page table after (c) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x21d200883   entry:0xffff89b39cacb000 (*)

Because the PDE entry after (b) is present, it won't be
flushed even if the iommu driver flush cache when unmap,
so the obsolete data may be preserved in cache, which
would cause the wrong translation at end.

However, we can see the PDE entry is finally switch to
2M-superpage mapping, but it does not transform
to 0x21d200883 directly:

1. PDE: 0x1a30a72003
2. __domain_mapping
     dma_pte_free_pagetable
       Set the PDE entry to ZERO
     Set the PDE entry to 0x21d200883

So we must flush the cache after the entry switch to ZERO
to avoid the obsolete info be preserved.

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Gonglei (Arei) <arei.gonglei@huawei.com>

Fixes: 6491d4d028 ("intel-iommu: Free old page tables before creating superpage")
Cc: <stable@vger.kernel.org> # v3.0+
Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210415004628.1779-1-longpeng2@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-15 16:12:31 +02:00
Christophe JAILLET
745610c4a3 iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'
If 'intel_cap_audit()' fails, we should return directly, as already done in
the surrounding error handling path.

Fixes: ad3d190299 ("iommu/vt-d: Audit IOMMU Capabilities and add helper functions")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/98d531caabe66012b4fffc7813fd4b9470afd517.1618124777.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-15 15:44:38 +02:00
Lu Baolu
906f86c860 iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86
Commit f68c7f539b ("iommu/vt-d: Enable write protect for supervisor
SVM") added pasid_enable_wpe() which hit below compile error with !X86.

../drivers/iommu/intel/pasid.c: In function 'pasid_enable_wpe':
../drivers/iommu/intel/pasid.c:554:22: error: implicit declaration of function 'read_cr0' [-Werror=implicit-function-declaration]
  554 |  unsigned long cr0 = read_cr0();
      |                      ^~~~~~~~
In file included from ../include/linux/build_bug.h:5,
                 from ../include/linux/bits.h:22,
                 from ../include/linux/bitops.h:6,
                 from ../drivers/iommu/intel/pasid.c:12:
../drivers/iommu/intel/pasid.c:557:23: error: 'X86_CR0_WP' undeclared (first use in this function)
  557 |  if (unlikely(!(cr0 & X86_CR0_WP))) {
      |                       ^~~~~~~~~~
../include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
   78 | # define unlikely(x) __builtin_expect(!!(x), 0)
      |                                          ^
../drivers/iommu/intel/pasid.c:557:23: note: each undeclared identifier is reported only once for each function it appears in
  557 |  if (unlikely(!(cr0 & X86_CR0_WP))) {
      |                       ^~~~~~~~~~
../include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
   78 | # define unlikely(x) __builtin_expect(!!(x), 0)
      |

Add the missing dependency.

Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: f68c7f539b ("iommu/vt-d: Enable write protect for supervisor SVM")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20210411062312.3057579-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-15 15:43:20 +02:00
Lu Baolu
8b74b6ab25 iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown
When a present pasid entry is disassembled, all kinds of pasid related
caches need to be flushed. But when a pasid entry is not being used
(PRESENT bit not set), we don't need to do this. Check the PRESENT bit
in intel_pasid_tear_down_entry() and avoid flushing caches if it's not
set.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:47 +02:00
Lu Baolu
c0474a606e iommu/vt-d: Invalidate PASID cache when root/context entry changed
When the Intel IOMMU is operating in the scalable mode, some information
from the root and context table may be used to tag entries in the PASID
cache. Software should invalidate the PASID-cache when changing root or
context table entries.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:47 +02:00
Lu Baolu
eea53c5816 iommu/vt-d: Remove WO permissions on second-level paging entries
When the first level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported as the PRESENT bit (implying Read permission) should
always set. When using second level, we still give separate permissions
that allows WriteOnly which seems inconsistent and awkward. We want to
have consistent behavior. After moving to 1st level, we don't want things
to work sometimes, and break if we use 2nd level for the same mappings.
Hence remove this configuration.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:47 +02:00
Lu Baolu
03d205094a iommu/vt-d: Report the right page fault address
The Address field of the Page Request Descriptor only keeps bit [63:12]
of the offending address. Convert it to a full address before reporting
it to device drivers.

Fixes: eb8d93ea3c ("iommu/vt-d: Report page request faults for guest SVA")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:46 +02:00
Robin Murphy
a250c23f15 iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
Instead make the global iommu_dma_strict paramete in iommu.c canonical by
exporting helpers to get and set it and use those directly in the drivers.

This make sure that the iommu.strict parameter also works for the AMD and
Intel IOMMU drivers on x86.  As those default to lazy flushing a new
IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
represent the default if not overriden by an explicit parameter.

[ported on top of the other iommu_attr changes and added a few small
 missing bits]

Signed-off-by: Robin Murphy <robin.murphy@arm.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210401155256.298656-19-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:56:53 +02:00
Christoph Hellwig
7e14754778 iommu: remove DOMAIN_ATTR_NESTING
Use an explicit enable_nesting method instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Link: https://lore.kernel.org/r/20210401155256.298656-17-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:56:53 +02:00
Jean-Philippe Brucker
9003351cb6 iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to
checking whether PRI is enabled.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210401154718.307519-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:54:29 +02:00
Lu Baolu
6c00612d0c iommu/vt-d: Report right snoop capability when using FL for IOVA
The Intel VT-d driver checks wrong register to report snoop capablility
when using first level page table for GPA to HPA translation. This might
lead the IOMMU driver to say that it supports snooping control, but in
reality, it does not. Fix this by always setting PASID-table-entry.PGSNP
whenever a pasid entry is setting up for GPA to HPA translation so that
the IOMMU driver could report snoop capability as long as it runs in the
scalable mode.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210330021145.13824-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:41:30 +02:00
John Garry
363f266eef iommu/vt-d: Remove IOVA domain rcache flushing for CPU offlining
Now that the core code handles flushing per-IOVA domain CPU rcaches,
remove the handling here.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:27:27 +02:00
Lu Baolu
442b81836d iommu/vt-d: Make unnecessarily global functions static
Make some functions static as they are only used inside pasid.c.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:15:58 +02:00
Lu Baolu
1b169fdf42 iommu/vt-d: Remove unused function declarations
Some functions have been removed. Remove the remaining function
delarations.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:15:49 +02:00
Lu Baolu
06905ea831 iommu/vt-d: Remove SVM_FLAG_PRIVATE_PASID
The SVM_FLAG_PRIVATE_PASID has never been referenced in the tree, and
there's no plan to have anything to use it. So cleanup it.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:15:19 +02:00
Lu Baolu
2e1a44c1c4 iommu/vt-d: Remove svm_dev_ops
The svm_dev_ops has never been referenced in the tree, and there's no
plan to have anything to use it. Remove it to make the code neat.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:15:19 +02:00
Lu Baolu
1d421058c8 iommu/vt-d: Don't set then clear private data in prq_event_thread()
The VT-d specification (section 7.6) requires that the value in the
Private Data field of a Page Group Response Descriptor must match
the value in the Private Data field of the respective Page Request
Descriptor.

The private data field of a page group response descriptor is set then
immediately cleared in prq_event_thread(). This breaks the rule defined
by the VT-d specification. Fix it by moving clearing code up.

Fixes: 5b438f4ba3 ("iommu/vt-d: Support page request in scalable mode")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:09:54 +02:00
Lu Baolu
803766cbf8 iommu/vt-d: Fix lockdep splat in intel_pasid_get_entry()
The pasid_lock is used to synchronize different threads from modifying a
same pasid directory entry at the same time. It causes below lockdep splat.

[   83.296538] ========================================================
[   83.296538] WARNING: possible irq lock inversion dependency detected
[   83.296539] 5.12.0-rc3+ #25 Tainted: G        W
[   83.296539] --------------------------------------------------------
[   83.296540] bash/780 just changed the state of lock:
[   83.296540] ffffffff82b29c98 (device_domain_lock){..-.}-{2:2}, at:
           iommu_flush_dev_iotlb.part.0+0x32/0x110
[   83.296547] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   83.296547]  (pasid_lock){+.+.}-{2:2}
[   83.296548]

           and interrupts could create inverse lock ordering between them.

[   83.296549] other info that might help us debug this:
[   83.296549] Chain exists of:
                 device_domain_lock --> &iommu->lock --> pasid_lock
[   83.296551]  Possible interrupt unsafe locking scenario:

[   83.296551]        CPU0                    CPU1
[   83.296552]        ----                    ----
[   83.296552]   lock(pasid_lock);
[   83.296553]                                local_irq_disable();
[   83.296553]                                lock(device_domain_lock);
[   83.296554]                                lock(&iommu->lock);
[   83.296554]   <Interrupt>
[   83.296554]     lock(device_domain_lock);
[   83.296555]
                *** DEADLOCK ***

Fix it by replacing the pasid_lock with an atomic exchange operation.

Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320020916.640115-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:08:09 +02:00
Jon Derrick
9b4a824b88 iommu/vt-d: Use Real PCI DMA device for IRTE
VMD retransmits child device MSI-X with the VMD endpoint's requester-id.
In order to support direct interrupt remapping of VMD child devices,
ensure that the IRTE is programmed with the VMD endpoint's requester-id
using pci_real_dma_dev().

Link: https://lore.kernel.org/r/20210210161315.316097-2-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
2021-03-22 14:08:20 +00:00
Jacob Pan
396bd6f3d9 iommu/vt-d: Calculate and set flags for handle_mm_fault
Page requests are originated from the user page fault. Therefore, we
shall set FAULT_FLAG_USER. 

FAULT_FLAG_REMOTE indicates that we are walking an mm which is not
guaranteed to be the same as the current->mm and should not be subject
to protection key enforcement. Therefore, we should set FAULT_FLAG_REMOTE
to avoid faults when both SVM and PKEY are used.

References: commit 1b2ee1266e ("mm/core: Do not enforce PKEY permissions on remote mm access")
Reviewed-by: Raj Ashok <ashok.raj@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-5-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:42:46 +01:00
Jacob Pan
78a523fe73 iommu/vt-d: Reject unsupported page request modes
When supervisor/privilige mode SVM is used, we bind init_mm.pgd with
a supervisor PASID. There should not be any page fault for init_mm.
Execution request with DMA read is also not supported.

This patch checks PRQ descriptor for both unsupported configurations,
reject them both with invalid responses.

Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:42:46 +01:00
Jacob Pan
bb0f61533d iommu/vt-d: Enable write protect propagation from guest
Write protect bit, when set, inhibits supervisor writes to the read-only
pages. In guest supervisor shared virtual addressing (SVA), write-protect
should be honored upon guest bind supervisor PASID request.

This patch extends the VT-d portion of the IOMMU UAPI to include WP bit.
WPE bit of the  supervisor PASID entry will be set to match CPU CR0.WP bit.

Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-3-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:42:46 +01:00
Jacob Pan
f68c7f539b iommu/vt-d: Enable write protect for supervisor SVM
Write protect bit, when set, inhibits supervisor writes to the read-only
pages. In supervisor shared virtual addressing (SVA), where page tables
are shared between CPU and DMA, IOMMU PASID entry WPE bit should match
CR0.WP bit in the CPU.
This patch sets WPE bit for supervisor PASIDs if CR0.WP is set.

Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-2-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:42:46 +01:00
Lu Baolu
6ca69e5841 iommu/vt-d: Report more information about invalidation errors
When the invalidation queue errors are encountered, dump the information
logged by the VT-d hardware together with the pending queue invalidation
descriptors.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210318005340.187311-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:27:46 +01:00
Kyung Min Park
dec991e472 iommu/vt-d: Disable SVM when ATS/PRI/PASID are not enabled in the device
Currently, the Intel VT-d supports Shared Virtual Memory (SVM) only when
IO page fault is supported. Otherwise, shared memory pages can not be
swapped out and need to be pinned. The device needs the Address Translation
Service (ATS), Page Request Interface (PRI) and Process Address Space
Identifier (PASID) capabilities to be enabled to support IO page fault.

Disable SVM when ATS, PRI and PASID are not enabled in the device.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210314201534.918-1-kyung.min.park@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:23:52 +01:00
Robin Murphy
3542dcb15c iommu/dma: Resurrect the "forcedac" option
In converting intel-iommu over to the common IOMMU DMA ops, it quietly
lost the functionality of its "forcedac" option. Since this is a handy
thing both for testing and for performance optimisation on certain
platforms, reimplement it under the common IOMMU parameter namespace.

For the sake of fixing the inadvertent breakage of the Intel-specific
parameter, remove the dmar_forcedac remnants and hook it up as an alias
while documenting the transition to the new common parameter.

Fixes: c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 10:55:23 +01:00
Zenghui Yu
444d66a23c iommu/vt-d: Fix status code for Allocate/Free PASID command
As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response
Register"), the status code of "No PASID available" error in response to
the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error
in response to the Free PASID command.

We will otherwise see confusing kernel log under the command failure from
guest side. Fix it.

Fixes: 24f27d32ab ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210227073909.432-1-yuzenghui@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-04 13:32:04 +01:00
Joerg Roedel
45e606f272 Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next 2021-02-12 15:27:17 +01:00
Yian Chen
31a75cbbb9 iommu/vt-d: Parse SATC reporting structure
Software should parse every SATC table and all devices in the tables
reported by the BIOS and keep the information in kernel list for further
reference.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Lu Baolu
933fcd01e9 iommu/vt-d: Add iotlb_sync_map callback
Some Intel VT-d hardware implementations don't support memory coherency
for page table walk (presented by the Page-Walk-coherency bit in the
ecap register), so that software must flush the corresponding CPU cache
lines explicitly after each page table entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through iommu_ops callback. As
the result, a single sg mapping may lead to multiple cache line flushes,
which leads to the degradation of I/O performance after the commit
<c588072bba6b5> ("iommu/vt-d: Convert intel iommu driver to the iommu
ops").

Fix this by adding iotlb_sync_map callback and centralizing the clflush
operations after all sg mappings.

Fixes: c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/D81314ED-5673-44A6-B597-090E3CB83EB0@oracle.com/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
[ cel: removed @first_pte, which is no longer used ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/161177763962.1311.15577661784296014186.stgit@manet.1015granger.net
Link: https://lore.kernel.org/r/20210204014401.2846425-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Kyung Min Park
010bf5659e iommu/vt-d: Move capability check code to cap_audit files
Move IOMMU capability check and sanity check code to cap_audit files.
Also implement some helper functions for sanity checks.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Kyung Min Park
ad3d190299 iommu/vt-d: Audit IOMMU Capabilities and add helper functions
Audit IOMMU Capability/Extended Capability and check if the IOMMUs have
the consistent value for features. Report out or scale to the lowest
supported when IOMMU features have incompatibility among IOMMUs.

Report out features when below features are mismatched:
  - First Level 5 Level Paging Support (FL5LP)
  - First Level 1 GByte Page Support (FL1GP)
  - Read Draining (DRD)
  - Write Draining (DWD)
  - Page Selective Invalidation (PSI)
  - Zero Length Read (ZLR)
  - Caching Mode (CM)
  - Protected High/Low-Memory Region (PHMR/PLMR)
  - Required Write-Buffer Flushing (RWBF)
  - Advanced Fault Logging (AFL)
  - RID-PASID Support (RPS)
  - Scalable Mode Page Walk Coherency (SMPWC)
  - First Level Translation Support (FLTS)
  - Second Level Translation Support (SLTS)
  - No Write Flag Support (NWFS)
  - Second Level Accessed/Dirty Support (SLADS)
  - Virtual Command Support (VCS)
  - Scalable Mode Translation Support (SMTS)
  - Device TLB Invalidation Throttle (DIT)
  - Page Drain Support (PDS)
  - Process Address Space ID Support (PASID)
  - Extended Accessed Flag Support (EAFS)
  - Supervisor Request Support (SRS)
  - Execute Request Support (ERS)
  - Page Request Support (PRS)
  - Nested Translation Support (NEST)
  - Snoop Control (SC)
  - Pass Through (PT)
  - Device TLB Support (DT)
  - Queued Invalidation (QI)
  - Page walk Coherency (C)

Set capability to the lowest supported when below features are mismatched:
  - Maximum Address Mask Value (MAMV)
  - Number of Fault Recording Registers (NFR)
  - Second Level Large Page Support (SLLPS)
  - Fault Recording Offset (FRO)
  - Maximum Guest Address Width (MGAW)
  - Supported Adjusted Guest Address Width (SAGAW)
  - Number of Domains supported (NDOMS)
  - Pasid Size Supported (PSS)
  - Maximum Handle Mask Value (MHMV)
  - IOTLB Register Offset (IRO)

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Lu Baolu
e1ed66ac30 iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration]
trace_qi_submit() could be used when interrupt remapping is supported,
but DMA remapping is not. In this case, the following compile error
occurs.

../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync':
../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 'trace_qi_submit';
  did you mean 'ftrace_nmi_exit'? [-Werror=implicit-function-declaration]
   trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1,
   ^~~~~~~~~~~~~~~
   ftrace_nmi_exit

Fixes: f2dd871799 ("iommu/vt-d: Add qi_submit trace event")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130151907.3929148-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-02 14:43:56 +01:00
Lu Baolu
3aa7c62cb7 iommu/vt-d: Use INVALID response code instead of FAILURE
The VT-d IOMMU response RESPONSE_FAILURE for a page request in below
cases:

- When it gets a Page_Request with no PASID;
- When it gets a Page_Request with PASID that is not in use for this
  device.

This is allowed by the spec, but IOMMU driver doesn't support such cases
today. When the device receives RESPONSE_FAILURE, it sends the device
state machine to HALT state. Now if we try to unload the driver, it hangs
since the device doesn't send any outbound transactions to host when the
driver is trying to clear things up. The only possible responses would be
for invalidation requests.

Let's use RESPONSE_INVALID instead for now, so that the device state
machine doesn't enter HALT state.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210126080730.2232859-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-29 09:25:24 +01:00
Lu Baolu
28a77185f1 iommu/vt-d: Clear PRQ overflow only when PRQ is empty
It is incorrect to always clear PRO when it's set w/o first checking
whether the overflow condition has been cleared. Current code assumes
that if an overflow condition occurs it must have been cleared by earlier
loop. However since the code runs in a threaded context, the overflow
condition could occur even after setting the head to the tail under some
extreme condition. To be sane, we should read both head/tail again when
seeing a pending PRO and only clear PRO after all pending PRs have been
handled.

Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/
Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-29 09:25:24 +01:00
Nadav Amit
29b3283972 iommu/vt-d: Do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf017446 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.

Fixes: 13cf017446 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org

Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 13:59:02 +01:00
Lu Baolu
494b3688bb iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid()
An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid()
to check the address alignment. This leads to a lot of spurious kernel
warnings:

[  485.837093] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[  485.837098] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[  492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed
[  492.494147] DMAR: Invalidate non-aligned address 7f7728800000, order 11
[  492.508965] DMAR: Invalidate non-aligned address 7f7728800000, order 11

Fix it by checking the alignment in right way.

Fixes: 288d08e780 ("iommu/vt-d: Handle non-page aligned address")
Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 13:17:08 +01:00
Lu Baolu
a8ce9ebbec iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
The Access/Dirty bits in the first level page table entry will be set
whenever a page table entry was used for address translation or write
permission was successfully translated. This is always true when using
the first-level page table for kernel IOVA. Instead of wasting hardware
cycles to update the certain bits, it's better to set them up at the
beginning.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 11:33:35 +01:00
Lu Baolu
f2dd871799 iommu/vt-d: Add qi_submit trace event
This adds a new trace event to track the submissions of requests to the
invalidation queue. This event will provide the information like:
- IOMMU name
- Invalidation type
- Descriptor raw data

A sample output like:
| qi_submit: iotlb_inv dmar1: 0x100e2 0x0 0x0 0x0
| qi_submit: dev_tlb_inv dmar1: 0x1000000003 0x7ffffffffffff001 0x0 0x0
| qi_submit: iotlb_inv dmar2: 0x800f2 0xf9a00005 0x0 0x0

This will be helpful for queued invalidation related debugging.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210114090400.736104-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 11:32:13 +01:00
Lu Baolu
9872f9bd9d iommu/vt-d: Consolidate duplicate cache invaliation code
The pasid based IOTLB and devTLB invalidation code is duplicate in
several places. Consolidate them by using the common helpers.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210114085021.717041-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 11:30:36 +01:00
Tian Tao
694a1c0ade iommu/vt-d: Fix duplicate included linux/dma-map-ops.h
linux/dma-map-ops.h is included more than once, Remove the one that
isn't necessary.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609118774-10083-1-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-12 16:56:20 +00:00
Lu Baolu
2d6ffc63f1 iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev()
The VT-d hardware will ignore those Addr bits which have been masked by
the AM field in the PASID-based-IOTLB invalidation descriptor. As the
result, if the starting address in the descriptor is not aligned with
the address mask, some IOTLB caches might not invalidate. Hence people
will see below errors.

[ 1093.704661] dmar_fault: 29 callbacks suppressed
[ 1093.704664] DMAR: DRHD: handling fault status reg 3
[ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2
               fault addr 7f81c968d000 [fault reason 113]
               SM: Present bit in first-level paging entry is clear

Fix this by using aligned address for PASID-based-IOTLB invalidation.

Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-12 15:47:54 +00:00
Liu Yi L
7c29ada5e7 iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
iommu_flush_dev_iotlb() is called to invalidate caches on a device but
only loops over the devices which are fully-attached to the domain. For
sub-devices, this is ineffective and can result in invalid caching
entries left on the device.

Fix the missing invalidation by adding a loop over the subdevices and
ensuring that 'domain->has_iotlb_device' is updated when attaching to
subdevices.

Fixes: 67b8e02b5e ("iommu/vt-d: Aux-domain specific domain attach/detach")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-4-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:38:15 +00:00
Liu Yi L
18abda7a2d iommu/vt-d: Fix general protection fault in aux_detach_device()
The aux-domain attach/detach are not tracked, some data structures might
be used after free. This causes general protection faults when multiple
subdevices are created and assigned to a same guest machine:

  | general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI
  | RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0
  | [...]
  | Call Trace:
  |  iommu_aux_detach_device+0x24/0x70
  |  vfio_mdev_detach_domain+0x3b/0x60
  |  ? vfio_mdev_set_domain+0x50/0x50
  |  iommu_group_for_each_dev+0x4f/0x80
  |  vfio_iommu_detach_group.isra.0+0x22/0x30
  |  vfio_iommu_type1_detach_group.cold+0x71/0x211
  |  ? find_exported_symbol_in_section+0x4a/0xd0
  |  ? each_symbol_section+0x28/0x50
  |  __vfio_group_unset_container+0x4d/0x150
  |  vfio_group_try_dissolve_container+0x25/0x30
  |  vfio_group_put_external_user+0x13/0x20
  |  kvm_vfio_group_put_external_user+0x27/0x40 [kvm]
  |  kvm_vfio_destroy+0x45/0xb0 [kvm]
  |  kvm_put_kvm+0x1bb/0x2e0 [kvm]
  |  kvm_vm_release+0x22/0x30 [kvm]
  |  __fput+0xcc/0x260
  |  ____fput+0xe/0x10
  |  task_work_run+0x8f/0xb0
  |  do_exit+0x358/0xaf0
  |  ? wake_up_state+0x10/0x20
  |  ? signal_wake_up_state+0x1a/0x30
  |  do_group_exit+0x47/0xb0
  |  __x64_sys_exit_group+0x18/0x20
  |  do_syscall_64+0x57/0x1d0
  |  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix the crash by tracking the subdevices when attaching and detaching
aux-domains.

Fixes: 67b8e02b5e ("iommu/vt-d: Aux-domain specific domain attach/detach")
Co-developed-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:35:14 +00:00
Liu Yi L
9ad9f45b3b iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.

In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.

Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: Guo Kaijie <Kaijie.Guo@intel.com>
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:34:36 +00:00
Lu Baolu
420d42f6f9 iommu/vt-d: Fix lockdep splat in sva bind()/unbind()
Lock(&iommu->lock) without disabling irq causes lockdep warnings.

========================================================
WARNING: possible irq lock inversion dependency detected
5.11.0-rc1+ #828 Not tainted
--------------------------------------------------------
kworker/0:1H/120 just changed the state of lock:
ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at:
iommu_flush_dev_iotlb.part.0+0x32/0x120
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&iommu->lock){+.+.}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&iommu->lock);
                               local_irq_disable();
                               lock(device_domain_lock);
                               lock(&iommu->lock);
  <Interrupt>
    lock(device_domain_lock);

 *** DEADLOCK ***

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 13:27:14 +00:00
Lu Baolu
1efd17e7ac iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb()
Use IS_ALIGNED() instead. Otherwise, an unaligned address will be ignored.

Fixes: 33cd6e642d ("iommu/vt-d: Flush PASID-based iotlb for iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 13:27:14 +00:00
Dinghao Liu
ff2b46d7cf iommu/intel: Fix memleak in intel_irq_remapping_alloc
When irq_domain_get_irq_data() or irqd_cfg() fails
at i == 0, data allocated by kzalloc() has not been
freed before returning, which leads to memleak.

Fixes: b106ee63ab ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-05 19:12:06 +00:00
Linus Torvalds
19778dd504 IOMMU updates for 5.11
- IOVA allocation optimisations and removal of unused code
 
 - Introduction of DOMAIN_ATTR_IO_PGTABLE_CFG for parameterising the
   page-table of an IOMMU domain
 
 - Support for changing the default domain type in sysfs
 
 - Optimisation to the way in which identity-mapped regions are created
 
 - Driver updates:
   * Arm SMMU updates, including continued work on Shared Virtual Memory
   * Tegra SMMU updates, including support for PCI devices
   * Intel VT-D updates, including conversion to the IOMMU-DMA API
 
 - Cleanup, kerneldoc and minor refactoring
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/XWy8QHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNPejB/46QsXATkWt7hbDPIxlUvzUG8VP/FBNJ6A3
 /4Z+4KBXR3zhvZJOEqTarnm6Uc22tWkYpNS3QAOuRW0EfVeD8H+og4SOA2iri5tR
 x3GZUCng93APWpHdDtJP7kP/xuU47JsBblY/Ip9aJKYoXi9c9svtssAqKr008wxr
 knv/xv/awQ0O7CNc3gAoz7mUagQxG/no+HMXMT3Fz9KWRzzvTi6s+7ZDm2faI0hO
 GEJygsKbXxe1qbfeGqKTP/67EJVqjTGsLCF2zMogbnnD7DxadJ2hP0oNg5tvldT/
 oDj9YWG6oLMfIVCwDVQXuWNfKxd7RGORMbYwKNAaRSvmkli6625h
 =KFOO
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull IOMMU updates from Will Deacon:
 "There's a good mixture of improvements to the core code and driver
  changes across the board.

  One thing worth pointing out is that this includes a quirk to work
  around behaviour in the i915 driver (see 65f746e828 ("iommu: Add
  quirk for Intel graphic devices in map_sg")), which otherwise
  interacts badly with the conversion of the intel IOMMU driver over to
  the DMA-IOMMU APU but has being fixed properly in the DRM tree.

  We'll revert the quirk later this cycle once we've confirmed that
  things don't fall apart without it.

  Summary:

   - IOVA allocation optimisations and removal of unused code

   - Introduction of DOMAIN_ATTR_IO_PGTABLE_CFG for parameterising the
     page-table of an IOMMU domain

   - Support for changing the default domain type in sysfs

   - Optimisation to the way in which identity-mapped regions are
     created

   - Driver updates:
       * Arm SMMU updates, including continued work on Shared Virtual
         Memory
       * Tegra SMMU updates, including support for PCI devices
       * Intel VT-D updates, including conversion to the IOMMU-DMA API

   - Cleanup, kerneldoc and minor refactoring"

* tag 'iommu-updates-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (50 commits)
  iommu/amd: Add sanity check for interrupt remapping table length macros
  dma-iommu: remove __iommu_dma_mmap
  iommu/io-pgtable: Remove tlb_flush_leaf
  iommu: Stop exporting free_iova_mem()
  iommu: Stop exporting alloc_iova_mem()
  iommu: Delete split_and_remove_iova()
  iommu/io-pgtable-arm: Remove unused 'level' parameter from iopte_type() macro
  iommu: Defer the early return in arm_(v7s/lpae)_map
  iommu: Improve the performance for direct_mapping
  iommu: avoid taking iova_rbtree_lock twice
  iommu/vt-d: Avoid GFP_ATOMIC where it is not needed
  iommu/vt-d: Remove set but not used variable
  iommu: return error code when it can't get group
  iommu: Fix htmldocs warnings in sysfs-kernel-iommu_groups
  iommu: arm-smmu-impl: Add a space before open parenthesis
  iommu: arm-smmu-impl: Use table to list QCOM implementations
  iommu/arm-smmu: Move non-strict mode to use io_pgtable_domain_attr
  iommu/arm-smmu: Add support for pagetable config domain attribute
  iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
  iommu: Take lock before reading iommu group default domain type
  ...
2020-12-16 13:58:47 -08:00
Linus Torvalds
148842c98a Yet another large set of x86 interrupt management updates:
- Simplification and distangling of the MSI related functionality
 
    - Let IO/APIC construct the RTE entries from an MSI message instead of
      having IO/APIC specific code in the interrupt remapping drivers
 
    - Make the retrieval of the parent interrupt domain (vector or remap
      unit) less hardcoded and use the relevant irqdomain callbacks for
      selection.
 
    - Allow the handling of more than 255 CPUs without a virtualized IOMMU
      when the hypervisor supports it. This has made been possible by the
      above modifications and also simplifies the existing workaround in the
      HyperV specific virtual IOMMU.
 
    - Cleanup of the historical timer_works() irq flags related
      inconsistencies.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/Xxd8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYpOD/9C5TppNlPMUyx2SflH6bxt37pJEpln
 +hYTKsk+jSThntr5mfj+GifGvgmHOVBTGnlDUnUnrpN7TQmLFBzwTOtnBLW53AO2
 16/u0+Xci4LNCtEkaymf0Rq4MfsfriXHPJr0A/CnZ0tpHSf5QKHAiitSiGujdMlb
 gbq43+zXd+jNkH7vkOLPX/7dZVI1hNASQEevJu2tRR4xYTuXFdBxvLgYkHtYKKrK
 R1sbs6nI6yIzye2u4m4xGu29SxgUft+zdUf+UehJKM3yFmf51d9qpkX+kLaTWuaL
 VPsMItbn0kdvxwXQWO6DYnIAAnVKCklyHQJTZCoNq9Fe91OoByak1CEVspSOa1av
 JmycNSch4IYWasR4vVCB1gbb+V9SejcKu5SV3CDrEDqwkOIpfiqpriUXSCJTLlFd
 QOEDOLuuk/79Qs//J/tb/nJ4IuKv8WPudDfIlMro8wUsAr67DjD4mnXprZ+svwWx
 Ct/0/Memk+BSa0cw6pvg24BUZGN6zrufkBu2HKT9GOXRUdNkdLkiPhT8mK4T/O0l
 f90QCLjPSOJ/K/pLEWdUHEPmgC5Q9RsXOmwVGqX+RbjfP7mYTJXlmWnBb+cFNch0
 xFIH3SxVGylxxT06NX3SkvinrHj10CoAlmneefBlLtx6dF+2P84DAMZSF0OFToVI
 c2KMg5zoesI4bg==
 =8Gfs
 -----END PGP SIGNATURE-----

Merge tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 apic updates from Thomas Gleixner:
 "Yet another large set of x86 interrupt management updates:

   - Simplification and distangling of the MSI related functionality

   - Let IO/APIC construct the RTE entries from an MSI message instead
     of having IO/APIC specific code in the interrupt remapping drivers

   - Make the retrieval of the parent interrupt domain (vector or remap
     unit) less hardcoded and use the relevant irqdomain callbacks for
     selection.

   - Allow the handling of more than 255 CPUs without a virtualized
     IOMMU when the hypervisor supports it. This has made been possible
     by the above modifications and also simplifies the existing
     workaround in the HyperV specific virtual IOMMU.

   - Cleanup of the historical timer_works() irq flags related
     inconsistencies"

* tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/ioapic: Cleanup the timer_works() irqflags mess
  iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select()
  iommu/amd: Fix IOMMU interrupt generation in X2APIC mode
  iommu/amd: Don't register interrupt remapping irqdomain when IR is disabled
  iommu/amd: Fix union of bitfields in intcapxt support
  x86/ioapic: Correct the PCI/ISA trigger type selection
  x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
  x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  x86/kvm: Enable 15-bit extension when KVM_FEATURE_MSI_EXT_DEST_ID detected
  iommu/hyper-v: Disable IRQ pseudo-remapping if 15 bit APIC IDs are available
  x86/apic: Support 15 bits of APIC ID in MSI where available
  x86/ioapic: Handle Extended Destination ID field in RTE
  iommu/vt-d: Simplify intel_irq_remapping_select()
  x86: Kill all traces of irq_remapping_get_irq_domain()
  x86/ioapic: Use irq_find_matching_fwspec() to find remapping irqdomain
  x86/hpet: Use irq_find_matching_fwspec() to find remapping irqdomain
  iommu/hyper-v: Implement select() method on remapping irqdomain
  iommu/vt-d: Implement select() method on remapping irqdomain
  iommu/amd: Implement select() method on remapping irqdomain
  x86/apic: Add select() method on vector irqdomain
  ...
2020-12-14 18:59:53 -08:00
Will Deacon
c74009f529 Merge branch 'for-next/iommu/fixes' into for-next/iommu/core
Merge in IOMMU fixes for 5.10 in order to resolve conflicts against the
queue for 5.11.

* for-next/iommu/fixes:
  iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
  iommu/vt-d: Don't read VCCAP register unless it exists
  x86/tboot: Don't disable swiotlb when iommu is forced on
  iommu: Check return of __iommu_attach_device()
  arm-smmu-qcom: Ensure the qcom_scm driver has finished probing
  iommu/amd: Enforce 4k mapping for certain IOMMU data structures
  MAINTAINERS: Temporarily add myself to the IOMMU entry
  iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set
  iommu/vt-d: Avoid panic if iommu init fails in tboot system
  iommu/vt-d: Cure VF irqdomain hickup
  x86/platform/uv: Fix copied UV5 output archtype
  x86/platform/uv: Drop last traces of uv_flush_tlb_others
2020-12-08 15:21:49 +00:00
Will Deacon
113eb4ce4f Merge branch 'for-next/iommu/vt-d' into for-next/iommu/core
Intel VT-D updates for 5.11. The main thing here is converting the code
over to the iommu-dma API, which required some improvements to the core
code to preserve existing functionality.

* for-next/iommu/vt-d:
  iommu/vt-d: Avoid GFP_ATOMIC where it is not needed
  iommu/vt-d: Remove set but not used variable
  iommu/vt-d: Cleanup after converting to dma-iommu ops
  iommu/vt-d: Convert intel iommu driver to the iommu ops
  iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
  iommu: Add quirk for Intel graphic devices in map_sg
  iommu: Allow the dma-iommu api to use bounce buffers
  iommu: Add iommu_dma_free_cpu_cached_iovas()
  iommu: Handle freelists when using deferred flushing in iommu drivers
  iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM
2020-12-08 15:11:58 +00:00
Will Deacon
a5f12de3ec Merge branch 'for-next/iommu/svm' into for-next/iommu/core
More steps along the way to Shared Virtual {Addressing, Memory} support
for Arm's SMMUv3, including the addition of a helper library that can be
shared amongst other IOMMU implementations wishing to support this
feature.

* for-next/iommu/svm:
  iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
  iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
  iommu/sva: Add PASID helpers
  iommu/ioasid: Add ioasid references
2020-12-08 15:07:49 +00:00
Christophe JAILLET
33e0715710 iommu/vt-d: Avoid GFP_ATOMIC where it is not needed
There is no reason to use GFP_ATOMIC in a 'suspend' function.
Use GFP_KERNEL instead to give more opportunities to allocate the
requested memory.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201030182630.5154-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201201013149.2466272-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-12-01 15:02:20 +00:00
Linus Torvalds
6adf33a5e4 iommu fixes for -rc6
- Fix intel iommu driver when running on devices without VCCAP_REG
 
 - Fix swiotlb and "iommu=pt" interaction under TXT (tboot)
 
 - Fix missing return value check during device probe()
 
 - Fix probe ordering for Qualcomm SMMU implementation
 
 - Ensure page-sized mappings are used for AMD IOMMU buffers with SNP RMP
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/A3msQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNAI6B/9/MLjurPmQrSusq9Y7dhnGR7aahtICLAwR
 UDZHebGTeVxNqtTklQVl/1qgcY7DxU40LAO1777MM7eHOW1FnlCAWrkTo6BBQsGx
 U1FKegOJ/0eVHFPtFDfM7IA2skeLwlZW+hywNLAksme5mtd6iZG9yQLlDFqAjxL7
 v8uXgfHFn6Z2MvMc2O+IeKTtflIwPek/6rYuaEf7UknA6ZYPAD3hnu9i1RTEuUAA
 h2PoVrrJ/KefEsCMUIq2jwMTsSvxohDH8ClGK9b6h74J2CLKKuhALSgABRAmsEL9
 7w5TVdtMQ85n3ccnXyT4RBQ+O/eVtmsKSdfeAbFI4nG9e9j7YE1L
 =BI1u
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull iommu fixes from Will Deacon:
 "Here's another round of IOMMU fixes for -rc6 consisting mainly of a
  bunch of independent driver fixes. Thomas agreed for me to take the
  x86 'tboot' fix here, as it fixes a regression introduced by a vt-d
  change.

   - Fix intel iommu driver when running on devices without VCCAP_REG

   - Fix swiotlb and "iommu=pt" interaction under TXT (tboot)

   - Fix missing return value check during device probe()

   - Fix probe ordering for Qualcomm SMMU implementation

   - Ensure page-sized mappings are used for AMD IOMMU buffers with SNP
     RMP"

* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  iommu/vt-d: Don't read VCCAP register unless it exists
  x86/tboot: Don't disable swiotlb when iommu is forced on
  iommu: Check return of __iommu_attach_device()
  arm-smmu-qcom: Ensure the qcom_scm driver has finished probing
  iommu/amd: Enforce 4k mapping for certain IOMMU data structures
2020-11-27 10:41:19 -08:00
Lu Baolu
405a43cc00 iommu/vt-d: Remove set but not used variable
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/iommu/intel/iommu.c:5643:27: warning: variable 'last_pfn' set but not used [-Wunused-but-set-variable]
5643 |  unsigned long start_pfn, last_pfn;
     |                           ^~~~~~~~

This variable is never used, so remove it.

Fixes: 2a2b8eaa5b ("iommu: Handle freelists when using deferred flushing in iommu drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201127013308.1833610-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-27 09:52:28 +00:00
David Woodhouse
d76b42e927 iommu/vt-d: Don't read VCCAP register unless it exists
My virtual IOMMU implementation is whining that the guest is reading a
register that doesn't exist. Only read the VCCAP_REG if the corresponding
capability is set in ECAP_REG to indicate that it actually exists.

Fixes: 3375303e82 ("iommu/vt-d: Add custom allocator for IOASID")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Cc: stable@vger.kernel.org # v5.7+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-26 14:50:24 +00:00
Lu Baolu
28b41e2c6a iommu: Move def_domain type check for untrusted device into core
So that the vendor iommu drivers are no more required to provide the
def_domain_type callback to always isolate the untrusted devices.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/linux-iommu/243ce89c33fe4b9da4c56ba35acebf81@huawei.com/
Link: https://lore.kernel.org/r/20201124130604.2912899-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:14:33 +00:00
Lu Baolu
58a8bb3949 iommu/vt-d: Cleanup after converting to dma-iommu ops
Some cleanups after converting the driver to use dma-iommu ops.
- Remove nobounce option;
- Cleanup and simplify the path in domain mapping.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-8-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:49 +00:00
Tom Murphy
c588072bba iommu/vt-d: Convert intel iommu driver to the iommu ops
Convert the intel iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the intel iommu driver.

Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-7-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:49 +00:00
Lu Baolu
c062db039f iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
The iommu-dma constrains IOVA allocation based on the domain geometry
that the driver reports. Update domain geometry everytime a domain is
attached to or detached from a device.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:48 +00:00
Tom Murphy
2a2b8eaa5b iommu: Handle freelists when using deferred flushing in iommu drivers
Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.

This is useful for iommu drivers (in this case the intel iommu driver)
which need to wait for the ioTLB to be flushed before newly
free/unmapped page table pages can be freed. This way we can still batch
ioTLB free operations and handle the freelists.

Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:48 +00:00
Jean-Philippe Brucker
cb4789b0d1 iommu/ioasid: Add ioasid references
Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference number is zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-23 14:16:55 +00:00
Will Deacon
66930e7e1e Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb into for-next/iommu/vt-d
Merge swiotlb updates from Konrad, as we depend on the updated function
prototype for swiotlb_tbl_map_single(), which dropped the 'tbl_dma_addr'
argument in -rc4.

* 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
  swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
2020-11-23 10:46:43 +00:00
Linus Torvalds
fc8299f9f3 iommu fixes for -rc5
- Fix boot when intel iommu initialisation fails under TXT (tboot)
 
 - Fix intel iommu compilation error when DMAR is enabled without ATS
 
 - Temporarily update IOMMU MAINTAINERs entry
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+3p6gQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNPowCADBKq5PBRwqIM1tmntRu5rePf/WuB1BotlJ
 cUUR8dIxFDvKMeGvN4mDbn/5jjL1+TlD7sQqFh+gcWJ9tImpAMzp1ENbg71isl5o
 Ej21X9YhjfiIsKpTrNGxcetCkYaR2tp8Z4/WzSQaO+IGB578dQ9VwiwSnDyFdWUb
 1Qcj712ainYvKjbq4NyL80lPm9v43OV8QZ73WeelzBjE2UcDFAV0Xl4BIX8uuGtv
 I1x03JfgzmzhVBmpj5HUGPsSopMDdo5K4vlcqscqSqvBY90iHD5fgRF4ZatTjR1t
 PVM/1+c9vdJIWSxSt1hsB8DPtqXVOvGjzrLW/h63rxyWsISwkAq0
 =DNX/
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull iommu fixes from Will Deacon:
 "Two straightforward vt-d fixes:

   - Fix boot when intel iommu initialisation fails under TXT (tboot)

   - Fix intel iommu compilation error when DMAR is enabled without ATS

  and temporarily update IOMMU MAINTAINERs entry"

* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  MAINTAINERS: Temporarily add myself to the IOMMU entry
  iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set
  iommu/vt-d: Avoid panic if iommu init fails in tboot system
2020-11-20 10:20:16 -08:00
Lu Baolu
3645a34f5b iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set
Fix the compile error below (CONFIG_PCI_ATS not set):

drivers/iommu/intel/dmar.c: In function ‘vf_inherit_msi_domain’:
drivers/iommu/intel/dmar.c:338:59: error: ‘struct pci_dev’ has no member named ‘physfn’; did you mean ‘is_physfn’?
  338 |  dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
      |                                                           ^~~~~~
      |                                                           is_physfn

Fixes: ff828729be ("iommu/vt-d: Cure VF irqdomain hickup")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-iommu/CAMuHMdXA7wfJovmfSH2nbAhN0cPyCiFHodTvg4a8Hm9rx5Dj-w@mail.gmail.com/
Link: https://lore.kernel.org/r/20201119055119.2862701-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-19 09:52:26 +00:00
Will Deacon
388255ce95 A small set of fixes for x86:
- Cure the fallout from the MSI irqdomain overhaul which missed that the
    Intel IOMMU does not register virtual function devices and therefore
    never reaches the point where the MSI interrupt domain is assigned. This
    makes the VF devices use the non-remapped MSI domain which is trapped by
    the IOMMU/remap unit.
 
  - Remove an extra space in the SGI_UV architecture type procfs output for
    UV5.
 
  - Remove a unused function which was missed when removing the UV BAU TLB
    shootdown handler.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJi0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVWxD/9Tq4W6Kniln7mtoEWHRvHRceiiGcS3
 MocvqurhoJwirH4F2gkvCegTBy0r3FdUORy3OMmChVs6nb8XpPpso84SANCRePWp
 JZezpVwLSNC4O1/ZCg1Kjj4eUpzLB/UjUUQV9RsjL5wyQEhfCZgb1D40yLM/2dj5
 SkVm/EAqWuQNtYe/jqAOwTX/7mV+k2QEmKCNOigM13R9EWgu6a4J8ta1gtNSbwvN
 jWMW+M1KjZ76pfRK+y4OpbuFixteSzhSWYPITSGwQz4IpQ+Ty2Rv0zzjidmDnAR+
 Q73cup0dretdVnVDRpMwDc06dBCmt/rbN50w4yGU0YFRFDgjGc8sIbQzuIP81nEQ
 XY4l4rcBgyVufFsLrRpQxu1iYPFrcgU38W1kRkkJ3Kl/rY1a2ZU7sLE4kt4Oh55W
 A9KCmsfqP1PCYppjAQ0QT4NOp4YtecPvAU4UcBOb722DDBd8TfhLWWGw2yG57Q/d
 Wnu8xCJGy7BaLHLGGGseAft+D4aNnCjKC3jgMyvNtRDXaV2cK2Kdd6ehMlWVUapD
 xfLlKXE+igXMyoWJIWjTXQJs4dpKu6QpJCPiorwEZ8rmNaRfxsWEJVbeYwEkmUke
 bMoBBSCbZT86WVOYhI8WtrIemraY0mMYrrcE03M96HU3eYB8BV92KrIzZWThupcQ
 ZqkZbqCZm3vfHA==
 =X/P+
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-next/iommu/fixes

Pull in x86 fixes from Thomas, as they include a change to the Intel DMAR
code on which we depend:

* tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  iommu/vt-d: Cure VF irqdomain hickup
  x86/platform/uv: Fix copied UV5 output archtype
  x86/platform/uv: Drop last traces of uv_flush_tlb_others
2020-11-19 09:50:06 +00:00
Zhenzhong Duan
4d213e76a3 iommu/vt-d: Avoid panic if iommu init fails in tboot system
"intel_iommu=off" command line is used to disable iommu but iommu is force
enabled in a tboot system for security reason.

However for better performance on high speed network device, a new option
"intel_iommu=tboot_noforce" is introduced to disable the force on.

By default kernel should panic if iommu init fail in tboot for security
reason, but it's unnecessory if we use "intel_iommu=tboot_noforce,off".

Fix the code setting force_on and move intel_iommu_tboot_noforce
from tboot code to intel iommu code.

Fixes: 7304e8f28b ("iommu/vt-d: Correctly disable Intel IOMMU force on")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Tested-by: Lukasz Hawrylko <lukasz.hawrylko@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201110071908.3133-1-zhenzhong.duan@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-18 13:09:07 +00:00
Lukas Bulwahn
68dd9d89ea iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM
Commit 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM.
This function uses the dedicated static variable inv_type_granu_table
and functions to_vtd_granularity() and to_vtd_size().

These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence,
make CC=clang W=1 warns with an -Wunused-function warning.

Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM.

Fixes: 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-17 22:29:49 +00:00
Linus Torvalds
326fd6db61 A small set of fixes for x86:
- Cure the fallout from the MSI irqdomain overhaul which missed that the
    Intel IOMMU does not register virtual function devices and therefore
    never reaches the point where the MSI interrupt domain is assigned. This
    makes the VF devices use the non-remapped MSI domain which is trapped by
    the IOMMU/remap unit.
 
  - Remove an extra space in the SGI_UV architecture type procfs output for
    UV5.
 
  - Remove a unused function which was missed when removing the UV BAU TLB
    shootdown handler.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJi0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVWxD/9Tq4W6Kniln7mtoEWHRvHRceiiGcS3
 MocvqurhoJwirH4F2gkvCegTBy0r3FdUORy3OMmChVs6nb8XpPpso84SANCRePWp
 JZezpVwLSNC4O1/ZCg1Kjj4eUpzLB/UjUUQV9RsjL5wyQEhfCZgb1D40yLM/2dj5
 SkVm/EAqWuQNtYe/jqAOwTX/7mV+k2QEmKCNOigM13R9EWgu6a4J8ta1gtNSbwvN
 jWMW+M1KjZ76pfRK+y4OpbuFixteSzhSWYPITSGwQz4IpQ+Ty2Rv0zzjidmDnAR+
 Q73cup0dretdVnVDRpMwDc06dBCmt/rbN50w4yGU0YFRFDgjGc8sIbQzuIP81nEQ
 XY4l4rcBgyVufFsLrRpQxu1iYPFrcgU38W1kRkkJ3Kl/rY1a2ZU7sLE4kt4Oh55W
 A9KCmsfqP1PCYppjAQ0QT4NOp4YtecPvAU4UcBOb722DDBd8TfhLWWGw2yG57Q/d
 Wnu8xCJGy7BaLHLGGGseAft+D4aNnCjKC3jgMyvNtRDXaV2cK2Kdd6ehMlWVUapD
 xfLlKXE+igXMyoWJIWjTXQJs4dpKu6QpJCPiorwEZ8rmNaRfxsWEJVbeYwEkmUke
 bMoBBSCbZT86WVOYhI8WtrIemraY0mMYrrcE03M96HU3eYB8BV92KrIzZWThupcQ
 ZqkZbqCZm3vfHA==
 =X/P+
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A small set of fixes for x86:

   - Cure the fallout from the MSI irqdomain overhaul which missed that
     the Intel IOMMU does not register virtual function devices and
     therefore never reaches the point where the MSI interrupt domain is
     assigned. This made the VF devices use the non-remapped MSI domain
     which is trapped by the IOMMU/remap unit

   - Remove an extra space in the SGI_UV architecture type procfs output
     for UV5

   - Remove a unused function which was missed when removing the UV BAU
     TLB shootdown handler"

* tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  iommu/vt-d: Cure VF irqdomain hickup
  x86/platform/uv: Fix copied UV5 output archtype
  x86/platform/uv: Drop last traces of uv_flush_tlb_others
2020-11-15 09:49:56 -08:00
Thomas Gleixner
ff828729be iommu/vt-d: Cure VF irqdomain hickup
The recent changes to store the MSI irqdomain pointer in struct device
missed that Intel DMAR does not register virtual function devices.  Due to
that a VF device gets the plain PCI-MSI domain assigned and then issues
compat MSI messages which get caught by the interrupt remapping unit.

Cure that by inheriting the irq domain from the physical function
device.

Ideally the irqdomain would be associated to the bus, but DMAR can have
multiple units and therefore irqdomains on a single bus. The VF 'bus' could
of course inherit the domain from the PF, but that'd be yet another x86
oddity.

Fixes: 85a8dfc57a ("iommm/vt-d: Store irq domain in struct device")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Link: https://lore.kernel.org/r/draft-87eekymlpz.fsf@nanos.tec.linutronix.de
2020-11-13 12:00:40 +01:00
Linus Torvalds
3d5e28bff7 Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fixes from Konrad Rzeszutek Wilk:
 "Two tiny fixes for issues that make drivers under Xen unhappy under
  certain conditions"

* 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
  swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
2020-11-11 14:15:06 -08:00
Liu, Yi L
71cd8e2d16 iommu/vt-d: Fix a bug for PDP check in prq_event_thread
In prq_event_thread(), the QI_PGRP_PDP is wrongly set by
'req->pasid_present' which should be replaced to
'req->priv_data_present'.

Fixes: 5b438f4ba3 ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1604025444-6954-3-git-send-email-yi.y.sun@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-11-03 14:37:02 +01:00
Liu Yi L
eea4e29ab8 iommu/vt-d: Fix sid not set issue in intel_svm_bind_gpasid()
Should get correct sid and set it into sdev. Because we execute
'sdev->sid != req->rid' in the loop of prq_event_thread().

Fixes: eb8d93ea3c ("iommu/vt-d: Report page request faults for guest SVA")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1604025444-6954-2-git-send-email-yi.y.sun@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-11-03 14:37:02 +01:00
Lu Baolu
6097df457a iommu/vt-d: Fix kernel NULL pointer dereference in find_domain()
If calling find_domain() for a device which hasn't been probed by the
iommu core, below kernel NULL pointer dereference issue happens.

[  362.736947] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  362.743953] #PF: supervisor read access in kernel mode
[  362.749115] #PF: error_code(0x0000) - not-present page
[  362.754278] PGD 0 P4D 0
[  362.756843] Oops: 0000 [#1] SMP NOPTI
[  362.760528] CPU: 0 PID: 844 Comm: cat Not tainted 5.9.0-rc4-intel-next+ #1
[  362.767428] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake
               U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3384.A02.1909200816
               09/20/2019
[  362.781109] RIP: 0010:find_domain+0xd/0x40
[  362.785234] Code: 48 81 fb 60 28 d9 b2 75 de 5b 41 5c 41 5d 5d c3 0f 1f 00 66
                     2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 e0 02 00
                     00 55 <48> 8b 40 38 48 89 e5 48 83 f8 fe 0f 94 c1 48 85 ff
                     0f 94 c2 08 d1
[  362.804041] RSP: 0018:ffffb09cc1f0bd38 EFLAGS: 00010046
[  362.809292] RAX: 0000000000000000 RBX: ffff905b98e4fac8 RCX: 0000000000000000
[  362.816452] RDX: 0000000000000001 RSI: ffff905b98e4fac8 RDI: ffff905b9ccd40d0
[  362.823617] RBP: ffffb09cc1f0bda0 R08: ffffb09cc1f0bd48 R09: 000000000000000f
[  362.830778] R10: ffffffffb266c080 R11: ffff905b9042602d R12: ffff905b98e4fac8
[  362.837944] R13: ffffb09cc1f0bd48 R14: ffff905b9ccd40d0 R15: ffff905b98e4fac8
[  362.845108] FS:  00007f8485460740(0000) GS:ffff905b9fc00000(0000)
               knlGS:0000000000000000
[  362.853227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.858996] CR2: 0000000000000038 CR3: 00000004627a6003 CR4: 0000000000770ef0
[  362.866161] PKRU: fffffffc
[  362.868890] Call Trace:
[  362.871363]  ? show_device_domain_translation+0x32/0x100
[  362.876700]  ? bind_store+0x110/0x110
[  362.880387]  ? klist_next+0x91/0x120
[  362.883987]  ? domain_translation_struct_show+0x50/0x50
[  362.889237]  bus_for_each_dev+0x79/0xc0
[  362.893121]  domain_translation_struct_show+0x36/0x50
[  362.898204]  seq_read+0x135/0x410
[  362.901545]  ? handle_mm_fault+0xeb8/0x1750
[  362.905755]  full_proxy_read+0x5c/0x90
[  362.909526]  vfs_read+0xa6/0x190
[  362.912782]  ksys_read+0x61/0xe0
[  362.916037]  __x64_sys_read+0x1a/0x20
[  362.919725]  do_syscall_64+0x37/0x80
[  362.923329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  362.928405] RIP: 0033:0x7f84855c5e95

Filter out those devices to avoid such error.

Fixes: e2726daea5 ("iommu/vt-d: debugfs: Add support to show page table internals")
Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org#v5.6+
Link: https://lore.kernel.org/r/20201028070725.24979-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-11-03 14:30:10 +01:00
Christoph Hellwig
fc0021aa34 swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
The tbl_dma_addr argument is used to check the DMA boundary for the
allocations, and thus needs to be a dma_addr_t.  swiotlb-xen instead
passed a physical address, which could lead to incorrect results for
strange offsets.  Fix this by removing the parameter entirely and hard
code the DMA address for io_tlb_start instead.

Fixes: 91ffe4ad53 ("swiotlb-xen: introduce phys_to_dma/dma_to_phys translations")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2020-11-02 10:10:39 -05:00
David Woodhouse
79eb3581bc iommu/vt-d: Simplify intel_irq_remapping_select()
Now that the old get_irq_domain() method has gone, consolidate on just the
map_XXX_to_iommu() functions.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-31-dwmw2@infradead.org
2020-10-28 20:26:28 +01:00
David Woodhouse
ed381fca47 x86: Kill all traces of irq_remapping_get_irq_domain()
All users are converted to use the fwspec based parent domain lookup.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-30-dwmw2@infradead.org
2020-10-28 20:26:28 +01:00
David Woodhouse
a87fb465ff iommu/vt-d: Implement select() method on remapping irqdomain
Preparatory for removing irq_remapping_get_irq_domain()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-26-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
David Woodhouse
5d5a971338 x86/ioapic: Generate RTE directly from parent irqchip's MSI message
The I/O-APIC generates an MSI cycle with address/data bits taken from its
Redirection Table Entry in some combination which used to make sense, but
now is just a bunch of bits which get passed through in some seemingly
arbitrary order.

Instead of making IRQ remapping drivers directly frob the I/OA-PIC RTE, let
them just do their job and generate an MSI message. The bit swizzling to
turn that MSI message into the I/O-APIC's RTE is the same in all cases,
since it's a function of the I/O-APIC hardware. The IRQ remappers have no
real need to get involved with that.

The only slight caveat is that the I/OAPIC is interpreting some of those
fields too, and it does want the 'vector' field to be unique to make EOI
work. The AMD IOMMU happens to put its IRTE index in the bits that the
I/O-APIC thinks are the vector field, and accommodates this requirement by
reserving the first 32 indices for the I/O-APIC.  The Intel IOMMU doesn't
actually use the bits that the I/O-APIC thinks are the vector field, so it
fills in the 'pin' value there instead.

[ tglx: Replaced the unreadably macro maze with the cleaned up RTE/msi_msg
  	bitfields and added commentry to explain the mapping magic ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-22-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
Thomas Gleixner
341b4a7211 x86/ioapic: Cleanup IO/APIC route entry structs
Having two seperate structs for the I/O-APIC RTE entries (non-remapped and
DMAR remapped) requires type casts and makes it hard to map.

Combine them in IO_APIC_routing_entry by defining a union of two 64bit
bitfields. Use naming which reflects which bits are shared and which bits
are actually different for the operating modes.

[dwmw2: Fix it up and finish the job, pulling the 32-bit w1,w2 words for
        register access into the same union and eliminating a few more
        places where bits were accessed through masks and shifts.]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-21-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
Thomas Gleixner
a27dca645d x86/io_apic: Cleanup trigger/polarity helpers
'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
the trigger type (edge/level) and the active low/high configuration. While
there are defines for initializing these variables and struct members, they
are not used consequently and the meaning of 'trigger' and 'polarity' is
opaque and confusing at best.

Rename them to 'is_level' and 'active_low' and make them boolean in various
structs so it's entirely clear what the meaning is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-20-dwmw2@infradead.org
2020-10-28 20:26:26 +01:00
Thomas Gleixner
5c0d0e2cc6 iommu/intel: Use msi_msg shadow structs
Use the bitfields in the x86 shadow struct to compose the MSI message.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-14-dwmw2@infradead.org
2020-10-28 20:26:25 +01:00
Thomas Gleixner
8c44963b60 x86/apic: Cleanup destination mode
apic::irq_dest_mode is actually a boolean, but defined as u32 and named in
a way which does not explain what it means.

Make it a boolean and rename it to 'dest_mode_logical'

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-9-dwmw2@infradead.org
2020-10-28 20:26:25 +01:00
Thomas Gleixner
721612994f x86/apic: Cleanup delivery mode defines
The enum ioapic_irq_destination_types and the enumerated constants starting
with 'dest_' are gross misnomers because they describe the delivery mode.

Rename then enum and the constants so they actually make sense.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-6-dwmw2@infradead.org
2020-10-28 20:26:24 +01:00
Linus Torvalds
5c7e3f3f5c IOMMU Fix for Linux v5.10:
- Fix a build regression with !CONFIG_IOMMU_API
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl+Ns1UACgkQK/BELZcB
 GuPo3Q//RxTLMdvQ6Le3VB3ncbmYBKFWTEexStqrurWjIJbbQI/nZCJIGIkOXqi6
 HfJLGbn4QxGmYT0GXLqOZCyEgs8cicXgVBrWrWJepleZ9tY9kYMk9EUf8CFvWvBj
 Kg3B71q1sVflmG4OGxyxXcqZSD105oeOhAg8nDLxrqSgpEd/3lZfMKbWth/WP9GP
 qknxmhmlWZ3QHLdPvHNG2wuGOOr7jcAn8dSJurMaADF4F7Zk0JE912JBWXORgffU
 IrsRb8WxGP6wPp2veMKu5HWw7jSKrdpf+5N5gft0pdiF3AY2diJuPzBdvvyJDtGc
 MKLWL06H7wZsUygt+L9mW8PEH+VbgwErB83UjHgSsDHJfcD3xfXEpUslkRY5nldB
 73pXPLttGSVTf1xzN7YUANtcBiYaCoPRgdrHF9gs2/KSCUjcmBOViffsnpC32239
 NmJ71VvsWGFj2tRGoKsMxTxxWV9bfrPxdaa8W89HONvkcG2G1fh8escFEJtqxMUT
 F5f5rkf3O/z6/ikybqz/g1wsLB45zjP0StFTzvLquNwUeXkRdJqgurd6SzUDAFZA
 pmmdPQXMzpY7GxBBbrWhUGxJgXoz4DvYRscs1dXECYAtA4Jq5cBk9Ag1PSJRvkiy
 0X902jtauH1w76sYue2SJLI5XxbS8DDHGm79HrLR6ZhOr7gplAw=
 =og0e
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fix from Joerg Roedel:
 "Fix a build regression with !CONFIG_IOMMU_API"

* tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
2020-10-20 09:35:06 -07:00
Bartosz Golaszewski
9def3b1a07 iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
Since commit c40aaaac10 ("iommu/vt-d: Gracefully handle DMAR units
with no supported address widths") dmar.c needs struct iommu_device to
be selected. We can drop this dependency by not dereferencing struct
iommu_device if IOMMU_API is not selected and by reusing the information
stored in iommu->drhd->ignored instead.

This fixes the following build error when IOMMU_API is not selected:

drivers/iommu/intel/dmar.c: In function ‘free_iommu’:
drivers/iommu/intel/dmar.c:1139:41: error: ‘struct iommu_device’ has no member named ‘ops’
 1139 |  if (intel_iommu_enabled && iommu->iommu.ops) {
                                                ^

Fixes: c40aaaac10 ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20201013073055.11262-1-brgl@bgdev.pl
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-19 14:16:02 +02:00
Linus Torvalds
5a32c3413d dma-mapping updates for 5.10
- rework the non-coherent DMA allocator
  - move private definitions out of <linux/dma-mapping.h>
  - lower CMA_ALIGNMENT (Paul Cercueil)
  - remove the omap1 dma address translation in favor of the common
    code
  - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
  - support per-node DMA CMA areas (Barry Song)
  - increase the default seg boundary limit (Nicolin Chen)
  - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
  - various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw
 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD
 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey
 FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA
 gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76
 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF
 vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI
 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI
 D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B
 gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi
 JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg
 7fM=
 =Bkf/
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of <linux/dma-mapping.h>

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
2020-10-15 14:43:29 -07:00
Linus Torvalds
531d29b0b6 IOMMU Updates for Linux v5.10
Including:
 
 	- ARM-SMMU Updates from Will:
 
 	  - Continued SVM enablement, where page-table is shared with
 	    CPU
 
 	  - Groundwork to support integrated SMMU with Adreno GPU
 
 	  - Allow disabling of MSI-based polling on the kernel
 	    command-line
 
 	  - Minor driver fixes and cleanups (octal permissions, error
 	    messages, ...)
 
 	- Secure Nested Paging Support for AMD IOMMU. The IOMMU will
 	  fault when a device tries DMA on memory owned by a guest. This
 	  needs new fault-types as well as a rewrite of the IOMMU memory
 	  semaphore for command completions.
 
 	- Allow broken Intel IOMMUs (wrong address widths reported) to
 	  still be used for interrupt remapping.
 
 	- IOMMU UAPI updates for supporting vSVA, where the IOMMU can
 	  access address spaces of processes running in a VM.
 
 	- Support for the MT8167 IOMMU in the Mediatek IOMMU driver.
 
 	- Device-tree updates for the Renesas driver to support r8a7742.
 
 	- Several smaller fixes and cleanups all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl+Fy9MACgkQK/BELZcB
 GuNxtRAA0TdYHXt6XyLWmvRAX/ySZSz6KOneZWWwpsQ9wh2/iv1PtBsrV0ltf+6g
 CaX4ROZUVRbV9wPD+7maBRbzxrG3QhfEaaV+K45Q2J/QE1wjkyV8qj1eORWTUUoc
 nis4FhGDKk2ER/Gsajy2Hjs4+6i43gdWG/+ghVGaCRo8mCZyoz1/6AyMQyN3deuO
 NqWOv9E7hsavZjRs/w/LXG7eSE20cZwtt//kPVJF0r9eQqC6i1eJDQj48iRqJVqd
 R0dwBQZaLz++qQptyKebDNlmH/3aAsb+A8nCeS7ZwHqWC1QujTWOUYWpFyPPbOmC
 KVsQXzTzRfnVTDECF1Pk5d3yi45KILLU3B4zDJfUJjbL3KDYjuVUvhHF/pcGcjC3
 H1LWJqHSAL8sJwHvKhpi0VtQ5SOxXnLO5fGG/CZT/Xb4QyM+mkwkFLdn1TryZTR/
 M4XA+QuI96TzY7HQUJdSoEDANxoBef6gPnxdDKOnK1v4hfNsPAl7o8hZkM3w0DK8
 GoFZUV+vjBhFcymGcQegSNiea28Hfi+hBe+PPHCmw+tJm47cketD5uP5jJ5NGaUe
 MKU/QXWXc6oqeBTQT6ki5zJbJXKttbPa8eEmp+FrMatc9kruvBVhQoMbj7Vd3CA1
 dC4zK9Awy7yj24ZhZfnAFx2DboCmBTUI3QKjDt9K5PRZyMeyoP8=
 =C0Sg
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - ARM-SMMU Updates from Will:

      - Continued SVM enablement, where page-table is shared with CPU

      - Groundwork to support integrated SMMU with Adreno GPU

      - Allow disabling of MSI-based polling on the kernel command-line

      - Minor driver fixes and cleanups (octal permissions, error
        messages, ...)

 - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when
   a device tries DMA on memory owned by a guest. This needs new
   fault-types as well as a rewrite of the IOMMU memory semaphore for
   command completions.

 - Allow broken Intel IOMMUs (wrong address widths reported) to still be
   used for interrupt remapping.

 - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access
   address spaces of processes running in a VM.

 - Support for the MT8167 IOMMU in the Mediatek IOMMU driver.

 - Device-tree updates for the Renesas driver to support r8a7742.

 - Several smaller fixes and cleanups all over the place.

* tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits)
  iommu/vt-d: Gracefully handle DMAR units with no supported address widths
  iommu/vt-d: Check UAPI data processed by IOMMU core
  iommu/uapi: Handle data and argsz filled by users
  iommu/uapi: Rename uapi functions
  iommu/uapi: Use named union for user data
  iommu/uapi: Add argsz for user filled data
  docs: IOMMU user API
  iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
  iommu/arm-smmu-v3: Add SVA device feature
  iommu/arm-smmu-v3: Check for SVA features
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Move definitions to a header
  iommu/io-pgtable-arm: Move some definitions to a header
  iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer
  iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB
  iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR
  iommu/amd: Use 4K page for completion wait write-back semaphore
  iommu/tegra-smmu: Allow to group clients in same swgroup
  iommu/tegra-smmu: Fix iova->phys translation
  ...
2020-10-14 12:08:34 -07:00
Linus Torvalds
cf1d2b44f6 ACPI updates for 5.10-rc1
- Add support for generic initiator-only proximity domains to
    the ACPI NUMA code and the architectures using it (Jonathan
    Cameron).
 
  - Clean up some non-ACPICA code referring to debug facilities from
    ACPICA that are not actually used in there (Hanjun Guo).
 
  - Add new DPTF driver for the PCH FIVR participant (Srinivas
    Pandruvada).
 
  - Reduce overhead related to accessing GPE registers in ACPICA and
    the OS interface layer and make it possible to access GPE registers
    using logical addresses if they are memory-mapped (Rafael Wysocki).
 
  - Update the ACPICA code in the kernel to upstream revision 20200925
    including changes as follows:
    * Add predefined names from the SMBus sepcification (Bob Moore).
    * Update acpi_help UUID list (Bob Moore).
    * Return exceptions for string-to-integer conversions in iASL (Bob
      Moore).
    * Add a new "ALL <NameSeg>" debugger command (Bob Moore).
    * Add support for 64 bit risc-v compilation (Colin Ian King).
    * Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap).
 
  - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex
    Hung).
 
  - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out
    Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko).
 
  - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo).
 
  - Add missing config_item_put() to fix refcount leak (Hanjun Guo).
 
  - Drop lefrover field from struct acpi_memory_device (Hanjun Guo).
 
  - Make the ACPI extlog driver check for RDMSR failures (Ben
    Hutchings).
 
  - Fix handling of lid state changes in the ACPI button driver when
    input device is closed (Dmitry Torokhov).
 
  - Fix several assorted build issues (Barnabás Pőcze, John Garry,
    Nathan Chancellor, Tian Tao).
 
  - Drop unused inline functions and reduce code duplication by using
    kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing).
 
  - Serialize tools/power/acpi Makefile (Thomas Renninger).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl+F4IkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx1gIQAIZrt09fquEIZhYulGZAkuYhSX2U/DZt
 poow5+TiGk36JNHlbZS19kZ3F0tJ1wA6CKSfF/bYyULxL+gYaUjdLXzv2kArTSAj
 nzDXQ2CystpySZI/sEkl4QjsMg0xuZlBhlnCfNHzJw049TgdsJHnxMkJXb8T90A+
 l2JKm2OpBkNvQGNpwd3djLg8xSDnHUmuevsWZPHDp92/fLMF9DUBk8dVuEwa0ndF
 hAUpWm+EL1tJQnhNwtfV/Akd9Ypqgk/7ROFWFHGDtHMZGnBjpyXZw68vHMX7SL6N
 Ej90GWGPHSJs/7Fsg4Hiaxxcph9WFNLPcpck5lVAMIrNHMKANjqQzCsmHavV/WTG
 STC9/qwJauA1EOjovlmlCFHctjKE/ya6Hm299WTlfBqB+Lu1L3oMR2CC+Uj0YfyG
 sv3264rJCsaSw610iwQOG807qHENopASO2q5DuKG0E9JpcaBUwn1N4qP5svvQciq
 4aA8Ma6xM/QHCO4CS0Se9C0+WSVtxWwOUichRqQmU4E6u1sXvKJxTeWo79rV7PAh
 L6BwoOxBLabEiyzpi6HPGs6DoKj/N6tOQenBh4ibdwpAwMtq7hIlBFa0bp19c2wT
 vx8F2Raa8vbQ2zZ1QEiPZnPLJUoy2DgaCtKJ6E0FTDXNs3VFlWgyhIUlIRqk5BS9
 OnAwVAUrTMkJ
 =feLU
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These add support for generic initiator-only proximity domains to the
  ACPI NUMA code and the architectures using it, clean up some
  non-ACPICA code referring to debug facilities from ACPICA, reduce the
  overhead related to accessing GPE registers, add a new DPTF (Dynamic
  Power and Thermal Framework) participant driver, update the ACPICA
  code in the kernel to upstream revision 20200925, add a new ACPI
  backlight whitelist entry, fix a few assorted issues and clean up some
  code.

  Specifics:

   - Add support for generic initiator-only proximity domains to the
     ACPI NUMA code and the architectures using it (Jonathan Cameron)

   - Clean up some non-ACPICA code referring to debug facilities from
     ACPICA that are not actually used in there (Hanjun Guo)

   - Add new DPTF driver for the PCH FIVR participant (Srinivas
     Pandruvada)

   - Reduce overhead related to accessing GPE registers in ACPICA and
     the OS interface layer and make it possible to access GPE registers
     using logical addresses if they are memory-mapped (Rafael Wysocki)

   - Update the ACPICA code in the kernel to upstream revision 20200925
     including changes as follows:
      + Add predefined names from the SMBus sepcification (Bob Moore)
      + Update acpi_help UUID list (Bob Moore)
      + Return exceptions for string-to-integer conversions in iASL (Bob
        Moore)
      + Add a new "ALL <NameSeg>" debugger command (Bob Moore)
      + Add support for 64 bit risc-v compilation (Colin Ian King)
      + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap)

   - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex
     Hung)

   - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out
     Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko)

   - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo)

   - Add missing config_item_put() to fix refcount leak (Hanjun Guo)

   - Drop lefrover field from struct acpi_memory_device (Hanjun Guo)

   - Make the ACPI extlog driver check for RDMSR failures (Ben
     Hutchings)

   - Fix handling of lid state changes in the ACPI button driver when
     input device is closed (Dmitry Torokhov)

   - Fix several assorted build issues (Barnabás Pőcze, John Garry,
     Nathan Chancellor, Tian Tao)

   - Drop unused inline functions and reduce code duplication by using
     kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing)

   - Serialize tools/power/acpi Makefile (Thomas Renninger)"

* tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  ACPICA: Update version to 20200925 Version 20200925
  ACPICA: Remove unnecessary semicolon
  ACPICA: Debugger: Add a new command: "ALL <NameSeg>"
  ACPICA: iASL: Return exceptions for string-to-integer conversions
  ACPICA: acpi_help: Update UUID list
  ACPICA: Add predefined names found in the SMBus sepcification
  ACPICA: Tree-wide: fix various typos and spelling mistakes
  ACPICA: Drop the repeated word "an" in a comment
  ACPICA: Add support for 64 bit risc-v compilation
  ACPI: button: fix handling lid state changes when input device closed
  tools/power/acpi: Serialize Makefile
  ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug()
  ACPI: memhotplug: Remove 'state' from struct acpi_memory_device
  ACPI / extlog: Check for RDMSR failure
  ACPI: Make acpi_evaluate_dsm() prototype consistent
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ...
2020-10-14 11:42:04 -07:00
Rafael J. Wysocki
e4174ff78b Merge branch 'acpi-numa'
* acpi-numa:
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ACPI: Support Generic Initiator only domains
  ACPI / NUMA: Add stub function for pxm_to_node()
  irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory
  ACPI: Remove side effect of partly creating a node in acpi_get_node()
  ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node()
  ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node()
  ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
  ACPI: Add out of bounds and numa_off protections to pxm_to_node()
2020-10-13 14:44:50 +02:00
Linus Torvalds
cc7343724e Surgery of the MSI interrupt handling to prepare the support of upcoming
devices which require non-PCI based MSI handling.
 
   - Cleanup historical leftovers all over the place
 
   - Rework the code to utilize more core functionality
 
   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.
 
   - Assign irqdomains to PCI devices at initialization time which allows
     to utilize the full functionality of hierarchical irqdomains.
 
   - Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain
     which is assigned to the device for interrupt management.
 
   - Make the arch_.*_msi_irq() support conditional on a config switch and
     let the last few users select it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EUxcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoagLEACGp5U7a4mk24GsOZJDhrua1PHR/fhb
 enn/5yOPpxDXdYmtFHIjV5qrNjDTV/WqDlI96KOi+oinG1Eoj0O/MA1AcSRhp6nf
 jVdAuK1X0DHDUTEeTAP0JFwqd2j0KlIOphBrIMgeWIf1CRKlYiJaO+ioF9fKgwZ/
 /HigOTSykGYMPggm3JXnWTWtJkKSGFxeADBvVHt5RpVmbWtrI4YoSBxKEMtvjyeM
 5+GsqbCad1CnFYTN74N+QWVGmgGnUWGEzWsPYnJ9hW+yyjad1kWx3n6NcCWhssaC
 E4vAXl6JuCPntL7jBFkbfUkQsgq12ThMZYWpCq8pShJA9O2tDKkxIGasHWrIt4cz
 nYrESiv6hM7edjtOvBc086Gd0A2EyGOM879goHyaNVaTO4rI6jfZG7PlW1HHWibS
 mf/bdTXBtULGNgEt7T8Qnb8sZ+D01WqzLrq/wm645jIrTzvNHUEpOhT1aH/g4TFQ
 cNHD5PcM9OTmiBir9srNd47+1s2mpfwdMYHKBt2QgiXMO8fRgdtr6WLQE4vJjmG8
 sA0yGGsgdTKeg2wW1ERF1pWL0Lt05Iaa42Skm0D3BwcOG2n5ltkBHzVllto9cTUh
 kIldAOgxGE6QeCnnlrnbHz5mvzt/3Ih/PIKqPSUAC94Kx1yvVHRYuOvDExeO8DFB
 P+f0TkrscZObSg==
 =JlqV
 -----END PGP SIGNATURE-----

Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq updates from Thomas Gleixner:
 "Surgery of the MSI interrupt handling to prepare the support of
  upcoming devices which require non-PCI based MSI handling:

   - Cleanup historical leftovers all over the place

   - Rework the code to utilize more core functionality

   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.

   - Assign irqdomains to PCI devices at initialization time which
     allows to utilize the full functionality of hierarchical
     irqdomains.

   - Remove arch_.*_msi_irq() functions from X86 and utilize the
     irqdomain which is assigned to the device for interrupt management.

   - Make the arch_.*_msi_irq() support conditional on a config switch
     and let the last few users select it"

* tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
  x86/apic/msi: Unbreak DMAR and HPET MSI
  iommu/amd: Remove domain search for PCI/MSI
  iommu/vt-d: Remove domain search for PCI/MSI[X]
  x86/irq: Make most MSI ops XEN private
  x86/irq: Cleanup the arch_*_msi_irqs() leftovers
  PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
  x86/pci: Set default irq domain in pcibios_add_device()
  iommm/amd: Store irq domain in struct device
  iommm/vt-d: Store irq domain in struct device
  x86/xen: Wrap XEN MSI management into irqdomain
  irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
  x86/xen: Consolidate XEN-MSI init
  x86/xen: Rework MSI teardown
  x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
  PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI
  irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
  x86/irq: Initialize PCI/MSI domain at PCI init time
  x86/pci: Reducde #ifdeffery in PCI init code
  ...
2020-10-12 11:40:41 -07:00
Linus Torvalds
ac74075e5d Initial support for sharing virtual addresses between the CPU and
devices which doesn't need pinning of pages for DMA anymore. Add support
 for the command submission to devices using new x86 instructions like
 ENQCMD{,S} and MOVDIR64B. In addition, add support for process address
 space identifiers (PASIDs) which are referenced by those command
 submission instructions along with the handling of the PASID state on
 context switch as another extended state. Work by Fenghua Yu, Ashok Raj,
 Yu-cheng Yu and Dave Jiang.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl996DIACgkQEsHwGGHe
 VUqM4A/+JDI3GxNyMyBpJR0nQ2vs23ru1o3OxvxhYtcacZ0cNwkaO7g3TLQxH+LZ
 k1QtvEd4jqI6BXV4de+HdZFDcqzikJf0KHnUflLTx956/Eop5rtxzMWVo69ZmYs8
 QrW0mLhyh8eq19cOHbQBb4M/HFc1DXBw+l7Ft3MeA1divOVESRB/uNxjA25K4PvV
 y+pipyUxqKSNhmBFf2bV8OVZloJiEtg3H6XudP0g/rZgjYe3qWxa+2iv6D08yBNe
 g7NpMDMql2uo1bcFON7se2oF34poAi49BfiIQb5G4m9pnPyvVEMOCijxCx2FHYyF
 nukyxt8g3Uq+UJYoolLNoWijL1jgBWeTBg1uuwsQOqWSARJx8nr859z0GfGyk2RP
 GNoYE4rrWBUMEqWk4xeiPPgRDzY0cgcGh0AeuWqNhgBfbbZeGL0t0m5kfytk5i1s
 W0YfRbz+T8+iYbgVfE/Zpthc7rH7iLL7/m34JC13+pzhPVTT32ECLJov2Ac8Tt15
 X+fOe6kmlDZa4GIhKRzUoR2aEyLpjufZ+ug50hznBQjGrQfcx7zFqRAU4sJx0Yyz
 rxUOJNZZlyJpkyXzc12xUvShaZvTcYenHGpxXl8TU3iMbY2otxk1Xdza8pc1LGQ/
 qneYgILgKa+hSBzKhXCPAAgSYtPlvQrRizArS8Y0k/9rYaKCfBU=
 =K9X4
 -----END PGP SIGNATURE-----

Merge tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 PASID updates from Borislav Petkov:
 "Initial support for sharing virtual addresses between the CPU and
  devices which doesn't need pinning of pages for DMA anymore.

  Add support for the command submission to devices using new x86
  instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support
  for process address space identifiers (PASIDs) which are referenced by
  those command submission instructions along with the handling of the
  PASID state on context switch as another extended state.

  Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang"

* tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction
  x86/asm: Carve out a generic movdir64b() helper for general usage
  x86/mmu: Allocate/free a PASID
  x86/cpufeatures: Mark ENQCMD as disabled when configured out
  mm: Add a pasid member to struct mm_struct
  x86/msr-index: Define an IA32_PASID MSR
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)
  iommu/vt-d: Change flags type to unsigned int in binding mm
  drm, iommu: Change type of pasid to u32
2020-10-12 10:40:34 -07:00
Joerg Roedel
7e3c3883c3 Merge branches 'arm/allwinner', 'arm/mediatek', 'arm/renesas', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/amd', 'x86/vt-d' and 'core' into next 2020-10-07 11:51:59 +02:00
David Woodhouse
c40aaaac10 iommu/vt-d: Gracefully handle DMAR units with no supported address widths
Instead of bailing out completely, such a unit can still be used for
interrupt remapping.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/549928db2de6532117f36c9c810373c14cf76f51.camel@infradead.org/
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-07 11:49:54 +02:00
Christoph Hellwig
0b1abd1fb7 dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:04 +02:00
Christoph Hellwig
0a0f0d8be7 dma-mapping: split <linux/dma-mapping.h>
Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers.  That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:03 +02:00
Lu Baolu
1a3f2fd7fc iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()
Lock(&iommu->lock) without disabling irq causes lockdep warnings.

[   12.703950] ========================================================
[   12.703962] WARNING: possible irq lock inversion dependency detected
[   12.703975] 5.9.0-rc6+ #659 Not tainted
[   12.703983] --------------------------------------------------------
[   12.703995] systemd-udevd/284 just changed the state of lock:
[   12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at:
               iommu_flush_dev_iotlb.part.57+0x2e/0x90
[   12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   12.704043]  (&iommu->lock){+.+.}-{2:2}
[   12.704045]

               and interrupts could create inverse lock ordering between
               them.

[   12.704073]
               other info that might help us debug this:
[   12.704085]  Possible interrupt unsafe locking scenario:

[   12.704097]        CPU0                    CPU1
[   12.704106]        ----                    ----
[   12.704115]   lock(&iommu->lock);
[   12.704123]                                local_irq_disable();
[   12.704134]                                lock(device_domain_lock);
[   12.704146]                                lock(&iommu->lock);
[   12.704158]   <Interrupt>
[   12.704164]     lock(device_domain_lock);
[   12.704174]
                *** DEADLOCK ***

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200927062428.13713-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:54:17 +02:00
Jacob Pan
6278eecba3 iommu/vt-d: Check UAPI data processed by IOMMU core
IOMMU generic layer already does sanity checks on UAPI data for version
match and argsz range based on generic information.

This patch adjusts the following data checking responsibilities:
- removes the redundant version check from VT-d driver
- removes the check for vendor specific data size
- adds check for the use of reserved/undefined flags

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/1601051567-54787-7-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:52:46 +02:00
Jacob Pan
8d3bb3b8cb iommu/uapi: Use named union for user data
IOMMU UAPI data size is filled by the user space which must be validated
by the kernel. To ensure backward compatibility, user data can only be
extended by either re-purpose padding bytes or extend the variable sized
union at the end. No size change is allowed before the union. Therefore,
the minimum size is the offset of the union.

To use offsetof() on the union, we must make it named.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/linux-iommu/20200611145518.0c2817d6@x1.home/
Link: https://lore.kernel.org/r/1601051567-54787-4-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:52:46 +02:00
Christoph Hellwig
efa70f2fdc dma-mapping: add a new dma_alloc_pages API
This API is the equivalent of alloc_pages, except that the returned memory
is guaranteed to be DMA addressable by the passed in device.  The
implementation will also be used to provide a more sensible replacement
for DMA_ATTR_NON_CONSISTENT flag.

Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
as its backend.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-25 06:20:47 +02:00
Christoph Hellwig
8c1c6c7588 Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next
Pull in the latest 5.9 tree for the commit to revert the
V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-25 06:19:19 +02:00
Jonathan Cameron
01feba590c ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
Several ACPI static tables contain references to proximity domains.

ACPI 6.3 has clarified that only entries in SRAT may define a new
domain (sec 5.2.16).

Those tables described in the ACPI spec have additional clarifying text.

NFIT: Table 5-132,

"Integer that represents the proximity domain to which the memory
 belongs. This number must match with corresponding entry in the
 SRAT table."

HMAT: Table 5-145,

"... This number must match with the corresponding entry in the SRAT
 table's processor affinity structure ... if the initiator is a processor,
 or the Generic Initiator Affinity Structure if the initiator is a generic
 initiator".

IORT and DMAR are defined by external specifications.

Intel Virtualization Technology for Directed I/O Rev 3.1 does not make any
explicit statements, but the general SRAT statement above will still apply.
https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf

IO Remapping Table, Platform Design Document rev D, also makes not explicit
statement, but refers to ACPI SRAT table for more information and again the
generic SRAT statement above applies.
https://developer.arm.com/documentation/den0049/d/

In conclusion, any proximity domain specified in these tables, should be a
reference to a proximity domain also found in SRAT, and they should not be
able to instantiate a new domain.  Hence we switch to pxm_to_node() which
will only return existing nodes.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-24 12:57:37 +02:00
Lu Baolu
d2ef096249 iommu/vt-d: Use device numa domain if RHSA is missing
If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
table, we could default to the device NUMA domain as fall back.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200904010303.2961-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20200922060843.31546-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-24 10:59:55 +02:00