mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
synced 2025-09-01 15:14:52 +00:00

Certain operations on memory, such as memory repair, are permitted only when the address and other attributes for the operation are from the current boot. This is determined by checking whether the memory attributes for the operation match those in the CXL gen_media or CXL DRAM memory event records reported during the current boot. The CXL event records must be backed up because they are cleared in the hardware after being processed by the kernel. Support is added for storing CXL gen_media or CXL DRAM memory event records in xarrays. Old records are deleted when they expire or when there is an overflow and which depends on platform correctly report Event Record Timestamp field of CXL spec Table 8-55 Common Event Record Format. Additionally, helper functions are implemented to find a matching record in the xarray storage based on the memory attributes and repair type. Add validity check, when matching attributes for sparing, using the validity flag in the DRAM event record, to ensure that all required attributes for a requested repair operation are valid and set. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250521124749.817-7-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
237 lines
8.4 KiB
Plaintext
237 lines
8.4 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
|
|
menuconfig CXL_BUS
|
|
tristate "CXL (Compute Express Link) Devices Support"
|
|
depends on PCI
|
|
select FW_LOADER
|
|
select FW_UPLOAD
|
|
select PCI_DOE
|
|
select FIRMWARE_TABLE
|
|
select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS
|
|
select FWCTL if CXL_FEATURES
|
|
help
|
|
CXL is a bus that is electrically compatible with PCI Express, but
|
|
layers three protocols on that signalling (CXL.io, CXL.cache, and
|
|
CXL.mem). The CXL.cache protocol allows devices to hold cachelines
|
|
locally, the CXL.mem protocol allows devices to be fully coherent
|
|
memory targets, the CXL.io protocol is equivalent to PCI Express.
|
|
Say 'y' to enable support for the configuration and management of
|
|
devices supporting these protocols.
|
|
|
|
if CXL_BUS
|
|
|
|
config CXL_PCI
|
|
tristate "PCI manageability"
|
|
default CXL_BUS
|
|
help
|
|
The CXL specification defines a "CXL memory device" sub-class in the
|
|
PCI "memory controller" base class of devices. Device's identified by
|
|
this class code provide support for volatile and / or persistent
|
|
memory to be mapped into the system address map (Host-managed Device
|
|
Memory (HDM)).
|
|
|
|
Say 'y/m' to enable a driver that will attach to CXL memory expander
|
|
devices enumerated by the memory device class code for configuration
|
|
and management primarily via the mailbox interface. See Chapter 2.3
|
|
Type 3 CXL Device in the CXL 2.0 specification for more details.
|
|
|
|
If unsure say 'm'.
|
|
|
|
config CXL_MEM_RAW_COMMANDS
|
|
bool "RAW Command Interface for Memory Devices"
|
|
depends on CXL_PCI
|
|
help
|
|
Enable CXL RAW command interface.
|
|
|
|
The CXL driver ioctl interface may assign a kernel ioctl command
|
|
number for each specification defined opcode. At any given point in
|
|
time the number of opcodes that the specification defines and a device
|
|
may implement may exceed the kernel's set of associated ioctl function
|
|
numbers. The mismatch is either by omission, specification is too new,
|
|
or by design. When prototyping new hardware, or developing / debugging
|
|
the driver it is useful to be able to submit any possible command to
|
|
the hardware, even commands that may crash the kernel due to their
|
|
potential impact to memory currently in use by the kernel.
|
|
|
|
If developing CXL hardware or the driver say Y, otherwise say N.
|
|
|
|
config CXL_ACPI
|
|
tristate "CXL ACPI: Platform Support"
|
|
depends on ACPI
|
|
depends on ACPI_NUMA
|
|
default CXL_BUS
|
|
select ACPI_TABLE_LIB
|
|
select ACPI_HMAT
|
|
select CXL_PORT
|
|
help
|
|
Enable support for host managed device memory (HDM) resources
|
|
published by a platform's ACPI CXL memory layout description. See
|
|
Chapter 9.14.1 CXL Early Discovery Table (CEDT) in the CXL 2.0
|
|
specification, and CXL Fixed Memory Window Structures (CEDT.CFMWS)
|
|
(https://www.computeexpresslink.org/spec-landing). The CXL core
|
|
consumes these resource to publish the root of a cxl_port decode
|
|
hierarchy to map regions that represent System RAM, or Persistent
|
|
Memory regions to be managed by LIBNVDIMM.
|
|
|
|
If unsure say 'm'.
|
|
|
|
config CXL_PMEM
|
|
tristate "CXL PMEM: Persistent Memory Support"
|
|
depends on LIBNVDIMM
|
|
default CXL_BUS
|
|
help
|
|
In addition to typical memory resources a platform may also advertise
|
|
support for persistent memory attached via CXL. This support is
|
|
managed via a bridge driver from CXL to the LIBNVDIMM system
|
|
subsystem. Say 'y/m' to enable support for enumerating and
|
|
provisioning the persistent memory capacity of CXL memory expanders.
|
|
|
|
If unsure say 'm'.
|
|
|
|
config CXL_MEM
|
|
tristate "CXL: Memory Expansion"
|
|
depends on CXL_PCI
|
|
default CXL_BUS
|
|
help
|
|
The CXL.mem protocol allows a device to act as a provider of "System
|
|
RAM" and/or "Persistent Memory" that is fully coherent as if the
|
|
memory were attached to the typical CPU memory controller. This is
|
|
known as HDM "Host-managed Device Memory".
|
|
|
|
Say 'y/m' to enable a driver that will attach to CXL.mem devices for
|
|
memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
|
|
specification for a detailed description of HDM.
|
|
|
|
If unsure say 'm'.
|
|
|
|
config CXL_FEATURES
|
|
bool "CXL: Features"
|
|
depends on CXL_PCI
|
|
help
|
|
Enable support for CXL Features. A CXL device that includes a mailbox
|
|
supports commands that allows listing, getting, and setting of
|
|
optionally defined features such as memory sparing or post package
|
|
sparing. Vendors may define custom features for the device.
|
|
|
|
If unsure say 'n'
|
|
|
|
config CXL_EDAC_MEM_FEATURES
|
|
bool "CXL: EDAC Memory Features"
|
|
depends on EXPERT
|
|
depends on CXL_MEM
|
|
depends on CXL_FEATURES
|
|
depends on EDAC >= CXL_BUS
|
|
help
|
|
The CXL EDAC memory feature is optional and allows host to
|
|
control the EDAC memory features configurations of CXL memory
|
|
expander devices.
|
|
|
|
Say 'y' if you have an expert need to change default settings
|
|
of a memory RAS feature established by the platform/device.
|
|
Otherwise say 'n'.
|
|
|
|
config CXL_EDAC_SCRUB
|
|
bool "Enable CXL Patrol Scrub Control (Patrol Read)"
|
|
depends on CXL_EDAC_MEM_FEATURES
|
|
depends on EDAC_SCRUB
|
|
help
|
|
The CXL EDAC scrub control is optional and allows host to
|
|
control the scrub feature configurations of CXL memory expander
|
|
devices.
|
|
|
|
When enabled 'cxl_mem' and 'cxl_region' EDAC devices are
|
|
published with memory scrub control attributes as described by
|
|
Documentation/ABI/testing/sysfs-edac-scrub.
|
|
|
|
Say 'y' if you have an expert need to change default settings
|
|
of a memory scrub feature established by the platform/device
|
|
(e.g. scrub rates for the patrol scrub feature).
|
|
Otherwise say 'n'.
|
|
|
|
config CXL_EDAC_ECS
|
|
bool "Enable CXL Error Check Scrub (Repair)"
|
|
depends on CXL_EDAC_MEM_FEATURES
|
|
depends on EDAC_ECS
|
|
help
|
|
The CXL EDAC ECS control is optional and allows host to
|
|
control the ECS feature configurations of CXL memory expander
|
|
devices.
|
|
|
|
When enabled 'cxl_mem' EDAC devices are published with memory
|
|
ECS control attributes as described by
|
|
Documentation/ABI/testing/sysfs-edac-ecs.
|
|
|
|
Say 'y' if you have an expert need to change default settings
|
|
of a memory ECS feature established by the platform/device.
|
|
Otherwise say 'n'.
|
|
|
|
config CXL_EDAC_MEM_REPAIR
|
|
bool "Enable CXL Memory Repair"
|
|
depends on CXL_EDAC_MEM_FEATURES
|
|
depends on EDAC_MEM_REPAIR
|
|
help
|
|
The CXL EDAC memory repair control is optional and allows host
|
|
to control the memory repair features (e.g. sparing, PPR)
|
|
configurations of CXL memory expander devices.
|
|
|
|
When enabled, the memory repair feature requires an additional
|
|
memory of approximately 43KB to store CXL DRAM and CXL general
|
|
media event records.
|
|
|
|
When enabled 'cxl_mem' EDAC devices are published with memory
|
|
repair control attributes as described by
|
|
Documentation/ABI/testing/sysfs-edac-memory-repair.
|
|
|
|
Say 'y' if you have an expert need to change default settings
|
|
of a memory repair feature established by the platform/device.
|
|
Otherwise say 'n'.
|
|
|
|
config CXL_PORT
|
|
default CXL_BUS
|
|
tristate
|
|
|
|
config CXL_SUSPEND
|
|
def_bool y
|
|
depends on SUSPEND && CXL_MEM
|
|
|
|
config CXL_REGION
|
|
bool "CXL: Region Support"
|
|
default CXL_BUS
|
|
# For MAX_PHYSMEM_BITS
|
|
depends on SPARSEMEM
|
|
select MEMREGION
|
|
select GET_FREE_REGION
|
|
help
|
|
Enable the CXL core to enumerate and provision CXL regions. A CXL
|
|
region is defined by one or more CXL expanders that decode a given
|
|
system-physical address range. For CXL regions established by
|
|
platform-firmware this option enables memory error handling to
|
|
identify the devices participating in a given interleaved memory
|
|
range. Otherwise, platform-firmware managed CXL is enabled by being
|
|
placed in the system address map and does not need a driver.
|
|
|
|
If unsure say 'y'
|
|
|
|
config CXL_REGION_INVALIDATION_TEST
|
|
bool "CXL: Region Cache Management Bypass (TEST)"
|
|
depends on CXL_REGION
|
|
help
|
|
CXL Region management and security operations potentially invalidate
|
|
the content of CPU caches without notifying those caches to
|
|
invalidate the affected cachelines. The CXL Region driver attempts
|
|
to invalidate caches when those events occur. If that invalidation
|
|
fails the region will fail to enable. Reasons for cache
|
|
invalidation failure are due to the CPU not providing a cache
|
|
invalidation mechanism. For example usage of wbinvd is restricted to
|
|
bare metal x86. However, for testing purposes toggling this option
|
|
can disable that data integrity safety and proceed with enabling
|
|
regions when there might be conflicting contents in the CPU cache.
|
|
|
|
If unsure, or if this kernel is meant for production environments,
|
|
say N.
|
|
|
|
config CXL_MCE
|
|
def_bool y
|
|
depends on X86_MCE && MEMORY_FAILURE
|
|
|
|
endif
|