mirror_ubuntu-kernels/arch/x86/kernel/cpu
Reinette Chatre 7b72c823dd x86/sgx: Reduce delay and interference of enclave release
commit 8795359e35 ("x86/sgx: Silence softlockup detection when
releasing large enclaves") introduced a cond_resched() during enclave
release where the EREMOVE instruction is applied to every 4k enclave
page. Giving other tasks an opportunity to run while tearing down a
large enclave placates the soft lockup detector but Iqbal found
that the fix causes a 25% performance degradation of a workload
run using Gramine.

Gramine maintains a 1:1 mapping between processes and SGX enclaves.
That means if a workload in an enclave creates a subprocess then
Gramine creates a duplicate enclave for that subprocess to run in.
The consequence is that the release of the enclave used to run
the subprocess can impact the performance of the workload that is
run in the original enclave, especially in large enclaves when
SGX2 is not in use.

The workload run by Iqbal behaves as follows:
Create enclave (enclave "A")
/* Initialize workload in enclave "A" */
Create enclave (enclave "B")
/* Run subprocess in enclave "B" and send result to enclave "A" */
Release enclave (enclave "B")
/* Run workload in enclave "A" */
Release enclave (enclave "A")

The performance impact of releasing enclave "B" in the above scenario
is amplified when there is a lot of SGX memory and the enclave size
matches the SGX memory. When there is 128GB SGX memory and an enclave
size of 128GB, from the time enclave "B" starts the 128GB SGX memory
is oversubscribed with a combined demand for 256GB from the two
enclaves.

Before commit 8795359e35 ("x86/sgx: Silence softlockup detection when
releasing large enclaves") enclave release was done in a tight loop
without giving other tasks a chance to run. Even though the system
experienced soft lockups the workload (run in enclave "A") obtained
good performance numbers because when the workload started running
there was no interference.

Commit 8795359e35 ("x86/sgx: Silence softlockup detection when
releasing large enclaves") gave other tasks opportunity to run while an
enclave is released. The impact of this in this scenario is that while
enclave "B" is released and needing to access each page that belongs
to it in order to run the SGX EREMOVE instruction on it, enclave "A"
is attempting to run the workload needing to access the enclave
pages that belong to it. This causes a lot of swapping due to the
demand for the oversubscribed SGX memory. Longer latencies are
experienced by the workload in enclave "A" while enclave "B" is
released.

Improve the performance of enclave release while still avoiding the
soft lockup detector with two enhancements:
- Only call cond_resched() after XA_CHECK_SCHED iterations.
- Use the xarray advanced API to keep the xarray locked for
  XA_CHECK_SCHED iterations instead of locking and unlocking
  at every iteration.

This batching solution is copied from sgx_encl_may_map() that
also iterates through all enclave pages using this technique.

With this enhancement the workload experiences a 5%
performance degradation when compared to a kernel without
commit 8795359e35 ("x86/sgx: Silence softlockup detection when
releasing large enclaves"), an improvement to the reported 25%
degradation, while still placating the soft lockup detector.

Scenarios with poor performance are still possible even with these
enhancements. For example, short workloads creating sub processes
while running in large enclaves. Further performance improvements
are pursued in user space through avoiding to create duplicate enclaves
for certain sub processes, and using SGX2 that will do lazy allocation
of pages as needed so enclaves created for sub processes start quickly
and release quickly.

Fixes: 8795359e35 ("x86/sgx: Silence softlockup detection when releasing large enclaves")
Reported-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com>
Link: https://lore.kernel.org/all/00efa80dd9e35dc85753e1c5edb0344ac07bb1f0.1667236485.git.reinette.chatre%40intel.com
2022-10-31 13:40:35 -07:00
..
mce x86/mce: Retrieve poison range from hardware 2022-08-29 09:33:42 +02:00
microcode x86/microcode/AMD: Apply the patch early on every logical thread 2022-10-18 11:03:27 +02:00
mtrr x86/mtrr: Replace deprecated CPU-hotplug functions. 2021-08-10 14:46:27 +02:00
resctrl x86/resctrl: Fix min_cbm_bits for AMD 2022-10-18 20:25:16 +02:00
sgx x86/sgx: Reduce delay and interference of enclave release 2022-10-31 13:40:35 -07:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
acrn.c x86/acrn: Set up timekeeping 2022-08-04 11:11:59 +02:00
amd.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
aperfmperf.c x86/aperfmperf: Integrate the fallback code from show_cpuinfo() 2022-04-27 20:22:20 +02:00
bugs.c x86/bugs: Add "unknown" reporting for MMIO Stale Data 2022-08-18 15:35:22 +02:00
cacheinfo.c x86/cacheinfo: move shared cache map definitions 2022-07-17 17:31:40 -07:00
centaur.c x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support 2020-09-11 10:53:19 +02:00
common.c x86/bugs: Add "unknown" reporting for MMIO Stale Data 2022-08-18 15:35:22 +02:00
cpu.h x86/cpu/amd: Add Spectral Chicken 2022-06-27 10:34:00 +02:00
cpuid-deps.c x86/fpu: Optimize out sigframe xfeatures when in init state 2021-11-03 22:42:35 +01:00
cyrix.c x86/cyrix: include header linux/isa-dma.h 2022-07-26 14:03:12 -05:00
feat_ctl.c x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype 2022-09-26 17:06:27 +02:00
hygon.c x86/cpu/amd: Add Spectral Chicken 2022-06-27 10:34:00 +02:00
hypervisor.c x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable 2019-07-17 08:09:59 +02:00
intel_epb.c x86: intel_epb: Allow model specific normal EPB value 2022-01-04 16:37:23 +01:00
intel_pconfig.c
intel.c x86/bus_lock: Don't assume the init value of DEBUGCTLMSR.BUS_LOCK_DETECT to be zero 2022-08-02 13:42:00 +02:00
Makefile x86: kmsan: disable instrumentation of unsupported code 2022-10-03 14:03:24 -07:00
match.c x86/cpu: Add a steppings field to struct x86_cpu_id 2020-04-20 12:19:21 +02:00
mkcapflags.sh x86/cpu: Print VMX flags in /proc/cpuinfo using VMX_FEATURES_* 2020-01-13 18:36:02 +01:00
mshyperv.c hyperv-next for 5.19 2022-05-28 11:39:01 -07:00
perfctr-watchdog.c x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs 2021-06-10 10:04:40 +02:00
powerflags.c
proc.c x86/aperfmperf: Integrate the fallback code from show_cpuinfo() 2022-04-27 20:22:20 +02:00
rdrand.c x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" 2022-07-18 15:04:04 +02:00
scattered.c x86/cpufeatures: Add LbrExtV2 feature bit 2022-08-27 00:05:42 +02:00
topology.c x86/topology: Fix duplicated core ID within a package 2022-10-17 11:58:52 -07:00
transmeta.c
tsx.c x86/tsx: Disable TSX development mode at boot 2022-04-11 09:58:40 +02:00
umc.c
umwait.c KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL 2020-06-22 20:54:57 -04:00
vmware.c x86/vmware: Use BIT() macro for shifting 2022-06-22 11:23:14 +02:00
vortex.c x86/CPU: Add support for Vortex CPUs 2021-10-21 15:49:07 +02:00
zhaoxin.c x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup 2020-06-15 14:18:37 +02:00