mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-08-15 16:23:10 +00:00

With KHO in place, let's add documentation that describes what it is and how to use it. Link: https://lkml.kernel.org/r/20250509074635.3187114-17-changyuanl@google.com Signed-off-by: Alexander Graf <graf@amazon.com> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Co-developed-by: Changyuan Lyu <changyuanl@google.com> Signed-off-by: Changyuan Lyu <changyuanl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Gowans <jgowans@amazon.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Rob Herring <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
.. SPDX-License-Identifier: GPL-2.0-or-later

.. _kho-concepts:

=======================
Kexec Handover Concepts
=======================

Kexec HandOver (KHO) is a mechanism that allows Linux to preserve memory
regions, which could contain serialized system states, across kexec.

It introduces multiple concepts:

KHO FDT
=======

Every KHO kexec carries a KHO-specific flattened device tree (FDT) blob
that describes preserved memory regions. These regions contain either
serialized subsystem states, or in-memory data that must not be touched
across kexec. After KHO, subsystems can retrieve and restore preserved
memory regions from the KHO FDT.

KHO only uses the FDT container format and the libfdt library, but does not
adhere to the same property semantics that normal device trees do: properties
are passed in native endianness and standardized properties like ``reg`` and
``ranges`` do not exist, hence there are no ``#...-cells`` properties.

KHO is still under development. The FDT schema is unstable and is expected to
change in the future.

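To illustrate the endianness point, here is a minimal sketch in plain user
space C. It does not use libfdt and is not kernel code; the helper names
``put_be64``, ``put_native64`` and ``get_native64`` are hypothetical. It
contrasts the big-endian byte order that standard device tree cells use with
the "store the bytes as the CPU sees them" approach described above.

```c
#include <stdint.h>
#include <string.h>

/* Standard device tree encoding: a 64-bit value stored big-endian. */
static void put_be64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* KHO-style encoding (assumed model): raw in-memory bytes, no swapping. */
static void put_native64(uint8_t out[8], uint64_t v)
{
	memcpy(out, &v, sizeof(v));
}

/* Reader side after kexec: copy the bytes back, again without swapping. */
static uint64_t get_native64(const uint8_t in[8])
{
	uint64_t v;

	memcpy(&v, in, sizeof(v));
	return v;
}
```

A value written with ``put_native64()`` reads back unchanged through
``get_native64()`` on the same architecture, which is the only case that
matters for kexec, since both kernels run on the same machine.
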
Scratch Regions
===============

To boot into kexec, we need to have a physically contiguous memory range that
contains no handed over memory. Kexec then places the target kernel and initrd
into that region. The new kernel exclusively uses this region for memory
allocations during boot, up to the initialization of the page allocator.

We guarantee that we always have such regions through the scratch regions: on
first boot KHO allocates several physically contiguous memory regions. Since
after kexec these regions will be used by early memory allocations, there is a
scratch region per NUMA node plus a scratch region to satisfy allocation
requests that do not require a particular NUMA node assignment.

By default, the size of the scratch regions is calculated based on the amount
of memory allocated during boot. The ``kho_scratch`` kernel command line
option may be used to explicitly define the size of the scratch regions.

The scratch regions are declared as CMA when the page allocator is initialized
so that their memory can be used during system lifetime. CMA gives us the
guarantee that no handover pages land in that region, because handover pages
must be at a static physical memory location and CMA enforces that only
movable pages can be located inside.

After KHO kexec, we ignore the ``kho_scratch`` kernel command line option and
instead reuse the exact same regions that were originally allocated. This
allows us to recursively execute any number of KHO kexecs. Because we used
these regions for boot memory allocations and as target memory for kexec
blobs, some parts of that memory may be reserved. These reservations are
irrelevant for the next KHO, because kexec can overwrite even the original
kernel.

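The invariant described above is that preserved (handed over) memory never
overlaps a scratch region. The following is a minimal user space sketch of
that check. The ``struct phys_range`` type and both helpers are hypothetical
and only model the idea; in the kernel the guarantee comes from CMA, not from
an explicit check like this one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one physical range: [start, start + size). */
struct phys_range {
	uint64_t start;
	uint64_t size;
};

/* Two half-open ranges overlap if each one starts before the other ends. */
static bool ranges_overlap(const struct phys_range *a,
			   const struct phys_range *b)
{
	return a->start < b->start + b->size &&
	       b->start < a->start + a->size;
}

/* Returns true if 'preserved' stays clear of every scratch region. */
static bool outside_scratch(const struct phys_range *scratch, size_t nr,
			    const struct phys_range *preserved)
{
	for (size_t i = 0; i < nr; i++)
		if (ranges_overlap(&scratch[i], preserved))
			return false;
	return true;
}
```

With scratch regions at ``[0x1000, 0x2000)`` and ``[0x4000, 0x6000)``, a
preserved range starting at ``0x2000`` with size ``0x1000`` passes the check,
while one starting at ``0x1800`` with the same size does not.
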
.. _kho-finalization-phase:

KHO finalization phase
======================

To enable a user space based kexec file loader, the kernel needs to be able to
provide the FDT that describes the current kernel's state before
performing the actual kexec. The process of generating that FDT is
called serialization. When the FDT is generated, some properties
of the system may become immutable because they are already written down
in the FDT. That state is called the KHO finalization phase.

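The effect of finalization can be pictured as a one-way switch: state may be
recorded before the FDT is generated and is refused afterwards. The sketch
below is a simplified user space model under that assumption. The
``struct handover_state`` type, the function names and the ``-EBUSY`` return
value are all hypothetical choices for illustration, not the kernel's
interface.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the handover state. */
struct handover_state {
	bool finalized;			/* set once the FDT has been generated */
	unsigned int nr_preserved;	/* items recorded so far */
};

/* Record one more preserved item; refused once the FDT is written. */
static int handover_preserve(struct handover_state *s)
{
	if (s->finalized)
		return -EBUSY;
	s->nr_preserved++;
	return 0;
}

/* Serialization: after this point the recorded state is immutable. */
static void handover_finalize(struct handover_state *s)
{
	s->finalized = true;
}
```

A caller that preserves one item, finalizes, and then tries to preserve
another sees the second attempt fail, and the recorded count stays at one.
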
Public API
==========
.. kernel-doc:: kernel/kexec_handover.c
  :export: