Make it match its definition (size_t vs unsigned long). And declare
it in a shared header to fix the -Wmissing-prototypes warning, as it
is defined in the user code and called in the kernel code.
Fixes: 5b301409e8 ("UML: add support for KASAN under x86_64")
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The __switch_mm function is defined in the user code, and is called
by the kernel code. It should be declared in a shared header.
Fixes: 4dc706c2f2 ("um: take um_mmu.h to asm/mmu.h, clean asm/mmu_context.h a bit")
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The host PID tracked in 'cpu_tasks' is no longer used. Stopping
tracking it will also save some cycles.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This wrapps all external vmalloc allocation functions with the
alloc_hooks() wrapper, and switches internal allocations to _noprof
variants where appropriate, for the new memory allocation profiling
feature.
[surenb@google.com: arch/um: fix forward declaration for vmalloc]
Link: https://lkml.kernel.org/r/20240326073750.726636-1-surenb@google.com
[surenb@google.com: undo _noprof additions in the documentation]
Link: https://lkml.kernel.org/r/20240326231453.1206227-5-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-31-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This will address below -Wmissing-prototypes warnings:
arch/um/kernel/initrd.c:18:12: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes]
arch/um/kernel/um_arch.c:408:19: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes]
arch/um/os-Linux/start_up.c:301:12: warning: no previous prototype for ‘parse_iomem’ [-Wmissing-prototypes]
arch/x86/um/ptrace_32.c:15:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes]
arch/x86/um/ptrace_32.c:101:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes]
arch/x86/um/ptrace_32.c:153:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes]
arch/x86/um/ptrace_64.c:111:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes]
arch/x86/um/ptrace_64.c:171:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes]
arch/x86/um/syscalls_64.c:48:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes]
arch/x86/um/tls_32.c:184:5: warning: no previous prototype for ‘arch_switch_tls’ [-Wmissing-prototypes]
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The definition of vfree has changed since commit b3bdda02aa
("vmalloc: add const to void* parameters"). Update the declaration
of vfree in um_malloc.h to match the latest definition.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The ARCH=um build has its own idea about strscpy()'s definition. Adjust
the callers to remove the redundant sizeof() arguments ahead of treewide
changes, since it needs a manual adjustment for the newly named
sized_strscpy() export.
Cc: Richard Weinberger <richard@nod.at>
Cc: linux-um@lists.infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Using sizeof(dst) for the "size" argument in strscpy() is the
overwhelmingly common case. Instead of requiring this everywhere, allow a
2-argument version to be used that will use the sizeof() internally. There
are other functions in the kernel with optional arguments[1], so this
isn't unprecedented, and improves readability. Update and relocate the
kern-doc for strscpy() too, and drop __HAVE_ARCH_STRSCPY as it is unused.
Adjust ARCH=um build to notice the changed export name, as it doesn't
do full header includes for the string helpers.
This could additionally let us save a few hundred lines of code:
1177 files changed, 2455 insertions(+), 3026 deletions(-)
with a treewide cleanup using Coccinelle:
@needless_arg@
expression DST, SRC;
@@
strscpy(DST, SRC
-, sizeof(DST)
)
Link: https://elixir.bootlin.com/linux/v6.7/source/include/linux/pci.h#L1517 [1]
Reviewed-by: Justin Stitt <justinstitt@google.com>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
These functions were only used when calling PTRACE_ARCH_PRCTL, but this
code has been removed.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
These registers are saved/restored together with the other general
registers using ptrace. In arch_set_tls we then just need to set the
register and it will be synced back normally.
Most of this logic was introduced in commit f355559cf7 ("[PATCH] uml:
x86_64 thread fixes"). However, at least today we can rely on ptrace to
restore the base registers for us. As such, only the part of the patch
that tracks the FS register for use as thread local storage is actually
needed.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
These features have existed since Linux 2.6.14 and can be considered
widely available at this point. Also drop the backward compatibility
code for PTRACE_SETOPTIONS.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
----
v2:
* Continue to define PTRACE_SYSEMU_SINGLESTEP as glibc only added it in
version 2.27.
Signed-off-by: Richard Weinberger <richard@nod.at>
__cant_sleep was already used and exported by the scheduler.
The name had to be changed to a UML specific one.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Reviewed-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: Richard Weinberger <richard@nod.at>
Fixes the following build errors observed from W=1 builds:
arch/um/drivers/xterm_kern.c:35:5: warning: no previous prototype for
function 'xterm_fd' [-Wmissing-prototypes]
35 | int xterm_fd(int socket, int *pid_out)
| ^
arch/um/drivers/xterm_kern.c:35:1: note: declare 'static' if the
function is not intended to be used outside of this translation unit
35 | int xterm_fd(int socket, int *pid_out)
| ^
| static
arch/um/drivers/chan_kern.c:183:6: warning: no previous prototype for
function 'free_irqs' [-Wmissing-prototypes]
183 | void free_irqs(void)
| ^
arch/um/drivers/chan_kern.c:183:1: note: declare 'static' if the
function is not intended to be used outside of this translation unit
183 | void free_irqs(void)
| ^
| static
arch/um/drivers/slirp_kern.c:18:6: warning: no previous prototype for
function 'slirp_init' [-Wmissing-prototypes]
18 | void slirp_init(struct net_device *dev, void *data)
| ^
arch/um/drivers/slirp_kern.c:18:1: note: declare 'static' if the
function is not intended to be used outside of this translation unit
18 | void slirp_init(struct net_device *dev, void *data)
| ^
| static
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308081050.sZEw4cQ5-lkp@intel.com/
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().
No return values were used, so direct replacement is safe.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89
Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
[rw: Massaged subject]
Signed-off-by: Richard Weinberger <richard@nod.at>
There's a lot of code here that hard-codes that the
data is a single page, and right now that seems to
be sufficient, but to make it easier to change this
in the future, add a new STUB_DATA_PAGES constant
and use it throughout the code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
- KASAN support for x86_64
- noreboot command line option, just like qemu's -no-reboot
- Various fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdgfidid8lnn52cLTZvlZhesYu8EFAmLteZYACgkQZvlZhesY
u8F8bRAA2806QUzysg3Nj1AKPiTOj47TuluGu4SXytB0usQgYK/n3Fxr36ULJAOJ
3qZWf2fsAkBLvgX9Sw2QFGfulrpfKnLeTdBXSEbWYWhZ0ZoaEJztKmtfH02kRDOW
POedQT5FXMDVjGQdLC7Ycp+WyjaUwrccZ+KRkGWmlr7vNFlxcTlEqBb13mgLdjkY
ep8X+SgmAcdvWBd/os+nNn9Al6TbFd4XQCok82DtNrv0ggwXnVPov/ArvZvvn2Oj
F028X77180rbrGV+ZnDkV1KSv/ccT5EFebJkfEEcYVjre8o0QoPQmh2tFqXN0d83
2WpIOb1+mQL0VClpC4hKbScpIB5tw8vIHsUT+ifloIgY/puhezx6aWm0TKSA+aTM
WitJl1Nf4uNu1rqkBkn9o3VK8CYokTALQIRexHCzvZ70CSxmFbR7EVRSTf7Rr690
Oq7StHagfuTJpddh0wQwaMorIH4s0/bpPoA6m4OhwlppnCpY0Hfl3+AKluNRUtH6
lPeQwfxhd/LKqYW0COElEnReDLzer82kUx/keVyxVINqxpm6YTHVtOgtMCEuVNXg
GbS8PFCW2mIP8Is6HJavZYCzG8vnz3wZ9GENujanwLemiIJfINDauybu+nNsE5pO
7v12vWeZ0x2HGM/cFxODrpp4xAkdq8BBLap8/aXB8uJFagmYyhs=
=f3Bh
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- KASAN support for x86_64
- noreboot command line option, just like qemu's -no-reboot
- Various fixes and cleanups
* tag 'for-linus-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: include sys/types.h for size_t
um: Replace to_phys() and to_virt() with less generic function names
um: Add missing apply_returns()
um: add "noreboot" command line option for PANIC_TIMEOUT=-1 setups
um: include linux/stddef.h for __always_inline
UML: add support for KASAN under x86_64
mm: Add PAGE_ALIGN_DOWN macro
um: random: Don't initialise hwrng struct with zero
um: remove unused mm_copy_segments
um: remove unused variable
um: Remove straying parenthesis
um: x86: print RIP with symbol
arch: um: Fix build for statically linked UML w/ constructors
x86/um: Kconfig: Fix indentation
um/drivers: Kconfig: Fix indentation
um: Kconfig: Fix indentation
UML generally does not provide access to special CPU instructions like
RDRAND, and execution tends to be rather deterministic, with no real
hardware interrupts, making good randomness really very hard, if not
all together impossible. Not only is this a security eyebrow raiser, but
it's also quite annoying when trying to do various pieces of UML-based
automation that takes a long time to boot, if ever.
Fix this by trivially calling getrandom() in the host and using that
seed as "bootloader randomness", which initializes the rng immediately
at UML boot.
The old behavior can be restored the same way as on any other arch, by
way of CONFIG_TRUST_BOOTLOADER_RANDOMNESS=n or
random.trust_bootloader=0. So seen from that perspective, this just
makes UML act like other archs, which is positive in its own right.
Additionally, wire up arch_get_random_{int,long}() in the same way, so
that reseeds can also make use of the host RNG, controllable by
CONFIG_TRUST_CPU_RANDOMNESS and random.trust_cpu, per usual.
Cc: stable@vger.kernel.org
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Usually size_t comes from sys/types.h, not stddef.h. This code likely
worked only because something else in its usage chain was pulling in
sys/types.h. stddef.h is still required for NULL, however, so note this.
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
to_virt() and to_phys() are very generic and may be defined by drivers.
As it turns out, commit 9409c9b670 ("pmem: refactor pmem_clear_poison()")
did exactly that. This results in build errors such as the following
when trying to build um:allmodconfig.
drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
./arch/um/include/asm/page.h:105:20: error:
too few arguments to function ‘to_phys’
105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
| ^~~~~~~
Use less generic function names for the um specific to_phys() and to_virt()
functions to fix the problem and to avoid similar problems in the future.
Fixes: 9409c9b670 ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The UML function names to_virt() and to_phys() are exposed by UML
headers, and are very generic and may be defined by drivers. As it
turns out, commit 9409c9b670 ("pmem: refactor pmem_clear_poison()")
did exactly that.
This results in build errors such as the following when trying to build
um:allmodconfig:
drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’
105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
| ^~~~~~~
Use less generic function names for the um specific to_phys() and
to_virt() functions to fix the problem and to avoid similar problems in
the future.
Fixes: 9409c9b670 ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Devicetree support (for testing)
- Various cleanups and fixes: UBD, port_user, uml_mconsole
- Maintainer update
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmJFwUMWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wQqBD/9gLyeiVp2eu1YFVir64IASgVjK
lNdlAfUwfebtEsw65JcfY8K64910ahw6TvkjTT2A+QGeJIYaVwmw69bLXJUvQq31
C7ZDsMHptuNiZrHDL9SoA0DfwqRdJx3tgGzDnSkhX+2T7Zs5n1nLRMBmn/NJV9Qy
CmxG9fLH1VsU0p6RI76WST3GPLOqWa3jCeHK1vMGZNXI+eo5prHc59lkOcT7lEy7
M4vJRaAV6pCDDYMQdDOYr1PDEeG7/h49EqdKylkOhonDyYB649rL6Lc9nRBvSts3
NXX/qYy1Sj1AlOSR5IOon6QCyk1hap9kr85QoCtz3VMabD/yLlBovZzLOLaF+0S6
dQWgKg806g8QYQGxN03Ph0Pb5cA6hAjr8nVmAuICJDWgmY6Oo74pEvhI8toofFzk
NJzwa6G99xNhfggeTcGdG0ddQDT8N3enKspDPkzpN127GzU5cgvI1Z8wnZXB7JDM
zLMCxzwehocCSrFlh9aQDFK1XJfEWuP66xEPl5cX46//IMKqsrXEOjNlCTRUmA5F
OhU4qqb01OW3K4HPaAkBcGPZ0HhFn6JREUFyNW07dg6s73IWzf0CaNKeYJS7abln
tdvfPg3OPNXCjHd3aCW22EzuB9R/K8BNMkva3QQZxtUa+tOjBdBd9JBJ+vHGA1MN
7/k60wl1dt8/N9yHFg==
=YsK8
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- Devicetree support (for testing)
- Various cleanups and fixes: UBD, port_user, uml_mconsole
- Maintainer update
* tag 'for-linus-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: run_helper: Write error message to kernel log on exec failure on host
um: port_user: Improve error handling when port-helper is not found
um: port_user: Allow setting path to port-helper using UML_PORT_HELPER envvar
um: port_user: Search for in.telnetd in PATH
um: clang: Strip out -mno-global-merge from USER_CFLAGS
docs: UML: Mention telnetd for port channel
um: Remove unused timeval_to_ns() function
um: Fix uml_mconsole stop/go
um: Cleanup syscall_handler_t definition/cast, fix warning
uml: net: vector: fix const issue
um: Fix WRITE_ZEROES in the UBD Driver
um: Migrate vector drivers to NAPI
um: Fix order of dtb unflatten/early init
um: fix and optimize xor select template for CONFIG64 and timetravel mode
um: Document dtb command line option
lib/logic_iomem: correct fallback config references
um: Remove duplicated include in syscalls_64.c
MAINTAINERS: Update UserModeLinux entry
Call to fallocate with FALLOC_FL_PUNCH_HOLE on a device backed by a sparse
file can end up by missing data, zeroes data range, if the underlying file
is used with a tool like bmaptool which will referenced only used spaces.
Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].
This code was transformed with the help of Coccinelle:
(next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)
@@
identifier S, member, array;
type T1, T2;
@@
struct S {
...
T1 member;
T2 array[
- 0
];
};
UAPI and wireless changes were intentionally excluded from this patch
and will be sent out separately.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/78
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
get_vm(), add_iomem(), phys_offset() dead since 2004;
init_mem_user() and setup_memory() - since before the initial merge.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Only one extern in there is needed in processor-generic.h, and it's
not needed anywhere else. So move it over there and get rid of
the include in processor-generic.h, adding includes of registers.h
to the few files that need the declarations in it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
The function names init_registers() and restore_registers() are used
in several net/ethernet/ and gpu/drm/ drivers for other purposes (not
calls to UML functions), so rename them.
This fixes multiple build errors.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: linux-um@lists.infradead.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Rename set_signals() as there's at least one driver that
uses the same name and can now be built on UM due to PCI
support, and thus we can get symbol conflicts.
Also rename set_signals_trace() to be consistent.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmExXHoVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGAZwP/iHdEZzuQ4cz2uXUaV0fevj9jjPU
zJ8wrrNabAiT6f5x861DsARQSR4OSt3zN0tyBNgZwUdotbe7ED5GegrgIUBMWlML
QskhTEIZj7TexAX/20vx671gtzI3JzFg4c9BuriXCFRBvychSevdJPr65gMDOesL
vOJnXe+SGXG2+fPWi/PxrcOItNRcveqo2GiWHT3g0Cv/DJUulu81gEkz3hrufnMR
cjMeSkV0nJJcvI755OQBOUnEuigW64k4m2WxHPG24tU8cQOCqV6lqwOfNQBAn4+F
OoaCMyPQT9gvGYwGExQMCXGg0wbUt1qnxzOVoA2qFCwbo+MFhqjBvPXab6VJm7CE
mY3RrTtvxSqBdHI6EGcYeLjhycK9b+LLoJ1qc3S9FK8It6NoFFp4XV0R6ItPBls7
mWi9VSpyI6k0AwLq+bGXEHvaX/bnnf/vfqn8H+w6mRZdXjFV8EB2DiOSRX/OqjVG
RnvTtXzWWThLyXvWR3Jox4+7X6728oL7akLemoeZI6oTbJDm7dQgwpz5HbSyHXLh
d+gUF3Y/6lqxT5N9GSVDxpD1bEMh2I7nGQ4M7WGbGas/3yUemF8wbBqGQo4a+YeD
d9vGAUxDp2PQTtL2sjFo5Gd4PZEM9g7vwWzRvHe0o5NxKEXcBg25b8cD1hxrN9Y4
Y1AAnc0kLO+My3PC
=lw3M
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kbuild: redo fake deps at include/ksym/*.h
kbuild: clean up objtool_args slightly
modpost: get the *.mod file path more simply
checkkconfigsymbols.py: Fix the '--ignore' option
kbuild: merge vmlinux_link() between ARCH=um and other architectures
kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
kbuild: remove stale *.symversions
kbuild: remove unused quiet_cmd_update_lto_symversions
gen_compile_commands: extract compiler command from a series of commands
x86: remove cc-option-yn test for -mtune=
arc: replace cc-option-yn uses with cc-option
s390: replace cc-option-yn uses with cc-option
ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
sparc: move the install rule to arch/sparc/Makefile
security: remove unneeded subdir-$(CONFIG_...)
kbuild: sh: remove unused install script
kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
kbuild: Switch to 'f' variants of integrated assembler flag
kbuild: Shuffle blank line to improve comment meaning
...
Delete/fixup few includes in anticipation of global -isystem compile
option removal.
Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition
of uintptr_t error (one definition comes from <stddef.h>, another from
<linux/types.h>).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
We have a number of systems industry-wide that have a subset of their
functionality that works as follows:
1. Receive a message from local kmsg, serial console, or netconsole;
2. Apply a set of rules to classify the message;
3. Do something based on this classification (like scheduling a
remediation for the machine), rinse, and repeat.
As a couple of examples of places we have this implemented just inside
Facebook, although this isn't a Facebook-specific problem, we have this
inside our netconsole processing (for alarm classification), and as part
of our machine health checking. We use these messages to determine
fairly important metrics around production health, and it's important
that we get them right.
While for some kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably indicate the issue, in order
to react to production issues quickly we need to work with the interface
which most kernel developers naturally use when developing: printk.
Most production issues come from unexpected phenomena, and as such
usually the code in question doesn't have easily usable tracepoints or
other counters available for the specific problem being mitigated. We
have a number of lines of monitoring defence against problems in
production (host metrics, process metrics, service metrics, etc), and
where it's not feasible to reliably monitor at another level, this kind
of pragmatic netconsole monitoring is essential.
As one would expect, monitoring using printk is rather brittle for a
number of reasons -- most notably that the message might disappear
entirely in a new version of the kernel, or that the message may change
in some way that the regex or other classification methods start to
silently fail.
One factor that makes this even harder is that, under normal operation,
many of these messages are never expected to be hit. For example, there
may be a rare hardware bug which one wants to detect if it was to ever
happen again, but its recurrence is not likely or anticipated. This
precludes using something like checking whether the printk in question
was printed somewhere fleetwide recently to determine whether the
message in question is still present or not, since we don't anticipate
that it should be printed anywhere, but still need to monitor for its
future presence in the long-term.
This class of issue has happened on a number of occasions, causing
unhealthy machines with hardware issues to remain in production for
longer than ideal. As a recent example, some monitoring around
blk_update_request fell out of date and caused semi-broken machines to
remain in production for longer than would be desirable.
Searching through the codebase to find the message is also extremely
fragile, because many of the messages are further constructed beyond
their callsite (eg. btrfs_printk and other module-specific wrappers,
each with their own functionality). Even if they aren't, guessing the
format and formulation of the underlying message based on the aesthetics
of the message emitted is not a recipe for success at scale, and our
previous issues with fleetwide machine health checking demonstrate as
much.
This provides a solution to the issue of silently changed or deleted
printks: we record pointers to all printk format strings known at
compile time into a new .printk_index section, both in vmlinux and
modules. At runtime, this can then be iterated by looking at
<debugfs>/printk/index/<module>, which emits the following format, both
readable by humans and able to be parsed by machines:
$ head -1 vmlinux; shuf -n 5 vmlinux
# <level[,flags]> filename:line function "format"
<5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n"
<4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n"
<6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n"
<6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n"
<6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n"
This mitigates the majority of cases where we have a highly-specific
printk which we want to match on, as we can now enumerate and check
whether the format changed or the printk callsite disappeared entirely
in userspace. This allows us to catch changes to printks we monitor
earlier and decide what to do about it before it becomes problematic.
There is no additional runtime cost for printk callers or printk itself,
and the assembly generated is exactly the same.
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h}
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
Function 'os_flush_stdout' is declared twice, so remove the
repeated declaration.
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
1. Reflect host cpu flags into the UML instance so they can
be used to select the correct implementations for xor, crypto, etc.
2. Reflect host cache alignment into UML instance. This is
important when running 32 bit on a 64 bit host as 32 bit by
default aligns to 32 while the actual alignment should be 64.
Ditto for some Xeons which align at 128.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We should be able to ndelay() from any context, even from an
interrupt context! However, this is broken (not functionally,
but locking-wise) in time-travel because we'll get into the
time-travel code and enable interrupts to handle messages on
other time-travel aware subsystems (only virtio for now).
Luckily, I've already reworked the time-travel aware signal
(interrupt) delivery for suspend/resume to have a time travel
handler, which runs directly in the context of the signal and
not from the Linux interrupt.
In order to fix this time-travel issue then, we need to do a
few things:
1) rework the signal handling code to call time-travel handlers
(only) if interrupts are disabled but signals aren't blocked,
instead of marking it only pending there. This is needed to
not deadlock other communication.
2) rework time-travel to not enable interrupts while it's
waiting for a message;
3) rework time-travel to not (just) disable interrupts but
rather block signals at a lower level while it needs them
disabled for communicating with the controller.
Finally, since now we can actually spend even virtual time
in interrupts-disabled sections, the delay warning when we
deliver a time-travel delayed interrupt is no longer valid,
things can (and should) now get delayed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This will be necessary in the userspace side to fix the
signal/interrupt handling in time-travel=ext mode, which
is the next patch.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Use signals_enabled instead of always jumping through
a function call to read it, there's not much point in
that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This function doesn't exist, remove its declaration.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This mostly reverts the old commit 3963333fe6 ("uml: cover stubs
with a VMA") which had added a VMA to the existing PTEs. However,
there's no real reason to have the PTEs in the first place and the
VMA cannot be 'fixed' in place, which leads to bugs that userspace
could try to unmap them and be forcefully killed, or such. Also,
there's a bit of an ugly hole in userspace's address space.
Simplify all this: just install the stub code/page at the top of
the (inner) address space, i.e. put it just above TASK_SIZE. The
pages are simply hard-coded to be mapped in the userspace process
we use to implement an mm context, and they're out of reach of the
inner mmap/munmap/mprotect etc. since they're above TASK_SIZE.
Getting rid of the VMA also makes vma_merge() no longer hit one of
the VM_WARN_ON()s there because we installed a VMA while the code
assumes the stack VMA is the first one.
It also removes a lockdep warning about mmap_sem usage since we no
longer have uml_setup_stubs() and thus no longer need to do any
manipulation that would require mmap_sem in activate_mm().
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The userspace stacks mostly have a stack (and in the case of the
syscall stub we can just set their stack pointer) that points to
the location of the stub data page already.
Rework the stubs to use the stack pointer to derive the start of
the data page, rather than requiring it to be hard-coded.
In the clone stub, also integrate the int3 into the stack remap,
since we really must not use the stack while we remap it.
This prepares for putting the stub at a variable location that's
not part of the normal address space of the userspace processes
running inside the UML machine.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If the two are mixed up, then it looks as though the parent
returned an error if the child failed (before) the mmap(),
and then the resulting process never gets killed. Fix this
by splitting the child and parent errors, reporting and
using them appropriately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In some cases we can get to fix_range_common() with mmap_sem held,
and in others we get there without it being held. For example, we
get there with it held from sys_mprotect(), and without it held
from fork_handler().
Avoid any issues in this and simply defer killing the task until
it runs the next time. Do it on the mm so that another task that
shares the same mm can't continue running afterwards.
Cc: stable@vger.kernel.org
Fixes: 468f65976a ("um: Fix hung task in fix_range_common()")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In external time-travel mode, where time is controlled via the
controller application socket, interrupt handling is a little
tricky. For example on virtio, the following happens:
* we receive a message (that requires an ACK) on the vhost-user socket
* we add a time-travel event to handle the interrupt
(this causes communication on the time socket)
* we ACK the original vhost-user message
* we then handle the interrupt once the event is triggered
This protocol ensures that the sender of the interrupt only continues
to run in the simulation when the time-travel event has been added.
So far, this was only done in the virtio driver, but it was actually
wrong, because only virtqueue interrupts were handled this way, and
config change interrupts were handled immediately. Additionally, the
messages were actually handled in the real Linux interrupt handler,
but Linux interrupt handlers are part of the simulation and shouldn't
run while there's no time event.
To really do this properly and only handle all kinds of interrupts in
the time-travel event when we are scheduled to run in the simulation,
rework this to plug in to the lower interrupt layers in UML directly:
Add a um_request_irq_tt() function that let's a time-travel aware
driver request an interrupt with an additional timetravel_handler()
that is called outside of the context of the simulation, to handle
the message only. It then adds an event to the time-travel calendar
if necessary, and no "real" Linux code runs outside of the time
simulation.
This also hooks in with suspend/resume properly now, since this new
timetravel_handler() can run while Linux is suspended and interrupts
are disabled, and decide to wake up (or not) the system based on the
message it received. Importantly in this case, it ACKs the message
before the system even resumes and interrupts are re-enabled, thus
allowing the simulation to progress properly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This reverts commit ef4459a6da ("um: allocate a guard page to
helper threads"), it's broken in multiple ways:
1) the free no longer matches the alloc; and
2) more importantly, the set_memory_ro() causes allocation of
page tables for the normal memory that doesn't have any,
and that later causes corruption and crashes (usually but
not always in vfree()).
We could fix the first bug and use vmalloc() to work around the
second, but set_memory_ro() actually doesn't do anything either
so I'll just revert that as well.
Reported-by: Benjamin Berg <benjamin@sipsolutions.net>
Fixes: ef4459a6da ("um: allocate a guard page to helper threads")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Lockdep (on 5.10-rc) points out that we're delivering IRQs while IRQs
are not even enabled, which clearly shouldn't happen. Defer the time
event IRQ delivery until they actually are enabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If the sigio workaround needed to be applied to a file descriptor,
set_irq_wake() wouldn't work for it since it would get polled by
the thread instead of causing SIGIO, and thus could never really
cause a wakeup, since the thread notification FD wasn't marked as
being able to wake up the system.
Fix this by marking the thread's notification FD explicitly as a
wake source FD, i.e. not suppressing SIGIO for it in suspend. In
order to not cause spurious wakeups, we then need to remove all
FDs that shouldn't wake up the system from the polling thread. In
order to do this, add unlocked versions of ignore_sigio_fd() and
add_sigio_fd() (nothing else is happening in suspend, so this is
fine), and also modify ignore_sigio_fd() to return -ENOENT if the
FD wasn't originally in there. This doesn't matter because nothing
else currently checks the return value, but the irq code needs to
know which ones to restore the workaround for.
All told, this lets us use a timerfd for the RTC clock in the next
patch, which doesn't send SIGIO.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We've been running into stack overflows in helper threads
corrupting memory (e.g. because somebody put printf() or
os_info() there), so to avoid those causing hard-to-debug
issues later on, allocate a guard page for helper thread
stacks and mark it read-only.
Unfortunately, the crash dump at that point is useless as
the stack tracer will try to backtrace the *kernel* thread,
not the helper thread, but at least we don't survive to a
random issue caused by corruption.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
With all the previous bits in place, we can now also support
suspend to RAM, in the sense that everything is suspended,
not just most, including userspace, processes like in s2idle.
Since um_idle_sleep() now waits forever, we can simply call
that to "suspend" the system.
As before, you can wake it up using SIGUSR1 since we're just
in a pause() call that only needs to return.
In order to implement selective resume from certain devices,
and not have any arbitrary device interrupt wake up, suspend
interrupts by removing SIGIO notification (O_ASYNC) from all
the FDs that are not supposed to wake up the system. However,
swap out the handler so we don't actually handle the SIGIO as
an interrupt.
Since we're in pause(), the mere act of receiving SIGIO wakes
us up, and then after things have been restored enough, re-set
O_ASYNC for all previously suspended FDs, reinstall the proper
SIGIO handler, and send SIGIO to self to process anything that
might now be pending.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In order to be able to experiment with suspend in UML, add the
minimal work to be able to suspend (s2idle) an instance of UML,
and be able to wake it back up from that state with the USR1
signal sent to the main UML process.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
There really is no reason to pass the amount of time we should
sleep, especially since it's just hard-coded to one second.
Additionally, one second isn't really all that long, and as we
are expecting to be woken up by a signal, we can sleep longer
and avoid doing some work every second, so replace the current
clock_nanosleep() with just an empty select() that can _only_
be woken up by a signal.
We can also remove the deliver_alarm() since we don't need to
do that when we got e.g. SIGIO that woke us up, and if we got
SIGALRM the signal handler will actually (have) run, so it's
just unnecessary extra work.
Similarly, in time-travel mode, just program the wakeup event
from idle to be S64_MAX, which is basically the most you could
ever simulate to. Of course, you should already have an event
in the list that's earlier and will cause a wakeup, normally
that's the regular timer interrupt, though in suspend it may
(later) also be an RTC event. Since actually getting to this
point would be a bug and you can't ever get out again, panic()
on it in the time control code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We don't actually use this in um_request_irq(), so it can
never be assigned. It's also not clear what that would be
useful for, so just remove it.
This results in quite a number of cleanups, all the way to
removing the "SIGIO on close" startup check, since the data
it assigns (pty_close_sigio) is not used anymore.
While at it, also make this an enum so we get a minimum of
type checking, and remove the IRQ_NONE hack in virtio since
we now no longer have the name twice.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We don't need an array of 4 entries to capture three and the
name 'MAX_IRQ_TYPE' really gets confusing as well. Remove it
and add a correct NUM_IRQ_TYPES, and use that correctly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This really shouldn't be called "irq_fd" since it doesn't
carry an fd. Well, it used to, apparently, but that struct
member is unused.
Rename it to "irq_reg" since it more accurately reflects a
registered interrupt, and remove the unused 'next' and 'fd'
members from the struct as well.
While at it, also move it to the implementation, it's not
used anywhere else, and the header file is shared with the
userspace components.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
It's cumbersome and error-prone to keep adding fixed IRQ numbers,
and for proper device wakeup support for the virtio/vhost-user
support we need to have different IRQs for each device. Even if
in theory two IRQs (with and without wake) might be sufficient,
it's much easier to reason about it when we have dynamic number
assignment. It also makes it easier to add new devices that may
dynamically exist or depending on the configuration, etc.
Add support for this, up to 64 IRQs (the same limit as epoll FDs
we have right now). Since it's not easy to port all the existing
places to dynamic allocation (some data is statically initialized)
keep the low numbers are reserved for the existing hard-coded IRQ
numbers.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.
Remove the quote operator # from compiler_attributes.h __section macro.
Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.
Conversion done using the script at:
https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements synchronized time-travel mode which - using a special
application on a unix socket - lets multiple machines take part in a
time-travelling simulation together.
The protocol for the unix domain socket is defined in the new file
include/uapi/linux/um_timetravel.h.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This file isn't really shared, it's only used on the kernel side,
not on the user side. Remove the include from the user-side and
move the file to a better place.
While at it, rename it to time-internal.h, it's not really just
timers but all kinds of things related to timekeeping.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The ubd code suffers from a possible y2038 overflow on 32-bit
architectures, both for the cow header and the os_file_modtime()
function.
Replace time_t with time64_t to extend the ubd_kern side as much
as possible.
Whether this makes a difference for the user side depends on
the host libc implementation that may use either 32-bit or 64-bit
time_t.
For the cow file format, the header contains an unsigned 32-bit
timestamp, which is good until y2106, passing this through a
'long long' gives us a consistent interpretation between 32-bit
and 64-bit um kernels.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Convert files to use SPDX header. All files are licensed under the GPLv2.
Signed-off-by: Alex Dewar <alex.dewar@gmx.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
This module allows virtio devices to be used over a vhost-user socket.
Signed-off-by: Erel Geron <erelx.geron@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Periodic timers are broken, because the also only fire once.
As it happens, Linux doesn't care because it only sets the
timer to periodic very briefly during boot, and then switches
it only between one-shot and off later.
Nevertheless, fix the logic (we shouldn't even be looking at
time_travel_timer_expiry unless the timer is enabled) and
change the code to fire the timer periodically in periodic
mode, in case it ever gets used in the future.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
UML enables TRACE_IRQFLAGS_SUPPORT but doesn't actually implement
it. It seems to have been added for lockdep support, but that can't
actually really work well without IRQ flags tracing, as is also
very noisily reported when enabling CONFIG_DEBUG_LOCKDEP.
Implement it now.
Fixes: 711553efa5 ("[PATCH] uml: declare in Kconfig our partial LOCKDEP support")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Unfortunately, my build fix for when time travel mode isn't
enabled broke time travel mode, because I forgot that we need
to use the timer time after the timer has been marked disabled,
and thus need to leave the time stored instead of zeroing it.
Fix that by splitting the inline into two, so we can call only
the _mode() one in the relevant code path.
Fixes: b482e48d29 ("um: fix build without CONFIG_UML_TIME_TRAVEL_SUPPORT")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
When CONFIG_UML_TIME_TRAVEL_SUPPORT isn't set, the build was broken.
Fix this.
Fixes: 065038706f ("um: Support time travel mode")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Sometimes it can be useful to run with "time travel" inside the
UML instance, for example for testing. For example, some tests
for the wireless subsystem and userspace are based on hwsim, a
virtual wireless adapter. Some tests can take a long time to
run because they e.g. wait for 120 seconds to elapse for some
regulatory checks. This obviously goes faster if it need not
actually wait that long, but time inside the test environment
just "bumps up" when there's nothing to do.
Add CONFIG_UML_TIME_TRAVEL_SUPPORT to enable code to support
such modes at runtime, selected on the command line:
* just "time-travel", in which time inside the UML instance
can move faster than real time, if there's nothing to do
* "time-travel=inf-cpu" in which time also moves slower and
any CPU processing takes no time at all, which allows to
implement consistent behaviour regardless of host CPU load
(or speed) or debug overhead.
An additional "time-travel-start=<seconds>" parameter is also
supported in this case to start the wall clock at this time
(in unix epoch).
With this enabled, the test mentioned above goes from a runtime
of about 140 seconds (with startup overhead and all) to being
CPU bound and finishing in 15 seconds (on my slow laptop).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This makes the code clearer and lets the time travel patch have
the actual time used for these functions in just one place.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
There are some unused functions, and some others that have
unused arguments; clean up the timer code a bit.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
os_timer_one_shot() gets passed a value "unsigned long delta",
so must not have an "int ticks" as that actually ends up being
-1, and thus triggering a timer over and over again.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
reenable_fd has been a NOP since the introduction of the EPOLL
based interrupt controller.
reenable_channel() is no longer needed as the flow control is
now handled via the write IRQs on the channel.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Support for DISCARD and WRITE_ZEROES in the ubd driver using
fallocate.
DISCARD is enabled by default and can be disabled using a new
UBD command line flag.
If the underlying fs on which the UBD image is stored does not
support DISCARD the support for both DISCARD and WRITE_ZEROES
is turned off.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
__uml_initcall() is not used and .uml.initcall.init section is empty:
$ grep -r '__uml_initcall('
arch/um/include/shared/init.h:#define __uml_initcall(fn) \
$ readelf -s ../umobj/linux | grep __uml_initcall
23214: 00000000603b75d8 0 NOTYPE GLOBAL DEFAULT 32 __uml_initcall_start
25337: 00000000603b75d8 0 NOTYPE GLOBAL DEFAULT 32 __uml_initcall_end
So it is unnecessary.
Signed-off-by: Alexander Pateenok <pateenoc@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
1. Provides infrastructure for vector IO using recvmmsg/sendmmsg.
1.1. Multi-message read.
1.2. Multi-message write.
1.3. Optimized queue support for multi-packet enqueue/dequeue.
1.4. BQL/DQL support.
2. Implements transports for several transports as well support
for direct wiring of PWEs to NIC. Allows direct connection of VMs
to host, other VMs and network devices with no switch in use.
2.1. Raw socket >4 times higher PPS and 10 times higher tcp RX
than existing pcap based transport (> 4Gbit)
2.2. New tap transport using socket RX and tap xmit. Similar
performance improvements (>4Gbit)
2.3. GRE transport - direct wiring to GRE PWE
2.4. L2TPv3 transport - direct wiring to L2TPv3 PWE
3. Tuning, performance and offload related setting support via ethtool.
4. Initial BPF support - used in tap/raw to avoid software looping
5. Scatter Gather support.
6. VNET and checksum offload support for raw socket transport.
7. TSO/GSO support where applicable or available
8. Migrates all error messages to netdevice_*() and rate limits
them where needed.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
1. Removes the need to walk the IRQ/Device list to determine
who triggered the IRQ.
2. Improves scalability (up to several times performance
improvement for cases with 10s of devices).
3. Improves UML baseline IO performance for one disk + one NIC
use case by up to 10%.
4. Introduces write poll triggered IRQs.
5. Prerequisite for introducing high performance mmesg family
of functions in network IO.
6. Fixes RNG shutdown which was leaking a file descriptor
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
linux/compiler.h is included indirectly by linux/types.h via
uapi/linux/types.h -> uapi/linux/posix_types.h -> linux/stddef.h
-> uapi/linux/stddef.h and is needed to provide a proper definition of
offsetof.
Unfortunately, compiler.h requires a definition of
smp_read_barrier_depends() for defining lockless_dereference() and soon
for defining READ_ONCE(), which means that all
users of READ_ONCE() will need to include asm/barrier.h to avoid splats
such as:
In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from arch/h8300/kernel/asm-offsets.c:11:
include/linux/list.h: In function 'list_empty':
>> include/linux/compiler.h:343:2: error: implicit declaration of function 'smp_read_barrier_depends' [-Werror=implicit-function-declaration]
smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
^
A better alternative is to include asm/barrier.h in linux/compiler.h,
but this requires a type definition for "bool" on some architectures
(e.g. x86), which is defined later by linux/types.h. Type "bool" is also
used directly in linux/compiler.h, so the whole thing is pretty fragile.
This patch splits compiler.h in two: compiler_types.h contains type
annotations, definitions and the compiler-specific parts, whereas
compiler.h #includes compiler-types.h and additionally defines macros
such as {READ,WRITE.ACCESS}_ONCE().
uapi/linux/stddef.h and linux/linkage.h are then moved over to include
linux/compiler_types.h, which fixes the build for h8 and blackfin.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add os_warn() for printing out pre-boot warning/error
messages in stderr. The messages via os_warn() are not
suppressed by quiet option.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Add os_info() for printing out pre-boot information
level messages in stderr. The messages via os_info()
are suppressed by "quiet" kernel command line.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
In order to introduce new arch_prctls that are not 64 bit only, rename the
existing 64 bit implementation to do_arch_prctl_64(). Also rename the
second argument of that function from 'addr' to 'arg2', because it will no
longer always be an address.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-5-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The x86 specific arch_prctl() arbitrarily changed prctl's 'option' to
'code'. Before adding new options, rename it.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-3-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch extends save_fp_registers() and restore_fp_registers() to use
PTRACE_GETREGSET and PTRACE_SETREGSET with the XSTATE note type, adding
support for new processor state extensions between context switches.
When the new ptrace requests are unavailable, it falls back to the old
PTRACE_GETFPREGS and PTRACE_SETFPREGS methods, which have been renamed to
save_i387_registers() and restore_i387_registers().
Now these functions expect *fp_regs to have the space of an _xstate struct.
Thus, this also makes ptrace in UML responde to PTRACE_GETFPREGS/_SETFPREG
requests with a user_i387_struct (thus independent from HOST_FP_SIZE), and
by calling save_i387_registers() and restore_i387_registers() instead of
the extended save_fp_registers() and restore_fp_registers() functions.
Signed-off-by: Eli Cooper <elicooper@gmx.com>
This fix two related bugs:
* PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
* PTRACE_SETREGS can't set the orig_ax value (erased by initial value)
Get rid of the now useless and error-prone get_syscall().
Fix inconsistent behavior in the ptrace implementation for i386 when
updating orig_eax automatically update the syscall number as well. This
is now updated in handle_syscall().
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Anton Ivanov <aivanov@brocade.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>
This decreases the number of syscalls per read/write by half.
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
UML is using an obsolete itimer call for
all timers and "polls" for kernel space timer firing
in its userspace portion resulting in a long list
of bugs and incorrect behaviour(s). It also uses
ITIMER_VIRTUAL for its timer which results in the
timer being dependent on it running and the cpu
load.
This patch fixes this by moving to posix high resolution
timers firing off CLOCK_MONOTONIC and relaying the timer
correctly to the UML userspace.
Fixes:
- crashes when hosts suspends/resumes
- broken userspace timers - effecive ~40Hz instead
of what they should be. Note - this modifies skas behavior
by no longer setting an itimer per clone(). Timer events
are relayed instead.
- kernel network packet scheduling disciplines
- tcp behaviour especially under load
- various timer related corner cases
Finally, overall responsiveness of userspace is better.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
Once x86 exports its do_signal(), the prototypes will clash.
Fix the clash and also improve the code a bit: remove the
unnecessary kern_do_signal() indirection. This allows
interrupt_end() to share the 'regs' parameter calculation.
Also remove the unused return code to match x86.
Minimally build and boot tested.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As we got rid of the __KERNEL__ abuse, we can directly
include linux/compiler.h now.
This also allows gcc 5 to build UML.
Reported-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Currently UML is abusing __KERNEL__ to distinguish between
kernel and host code (os-Linux). It is better to use a custom
define such that existing users of __KERNEL__ don't get confused.
Signed-off-by: Richard Weinberger <richard@nod.at>
atomic_notifier_chain_register() and uml_postsetup() do call kernel code
that rely on the "current" kernel macro and a valid task_struct resp.
thread_info struct. Give those functions a valid stack by moving
uml_postsetup() in the init_thread stack. This moves enables a panic()
call in this early code to generate a valid stacktrace, instead of
crashing.
E.g. when an UML kernel is started with an initrd but too few physical
memory the panic() call get's actually processed.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>