error_num seems to be trying to remember the exit code of the init process,
except that nothing actually keeps track of it anywhere. So, let's add a
field to the handler, so that we can keep track of the process' exit
status, and the propagate it to error_num in struct lxc_container so that
people can use it.
Note that this is a slight behavior change, essentially instead of making
error_num always == the return code from start, now it contains slightly
more useful information (the actual exit status). But, there is only one
internal user of error_num which I'll fix in later in the series, so IMO
this is ok.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Based on the comments in the code (and the have_status flag), the intent
here (and IMO, the desired behavior) should be for init.lxc to propagate
the actual exit code from the real application process up through.
Otherwise, it is swallowed and nobody can access it.
The bug being fixed here is that ret held the correct exit code, but when
it went around the loop again (to wait for other children) ret is
clobbered. Let's save the desired exit status somewhere else, so it can't
get clobbered, and we propagate things correctly.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
The documentation for this function says if the task was killed by a
signal, the return code will be 128+n, where n is the signal number. Let's
make that actually true.
(We'll use this behavior in later patches.)
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
This non-init forwarding check should really be before all the log messages
about "init continued" or "init stopped", since they will otherwise lie
about some process that wasn't init being stopped or continued.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
otherwise, we just get a return value of false from setting config failure,
with no indication as to what actually failed in the log.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
* exit(1) when there is an option parsing error
* exit(0) when the user explicitly asks for help
* exit(1) when the user specifies an invalid option
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
When we deleted cgroups for unprivileged containers we used to allocate a new
mapping and clone a new user namespace each time we delete a cgroup. This of
course meant - on a cgroup v1 system - doing this >= 10 times when all
controllers were used. Let's not to do this and only allocate and establish a
mapping once.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
When fully unprivileged users run a container that only maps their own {g,u}id
and they do not have access to setuid new{g,u}idmap binaries we will write the
idmapping directly. This however requires us to write "deny" to
/proc/[pid]/setgroups otherwise any write to /proc/[pid]/gid_map will be
denied.
On a sidenote, this patch enables fully unprivileged containers. If you now set
lxc.net.[i].type = empty no privilege whatsoever is required to run a container.
Enhances #2033.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Felix Abecassis <fabecassis@nvidia.com>
Cc: Jonathan Calmels <jcalmels@nvidia.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
The existing check doesn't work, because when you statically
link a program against libc, any functions not called are not
included. So cap_init() which we check for is not there in
the built binary.
So instead just check whether a "gcc -lcap -static" works.
If libcap.a is not available it will fail, if it is it will
succeed.
Signed-off-by: Serge Hallyn <shallyn@cisco.com>