Commit Graph

8713 Commits

Author SHA1 Message Date
Alexander Kriventsov
b9f80409d7 getgrgid_r fails with ERANGE if buffer is too small. Retry with a larger buffer.
Signed-off-by: Alexander Kriventsov <akriventsov@nic.ru>
2019-06-03 18:11:56 +03:00
Christian Brauner
3e8a11cb1c
Merge pull request #3018 from tych0/comment-stack-size
lxc_clone: add a comment about stack size
2019-05-29 17:38:23 +02:00
Tycho Andersen
edb808d130 lxc_clone: add a comment about stack size
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2019-05-29 09:36:51 -06:00
Christian Brauner
18a405ee88
Merge pull request #2987 from tych0/pass-zero-to-clone
Pass zero to clone
2019-05-29 17:14:00 +02:00
Tycho Andersen
3df90604ec lxc_clone: bump stack size to 8MB
This is the default thread size for glibc, so it is reasonable to match
that when we clone().

Mostly this is a science experiment suggested by brauner, and who doesn't
love science?

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2019-05-29 08:47:35 -06:00
Christian Brauner
0cfec4f757
Merge pull request #3015 from avkvl/issue-2765
fix issue 2765
2019-05-28 16:45:36 +02:00
Alexander Kriventsov
d871a9f1e5 fix issue 2765
Signed-off-by: Alexander Kriventsov <akriventsov@nic.ru>
2019-05-28 16:21:22 +03:00
Christian Brauner
36f7018103
cgroups: handle offline cpus in v1 hierarchy
Handle offline cpus in v1 hierarchy.

In addition to isolated cpus we also need to account for offline cpus when our
ancestor cgroup is the root cgroup and we have not been initialized yet.

Closes #2953.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-24 15:59:57 +02:00
Stéphane Graber
c54cf53fad
Merge pull request #3011 from brauner/2019-05-21/android_the_bane_of_my_existence
configure: remove additional comma
2019-05-21 10:15:08 -04:00
Christian Brauner
d4df64143e
configure: remove additional comma
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-21 15:58:03 +02:00
Stéphane Graber
ddf4b77e11
Merge pull request #3010 from brauner/2019-05-17/bugfixes
lxccontainer: cleanup attach functions
2019-05-17 09:10:47 +02:00
Christian Brauner
d643014317
lxccontainer: cleanup attach functions
Specifically, reflow function arguments and remove useless comments.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-17 07:50:45 +02:00
Stéphane Graber
07c5b72a11
Merge pull request #3009 from brauner/2019-05-16/rework_attach
attach: do not reload container
2019-05-16 19:33:41 +02:00
Christian Brauner
908fbc1a2e
attach: do not reload container
Let lxc_attach() reuse the already initialized container.

Closes https://github.com/lxc/lxd/issues/5755.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-16 16:11:22 +02:00
Christian Brauner
6ae34d2169
Merge pull request #3006 from tomponline/tp-phys-downhook
network: Fixes bug that stopped down hook from running for phys netdevs
2019-05-16 10:11:42 +02:00
Thomas Parrott
b3259dc669 network: Fixes bug that stopped down hook from running for phys netdevs
Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
2019-05-15 17:09:47 +01:00
Christian Brauner
e2f2d86a41
Merge pull request #3005 from tomponline/tp-phys-ns-restore
network: move phys netdevs back to monitor's net ns rather than pid 1's
2019-05-15 17:40:52 +02:00
Thomas Parrott
0037ab49d6 network: move phys netdevs back to monitor's net ns rather than pid 1's
Updates lxc_restore_phys_nics_to_netns() to move phys netdevs back to the monitor's network namespace rather than the previously hardcoded PID 1 net ns.

This fixes cases where LXC is started inside a net ns different from PID 1's: on container shutdown, physical devices were moved back to a net ns other than the one the container was started from.

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
2019-05-15 15:58:46 +01:00
Stéphane Graber
8f06ff5491
Merge pull request #3004 from brauner/master
configure: handle checks when cross-compiling
2019-05-15 16:19:19 +02:00
Tycho Andersen
5e7b4b3c16 lxc_clone: get rid of some indirection
We have a do_clone(), which just calls a void f(void *) that it gets
passed. We build up a struct consisting of two args that are just the
actual arg and actual function. Let's just have the syscall do this for us.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2019-05-15 07:56:29 -06:00
Tycho Andersen
8de9038436 doc: add a little note about shared ns + LSMs
We should add a little note about the race in the previous patch.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2019-05-15 07:56:01 -06:00
Tycho Andersen
c74e921744 lxc_clone: pass non-stack allocated stack to clone
There are two problems with this code:

1. The math is wrong. We allocate a char *foo[__LXC_STACK_SIZE]; which
   means it's really sizeof(char *) * __LXC_STACK_SIZE, instead of just
   __LXC_STACK_SIZE.

2. We can't actually allocate it on our stack. When we use CLONE_VM (which
   we do in the shared ns case) that means that the new thread is just
   running one page lower on the stack, but anything that allocates a page
   on the stack may clobber data. This is a pretty short race window since
   we just do the shared ns stuff and then do a clone without CLONE_VM.

However, it does point out an interesting possible privilege escalation if
things aren't configured correctly: do_share_ns() sets up namespaces while
it shares the address space of the task that spawned it; once it enters the
pid ns of the thing it's sharing with, the thing it's sharing with can
ptrace it and write stuff into the host's address space. Since the function
that does the clone() is lxc_spawn(), it has a struct cgroup_ops* on the
stack, which itself has function pointers called later in the function, so
it's possible to allocate shellcode in the address space of the host and
run it fairly easily.

ASLR doesn't mitigate this since we know exactly the stack offsets; however
this patch has the kernel allocate a new stack, which will help. Of course,
the attacker could just check /proc/pid/maps to find the location of the
stack, but they'd still have to guess where to write stuff in.

The thing that does prevent this is the default configuration of apparmor.
Since the apparmor profile is set in the second clone, and apparmor
prevents ptracing things under a different profile, attackers confined by
apparmor can't do this. However, if users are using a custom configuration
with shared namespaces, care must be taken to avoid this race.

Shared namespaces aren't widely used now, so perhaps this isn't a problem,
but with the advent of crio-lxc for k8s, this functionality will be used
more.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
2019-05-15 07:56:01 -06:00
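
The first problem listed in the commit message above (the size arithmetic) can be checked with a small sketch. `EXAMPLE_STACK_SIZE` and both function names are placeholders invented for this illustration; they do not come from the LXC sources.

```c
#include <stddef.h>

#define EXAMPLE_STACK_SIZE 4096 /* illustrative value, not LXC's constant */

/* An array of pointers reserves one pointer-sized slot per element. */
size_t pointer_array_bytes(void)
{
	char *stack[EXAMPLE_STACK_SIZE];

	return sizeof(stack); /* sizeof(char *) * EXAMPLE_STACK_SIZE */
}

/* An array of char reserves exactly one byte per element. */
size_t byte_array_bytes(void)
{
	char stack[EXAMPLE_STACK_SIZE];

	return sizeof(stack); /* EXAMPLE_STACK_SIZE */
}
```

On a typical 64-bit target the first array is eight times larger than intended.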
Christian Brauner
4e900c18a7
configure: handle checks when cross-compiling
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-15 15:44:36 +02:00
Christian Brauner
7aea50feb9
Merge pull request #3001 from Rachid-Koucha/patch-11
Use %m instead of strerror() when available
2019-05-13 15:57:29 +02:00
Rachid Koucha
9a719a64e5
Error prone semicolon
Suppressed error prone semicolon in SYSTRACE() macro.

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-13 14:57:02 +02:00
Rachid Koucha
a1d652c25b
Use %m instead of strerror() when available
Use %m under HAVE_M_FORMAT instead of strerror()

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-13 13:21:14 +02:00
Christian Brauner
612e48a364
Merge pull request #2999 from rikardfalkeborn/fix-realloc-memleak-proctitle
initutils: Fix memleak on realloc failure
2019-05-13 13:19:55 +02:00
Christian Brauner
7d4188ce71
Merge pull request #2998 from rikardfalkeborn/fix-returning-non-bool
Fix returning -1 in functions with return type bool
2019-05-13 13:19:22 +02:00
Christian Brauner
fa9aa1fabb
Merge pull request #3000 from Rachid-Koucha/patch-11
Config: check for %m availability
2019-05-13 13:18:54 +02:00
Rachid Koucha
720bbb3118
Config: check for %m availability
GLIBC supports %m to avoid calling strerror(). Using it saves some code space.
==> This check will define HAVE_M_FORMAT to be used wherever possible (e.g. log.h)

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-13 13:13:18 +02:00
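
A minimal sketch of what %m saves, assuming glibc or another libc that implements the extension. The two function names are hypothetical and are not the macros from log.h.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Formats "<what>: <error text>" and lets the libc expand %m from errno. */
int format_with_m(char *buf, size_t size, const char *what, int err)
{
	errno = err; /* %m reads errno, so set it right before the call */
	return snprintf(buf, size, "%s: %m", what);
}

/* Portable equivalent: pass the error text explicitly via strerror(). */
int format_with_strerror(char *buf, size_t size, const char *what, int err)
{
	return snprintf(buf, size, "%s: %s", what, strerror(err));
}
```

Both functions should produce the same text on a libc with %m support. The %m variant needs one argument fewer at each call site, which is where the code-size saving comes from.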
Rikard Falkeborn
e1d4305384 initutils: Fix memleak on realloc failure
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
2019-05-12 03:16:39 +02:00
Rikard Falkeborn
cdcaad4868 zfs: Fix return value on zfs_snapshot error
Returning -1 in a function with return type bool is the same as
returning true. Change to return false to indicate error properly.

Detected with cppcheck.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
2019-05-12 01:55:34 +02:00
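
The conversion rule behind this fix can be shown with a short sketch. The function names and the `simulate_failure` parameter are invented for the example and do not correspond to the zfs or lvm code.

```c
#include <stdbool.h>

/* Buggy pattern: -1 is nonzero, so it converts to true. */
bool snapshot_buggy(int simulate_failure)
{
	if (simulate_failure)
		return -1; /* caller sees "success" */

	return true;
}

/* Fixed pattern: report the error as false. */
bool snapshot_fixed(int simulate_failure)
{
	if (simulate_failure)
		return false;

	return true;
}
```

A caller checking `if (!snapshot_buggy(1))` would never take the error branch, which is the behaviour cppcheck flagged.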
Rikard Falkeborn
4d927e7f42 lvm: Fix return value if lvm_create_clone fails
Returning -1 in a function with return type bool is the same as
returning true. Change to return false to indicate error properly.

Detected with cppcheck.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
2019-05-12 01:55:34 +02:00
Rikard Falkeborn
17e68c49cf criu: Remove unnecessary return after _exit()
Since _exit() will terminate, the return statement is dead code. Also,
returning -1 from a function with bool as return type is confusing.

Detected with cppcheck.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
2019-05-12 01:55:34 +02:00
Christian Brauner
ad4dddd85e
Merge pull request #2997 from rst0git/criu-v-option
criu: Use -v4 instead of -vvvvvv
2019-05-10 23:47:28 +02:00
Radostin Stoyanov
582cb4785a criu: Use -v4 instead of -vvvvvv
CRIU has only 4 levels of verbosity (errors, warnings, info, debug).
Thus, using `-v4` is more appropriate.

https://criu.org/Logging

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-05-10 22:39:15 +01:00
Christian Brauner
da161bc1a2
Merge pull request #2993 from Rachid-Koucha/patch-9
New --bbpath option and unnecessary --rootfs checks
2019-05-10 21:35:56 +02:00
Rachid Koucha
5f0fb855f8
Option --busybox-path instead of --bbpath
As suggested during the review.

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-10 21:28:35 +02:00
Christian Brauner
e269d99b02
Merge pull request #2996 from brauner/Rachid-Koucha-patch-10
lxccontainer: do not display if missing privileges
2019-05-10 21:20:20 +02:00
Rachid Koucha
9fbe07f68d
lxccontainer: do not display if missing privileges
lxc-ls without root privileges on privileged containers should not display
information. In lxc_container_new(), ongoing_create()'s result is not checked
for all possible returned values. Hence, an unprivileged user can send command
messages to the container's monitor. For example:

$ lxc-ls -P /.../tests -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
ctr  -     0         -      -    -    false
$ sudo lxc-ls -P /.../tests -f
NAME STATE   AUTOSTART GROUPS IPV4      IPV6 UNPRIVILEGED
ctr  RUNNING 0         -      10.0.3.51 -    false

After this change:

$ lxc-ls -P /.../tests -f      <-------- No more display without root privileges
$ sudo lxc-ls -P /.../tests -f
NAME STATE   AUTOSTART GROUPS IPV4      IPV6 UNPRIVILEGED
ctr  RUNNING 0         -      10.0.3.37 -    false
$

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-10 21:02:17 +02:00
Rachid Koucha
e796239406
New --bbpath option and unnecessary --rootfs checks
. Add the "--bbpath" option to pass an alternate busybox pathname instead of the one found from ${PATH}.
. Take this opportunity to add some formatting in the usage display.
. Since an attempt is made to pick rootfs from the config file and set it to ${path}/rootfs, it is unnecessary to make --rootfs mandatory.

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-10 17:01:13 +02:00
Stéphane Graber
792ea40042
Merge pull request #2992 from brauner/2019-05-10/coding_style_update
coding style: update
2019-05-10 08:36:56 -04:00
Christian Brauner
a8e63d6904
coding style: update
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-10 13:35:13 +02:00
Christian Brauner
9e19503641
Merge pull request #2985 from tomponline/tp-mtu
network: Adds mtu support for phys and macvlan types
2019-05-10 09:30:35 +02:00
Christian Brauner
70aa3c7f58
Merge pull request #2989 from Rachid-Koucha/patch-8
Redirect error messages to stderr
2019-05-10 08:48:59 +02:00
Rachid Koucha
634ad9358e
Redirect error messages to stderr
Some error messages were not redirected to stderr.
Moreover, do "exit 0" instead of "exit 1" when the "help" option is passed.

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
2019-05-10 07:39:03 +02:00
Stéphane Graber
3e860bdac0
Merge pull request #2986 from brauner/2019-05-09/clone_pidfd
start: use CLONE_PIDFD
2019-05-09 15:19:58 -04:00
Christian Brauner
33942046c5
start: use CLONE_PIDFD
Use CLONE_PIDFD when possible.

Note that the clone() syscall ignores unknown flags, which is usually a design
mistake. However, for us this bug is a feature since we can just pass the flag
along and see whether the kernel has given us a pidfd.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-05-09 19:40:23 +02:00
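
The probing idea in this commit message can be sketched as below. This is an assumption-laden illustration rather than the code from start.c: `probe_clone_pidfd` is a hypothetical name, the raw syscall is used with a NULL stack so that it behaves like fork(), and the order of the first three arguments is the one used on x86-64 and arm64.

```c
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000 /* value from the Linux UAPI headers */
#endif

/*
 * Hypothetical probe: spawn a short-lived child with CLONE_PIDFD and reap it.
 * Returns 1 if the kernel handed back a pidfd, 0 if the flag was silently
 * ignored, and -1 if the clone itself failed.
 */
int probe_clone_pidfd(void)
{
	int pidfd = -1; /* stays -1 when the kernel does not know the flag */
	pid_t pid;

	/* NULL stack: the child runs on a copy of our stack, like fork(). */
	pid = (pid_t)syscall(SYS_clone, (unsigned long)(CLONE_PIDFD | SIGCHLD),
			     (void *)NULL, &pidfd, 0UL, 0UL);
	if (pid < 0)
		return -1;

	if (pid == 0)
		_exit(0); /* child: nothing to do */

	if (waitpid(pid, NULL, 0) < 0)
		return -1;

	if (pidfd >= 0) {
		close(pidfd);
		return 1;
	}

	return 0;
}
```

Initialising `pidfd` to -1 before the call is what makes the check work: a kernel that ignores the flag leaves the variable untouched.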
Thomas Parrott
bc99910758 api: Adds the network_phys_macvlan_mtu extension
This will allow LXD to check for custom MTU support for phys and macvlan devices.

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
2019-05-09 16:55:51 +01:00
Thomas Parrott
0b15498976 network: Restores phys device MTU on container shutdown
The phys devices will now have their original MTUs recorded at start and restored at shutdown.

This is to protect the original phys device from having any container level MTU customisation being applied to the device once it is restored to the host.

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
2019-05-09 16:55:45 +01:00