This is not a fatal error and the fallback codepath is equally safe.
When we use TIOCGPTPEER we're using a stashed fd to the container's
devpts mount's ptmx device and allocating a new fd non-path based
through this ioctl. If this ioctl can't be used we're falling back to
allocating a pts device from the host's devpts mount's ptmx device which
is path-based but is not under control of the container and so that's
safe. The difference is just that the first method gets you a nice
native terminal with all the pleasantries of having tty and friends
working whereas the latter method does not.
Fixes: #3625
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Use as much as possible from each command `--help` for completion.
Some options require a long list of completions that should be dumped by
some command option. These are not added here yet.
Examples of those are: `lxc-info --config`, `lxc-execute --define` and
`lxc-start --define`.
Signed-off-by: Edenis Freindorfer Azevedo <edenisfa@gmail.com>
By default, there is no out-of-the-box bash completion for lxc tools.
This is due to dynamic loading of completions, that requires the
completion filename to be the same as the command (e.g. `lxc-start`
expects a completion filename `lxc-start`). But all commands are in file
`lxc`, which is not read.
Signed-off-by: Edenis Freindorfer Azevedo <edenisfa@gmail.com>
If an include directive ends with a trailing slash, we now
always assume it is a directory and do not treat the
non-existence as an error.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Old versions of Docker emulate a cgroup namespace by bind-mounting the
container's cgroup over the corresponding controller:
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime master:11 - cgroup cgroup rw,xattr,name=systemd
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime master:15 - cgroup cgroup rw,net_cls,net_prio
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime master:16 - cgroup cgroup rw,cpu,cpuacct
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime master:17 - cgroup cgroup rw,memory
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime master:18 - cgroup cgroup rw,devices
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime master:19 - cgroup cgroup rw,hugetlb
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime master:20 - cgroup cgroup rw,perf_event
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime master:21 - cgroup cgroup rw,cpuset
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime master:22 - cgroup cgroup rw,blkio
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime master:23 - cgroup cgroup rw,pids
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime master:24 - cgroup cgroup rw,freezer
New versions of LXC always stash a file descriptor for the root of the
cgroup mount at /sys/fs/cgroup and then resolve the current cgroup
parsed from /proc/{1,self}/cgroup relative to that file descriptor. This
doesn't work when the caller's cgroup is mouned over the controllers.
Older versions of LXC simply counted such layouts as having no cgroups
available for delegation at all and moved on provided no cgroup limits
were requested. But mainline LXC would fail such layouts. While I would
argue that failing such layouts is the semantically clean approach we
shouldn't regress users so make mainline LXC treat such cgroup layouts
as having no cgroups available for delegation.
Fixes: #3890
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Some tools require /sys/devices/virtual/net to be read-write. At the
same time we want all other parts of /sys to be read-only. To do this we
created a layout where we hade a read-only instance of sysfs mounted on
top of a read-write instance of sysfs:
`-/sys sysfs sysfs rw,nosuid,nodev,noexec,relatime
`-/sys sysfs sysfs ro,nosuid,nodev,noexec,relatime
|-/sys/devices/virtual/net sysfs sysfs rw,relatime
| `-/sys/devices/virtual/net sysfs[/devices/virtual/net] sysfs rw,nosuid,nodev,noexec,relatime
This causes issues for systemd services that create a separate mount
namespace as they get confused to what mount options need to be
respected.
Simplify our mounting logic so we end up with a single read-only mount
of sysfs on /sys and a read-write bind-mount of /sys/devices/virtual/net:
├─/sys sysfs sysfs ro,nosuid,nodev,noexec,relatime
│ ├─/sys/devices/virtual/net sysfs[/devices/virtual/net] sysfs rw,nosuid,nodev,noexec,relatime
Link: systemd/systemd#20032
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
lxc_container_init() creates the container payload process as it's child
so lxc_container_init() itself never really exits and thus the parent
isn't notified about the child exec'ing since the sync file descriptor
is never closed. Make sure it's closed to notify the parent about the
child's exec.
In addition we're currently leaking all file descriptors associated with
the handler into the stub init. Make sure that all file descriptors
other than stderr are closed.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Instead of having a statically linked init that we put on the host fs
somewhere via packaging, have to either bind mount in or detect fexecve()
functionality, let's just call it as a library function. This way we don't
have to do any of that.
This also fixes up a bunch of conditions from:
if (quiet)
fprintf(stderr, "log message");
to
if (!quiet)
fprintf(stderr, "log message");
:)
and it drops all the code for fexecve() detection and bind mounting our
init in, since we no longer need any of that.
A couple other thoughts:
* I left the lxc-init binary in since we ship it, so someone could be using
it outside of the internal uses.
* There are lots of unused arguments to lxc-init (including presumably
--quiet, since nobody noticed the above); those may be part of the API
though and so we don't want to drop them.
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
and the item is copied (strdup()) to the array.
Thus, when an item is removed from an array, memory allocated for that item
should be freed, successive items should be left-shifted and the array
realloc()ed again (size-1).
Additional changes:
- If strdup() fails in add_to_array(), then an array should be
realloc()ed again to original size.
- Initialize an array in list_all_containers().
Signed-off-by: Tomasz Blaszczak <tomasz.blaszczak@consult.red>