When we set LXC_DEBUG_CGFSNG=1 we print out info about detected cgroup
hierarchies. When there's no named cgroup mounted we need to make sure that we
don't try to index an unallocated pointer.
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
With the criu release 2.8 criu deprecated the --veth-pair command-line
option in favor of --external:
f2037e6 veth: Make --external support --veth-pair
git tag --contains f2037e6d3445fc400
v2.8
With this commit lxc-checkpoint will automatically switch between
the new and old command-line options depending on the detected
criu version.
For criu versions older than 2.8 something like this will be used:
--veth-pair eth0=vethYOK6RW@lxcbr0
and starting with criu version 2.8 it will look like this:
--external veth[eth0]:vethCRPEYL@lxcbr0
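The version switch described above can be sketched roughly as follows. This
is only an illustration, not the code from this commit: struct criu_version,
criu_has_external_veth() and format_veth_arg() are hypothetical names, and
the sketch assumes the criu version has already been detected and parsed
into a major/minor pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical version pair; not the structure liblxc actually uses. */
struct criu_version {
	int major;
	int minor;
};

/* True if criu is 2.8 or newer, i.e. it accepts --external veth[...]. */
static bool criu_has_external_veth(const struct criu_version *v)
{
	return v->major > 2 || (v->major == 2 && v->minor >= 8);
}

/*
 * Select the option name and format its value for one veth device.
 * Returns 0 on success, -1 if the value does not fit into val.
 */
static int format_veth_arg(const struct criu_version *v, const char *ifname,
			   const char *veth, const char *bridge,
			   const char **opt, char *val, size_t len)
{
	int ret;

	assert(v && ifname && veth && bridge && opt && val);

	if (criu_has_external_veth(v)) {
		/* new syntax: --external veth[eth0]:vethXXXXXX@lxcbr0 */
		*opt = "--external";
		ret = snprintf(val, len, "veth[%s]:%s@%s", ifname, veth, bridge);
	} else {
		/* old syntax: --veth-pair eth0=vethXXXXXX@lxcbr0 */
		*opt = "--veth-pair";
		ret = snprintf(val, len, "%s=%s@%s", ifname, veth, bridge);
	}

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

A caller would append *opt and val as two consecutive entries to the criu
argument vector.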
Signed-off-by: Adrian Reber <areber@redhat.com>
CRIU supports dirty memory tracking to take incremental checkpoints.
Incremental checkpoints are one way of reducing downtime during
migration. The first checkpoint dumps all the memory pages and the
second (and third, and fourth, ...) only dumps pages which have changed.
Most of the necessary code has already been implemented. This just adds
the existing functionality to lxc-checkpoint:
-p, --pre-dump Only pre-dump the memory of the container.
Container keeps on running and following
checkpoints will only dump the changes.
--predump-dir=DIR path to images from previous dump (relative to -D)
The following is an example from a container running CentOS 7 with psql
and tomcat:
# lxc-checkpoint -n c7 -D /tmp/cp -p
Container keeps on running
# du -h /tmp/cp
229M /tmp/cp
Sync initial checkpoint to destination
# rsync -a /tmp/cp host2:/tmp/
Sync file-system
# rsync -a /var/lib/lxc/c7 host2:/var/lib/lxc/
Final dump; container is stopped
# lxc-checkpoint -n c7 -D /tmp/cp2 --predump-dir=../cp -s
# du -h /tmp/cp2
90M /tmp/cp2
After transferring the second (incremental) checkpoint and the changes
to the container's file system the container can be restored on the
second host by pointing lxc-checkpoint to the second checkpoint
directory:
# lxc-checkpoint -n c7 -D /tmp/cp2 -r
Signed-off-by: Adrian Reber <areber@redhat.com>
This package doesn't exist in stretch anymore, and it's unclear why we
were depending on a library to begin with (as opposed to having it
brought by whatever needs it).
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
In case the system was booted with
isolcpus=n_i-n_j,n_k,n_m
we cannot simply copy the cpuset.cpus file from our parent cgroup. For example,
in the root cgroup cpuset.cpus will contain all of the cpus including the
isolated cpus. Copying the values of the root cgroup into a child cgroup will
lead to a wrong view in /proc/self/status. For the root cgroup
/sys/fs/cgroup/cpuset, /proc/self/status will correctly show
Cpus_allowed_list: 0-1,3
even though cpuset.cpus will show
0-3
However, initializing a subcgroup in the cpuset controller by copying the
cpuset.cpus setting from the root cgroup will cause /proc/self/status to
incorrectly show
Cpus_allowed_list: 0-3
Hence, we need to make sure to remove the isolated cpus from cpuset.cpus. Seth
has argued that this is not a kernel bug but by design. So let us be the smart
guys and fix this in liblxc.
The solution is straightforward: To avoid having to work with raw cpulist
strings we create cpumasks based on uint32_t bit arrays.
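A minimal sketch of the bit array idea, under stated assumptions: MAX_CPUS,
cpulist_to_mask(), mask_remove() and mask_test() are hypothetical names
chosen for illustration and are not the helpers added to liblxc. The sketch
parses a cpulist such as "0-3" into a uint32_t array and then clears the
isolated cpus from it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 128 /* assumed upper bound for this sketch */
#define MASK_WORDS (MAX_CPUS / 32)

/* Set one bit per cpu named in a cpulist like "0-1,3". Returns 0 or -1. */
static int cpulist_to_mask(const char *list, uint32_t mask[MASK_WORDS])
{
	const char *p = list;

	assert(list && mask);
	memset(mask, 0, MASK_WORDS * sizeof(uint32_t));

	while (*p && *p != '\n') {
		char *end;
		unsigned long lo, hi;

		lo = strtoul(p, &end, 10);
		if (end == p)
			return -1;
		hi = lo;

		/* A '-' turns the single cpu into a range lo-hi. */
		if (*end == '-') {
			p = end + 1;
			hi = strtoul(p, &end, 10);
			if (end == p)
				return -1;
		}
		if (lo > hi || hi >= MAX_CPUS)
			return -1;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			mask[cpu / 32] |= UINT32_C(1) << (cpu % 32);

		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -1;
		p = end;
	}

	return 0;
}

/* Clear every cpu in possible that is also set in isolated. */
static void mask_remove(uint32_t possible[MASK_WORDS],
			const uint32_t isolated[MASK_WORDS])
{
	for (size_t i = 0; i < MASK_WORDS; i++)
		possible[i] &= ~isolated[i];
}

/* Return 1 if cpu is set in mask, 0 otherwise. */
static int mask_test(const uint32_t mask[MASK_WORDS], unsigned int cpu)
{
	return cpu < MAX_CPUS && ((mask[cpu / 32] >> (cpu % 32)) & 1);
}
```

With "0-3" as the parent's cpuset.cpus and "2" as the isolated cpu, the
remaining mask covers cpus 0, 1 and 3, which matches the Cpus_allowed_list
shown above. Turning the mask back into a cpulist string is left out here.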
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
lxc_append_string() appends strings without separator. This is mostly useful
for reading in whole files line-by-line.
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
This patch creates a /var/run symlink that points to /run.
This fixes various issues that occur when /var/run is persistent.
Signed-off-by: Marc Gariepy <gariepy.marc@gmail.com>
If we do it earlier we end up with a wrong view of /proc/self/cgroup. For
example, assume we unshare(CLONE_NEWCGROUP) first, and then create the cgroup
for the container, say /sys/fs/cgroup/cpuset/lxc/c, then /proc/self/cgroup
would show us:
8:cpuset:/lxc/c
whereas it should actually show
8:cpuset:/
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
This would already fail, but with a not-as-good error message. Let's make
the error better.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
On shutdown we move physical network interfaces back to the
host namespace and rename them afterwards as well as in the
later lxc_network_delete() step. However, if the device had
a name which already exists in the host namespace then the
moving fails and so do the subsequent rename attempts. When
the namespace ceases to exist the devices finally end up
in the host namespace named 'dev<ID>' by the kernel.
In order to avoid this, we do the moving and renaming in a
single step (lxc_netdev_move_by_*()'s move & rename happen
in a single netlink transaction).
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
We switched to --ext-mount-map auto because of "system" (liblxc) added
mounts like the cgmanager socket that weren't in the config file. This had
the added advantage that we could drop all the mount processing code,
because we no longer needed an --ext-mount-map argument.
The problem here is that mounts can move between hosts. While
--ext-mount-map auto does its best to detect this situation, it explicitly
disallows moves that change the path name. In LXD, we bind mount
/var/lib/lxd/shmounts/$container to /dev/.lxd-mounts for each container,
and so when a container is renamed in a migration, the name changes.
--ext-mount-map auto won't detect this, and so the migration fails.
We *could* implement mount rewriting in CRIU, but my experience with cgroup
and apparmor rewriting is that this is painful and error prone. Instead, it
is much easier to go back to explicitly listing --ext-mount-map arguments
from the config file, and allow the source of the bind to change. We leave
--ext-mount-map auto to catch any straggling (or future) system added
mounts.
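Roughly, listing the bind mounts explicitly could look like the sketch
below. This is an assumption-laden illustration, not the code from this
commit: struct bind_mount and format_ext_mount_map() are hypothetical, and
the exact key format passed to criu (mount target on dump, target:source on
restore) is assumed here. The point is that the source is only supplied on
restore, so it may differ from the one used on the dumping host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of one bind mount from the config file. */
struct bind_mount {
	const char *source; /* path on the host, may differ between hosts */
	const char *target; /* path inside the container, relative to rootfs */
};

/*
 * Format the value that follows one --ext-mount-map option.
 * On dump only the target matters, so it serves as both key and value.
 * On restore the key is mapped to the source path on the current host.
 * Returns 0 on success, -1 if the value does not fit into arg.
 */
static int format_ext_mount_map(const struct bind_mount *m, bool dumping,
				char *arg, size_t len)
{
	int ret;

	assert(m && m->source && m->target && arg);

	if (dumping)
		ret = snprintf(arg, len, "/%s:%s", m->target, m->target);
	else
		ret = snprintf(arg, len, "%s:%s", m->target, m->source);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

Since the source path is taken from the config file on the restoring host,
a renamed container simply yields a different source for the same key.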
I believe this should fix Launchpad Bug 1580765
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>