Commit Graph

5245 Commits

Author SHA1 Message Date
Christian Brauner
64d2fcb5cf
conf: use lxc_preserve_ns()
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-19 05:11:23 +01:00
Christian Brauner
738d0deb13
start: add netnsfd to lxc_handler
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-19 05:11:17 +01:00
Christian Brauner
a687256f1d
utils: add lxc_preserve_ns()
This allows retrieving a file descriptor that refers to a namespace.

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-19 05:11:12 +01:00
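A namespace can be kept alive by holding an open file descriptor to its entry under /proc/<pid>/ns/. The following is a minimal Python sketch of that idea, not the liblxc implementation; the function names `ns_path` and `preserve_ns` are hypothetical and chosen only for illustration.

```python
import os


def ns_path(pid: int, ns: str) -> str:
    """Return the procfs path of one namespace (e.g. "net") of a process."""
    return f"/proc/{pid}/ns/{ns}"


def preserve_ns(pid: int, ns: str) -> int:
    """Open the namespace file read-only and return the file descriptor.

    As long as the descriptor stays open, the namespace it refers to
    is not destroyed, even if the process exits. Raises OSError if the
    namespace file cannot be opened.
    """
    return os.open(ns_path(pid, ns), os.O_RDONLY | os.O_CLOEXEC)
```

On a Linux host, `preserve_ns(os.getpid(), "net")` would return a descriptor for the caller's own network namespace; the caller is responsible for closing it with `os.close()`.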
Stéphane Graber
122aaf5094 Merge pull request #1305 from brauner/2016-11-16/cgfsng_debug
cgroups: prevent segfault in cgfsng
2016-11-17 09:48:06 -07:00
Christian Brauner
a7b0cc4c91
cgroups: prevent segfault in cgfsng
When we set LXC_DEBUG_CGFSNG=1 we print out info about detected cgroup
hierarchies. When there's no named cgroup mounted we need to make sure that we
don't try to index an unallocated pointer.

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-17 16:32:28 +01:00
Christian Brauner
67c933b6f8 Merge pull request #1303 from adrianreber/master
lxc-checkpoint: automatically detect if --external or --veth-pair
2016-11-16 21:00:19 -05:00
Adrian Reber
46c8ffd5e5 lxc-checkpoint: automatically detect if --external or --veth-pair
With release 2.8, criu deprecated the --veth-pair command-line
option in favor of --external:

f2037e6 veth: Make --external support --veth-pair

git tag --contains f2037e6d3445fc400
v2.8

With this commit, lxc-checkpoint automatically switches between
the new and old command-line options depending on the detected
criu version.

For criu versions older than 2.8, something like this will be used:

  --veth-pair eth0=vethYOK6RW@lxcbr0

and starting with criu version 2.8 it will look like this:

  --external veth[eth0]:vethCRPEYL@lxcbr0

Signed-off-by: Adrian Reber <areber@redhat.com>
2016-11-16 07:31:34 +00:00
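The version switch described in this commit message can be sketched as follows. This is an illustrative Python sketch, not the code from lxc-checkpoint; `parse_version` and `veth_args` are hypothetical names, and the argument formats are copied from the two examples in the message.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as "2.8" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def veth_args(criu_version: str, ifname: str, veth: str, bridge: str) -> list:
    """Build the criu arguments that describe one veth pair.

    criu 2.8 and newer get --external, older releases get --veth-pair.
    """
    if parse_version(criu_version) >= (2, 8):
        return ["--external", f"veth[{ifname}]:{veth}@{bridge}"]
    return ["--veth-pair", f"{ifname}={veth}@{bridge}"]
```

Comparing tuples instead of strings avoids ordering mistakes such as treating "2.10" as older than "2.8".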
Stéphane Graber
471a304df4 Merge pull request #1301 from brauner/2016-11-15/isolcpus
cgroups: use %zu format specifier to print size_t
2016-11-15 09:03:21 -07:00
Stéphane Graber
a8bae5522a Merge pull request #1299 from adrianreber/master
lxc-checkpoint: enable dirty memory tracking in criu
2016-11-15 08:56:55 -07:00
Adrian Reber
9f99a33fa9 lxc-checkpoint: enable dirty memory tracking in criu
CRIU supports dirty memory tracking to take incremental checkpoints.
Incremental checkpoints are one way of reducing downtime during
migration. The first checkpoint dumps all the memory pages and the
second (and third, and fourth, ...) only dumps pages which have changed.

Most of the necessary code has already been implemented. This just adds
the existing functionality to lxc-checkpoint:

  -p, --pre-dump            Only pre-dump the memory of the container.
                            Container keeps on running and following
                            checkpoints will only dump the changes.
  --predump-dir=DIR         path to images from previous dump (relative to -D)

The following is an example from a container running CentOS 7 with psql
and tomcat:

 # lxc-checkpoint -n c7 -D /tmp/cp -p
Container keeps on running
 # du -h /tmp/cp
 229M	/tmp/cp
Sync initial checkpoint to destination
 # rsync -a /tmp/cp host2:/tmp/
Sync file-system
 # rsync -a /var/lib/lxc/c7 host2:/var/lib/lxc/
Final dump; container is stopped
 # lxc-checkpoint -n c7 -D /tmp/cp2 --predump-dir=../cp -s
 # du -h /tmp/cp2
 90M	/tmp/cp2

After transferring the second (incremental) checkpoint and the changes
to the container's file system, the container can be restored on the
second host by pointing lxc-checkpoint to the second checkpoint
directory:

 # lxc-checkpoint -n c7 -D /tmp/cp2 -r

Signed-off-by: Adrian Reber <areber@redhat.com>
2016-11-15 14:10:03 +00:00
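The pre-dump, final dump, and restore steps from this commit message can be assembled programmatically. The sketch below only builds argument lists from the options quoted in the message (-n, -D, -p, --predump-dir, -s, -r); the helper names are hypothetical, and nothing here runs lxc-checkpoint.

```python
import os


def predump_cmd(name: str, directory: str) -> list:
    """Pre-dump memory into `directory`; the container keeps running."""
    return ["lxc-checkpoint", "-n", name, "-D", directory, "-p"]


def final_dump_cmd(name: str, directory: str, predump_directory: str) -> list:
    """Final dump into `directory`, stopping the container.

    --predump-dir is given relative to the -D directory, so the path of
    the earlier pre-dump is converted to a relative one.
    """
    relative = os.path.relpath(predump_directory, start=directory)
    return ["lxc-checkpoint", "-n", name, "-D", directory,
            f"--predump-dir={relative}", "-s"]


def restore_cmd(name: str, directory: str) -> list:
    """Restore the container from the final checkpoint directory."""
    return ["lxc-checkpoint", "-n", name, "-D", directory, "-r"]
```

Each list could be passed to `subprocess.run()` on a host where lxc-checkpoint and criu are installed.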
Christian Brauner
657f890799
cgroups: use %zu format specifier to print size_t
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-15 06:19:55 +01:00
Serge Hallyn
748c52b52c Merge pull request #1282 from brauner/2016-11-03/isolcpus
cgroups: remove isolated cpus from cpuset.cpus  …
2016-11-14 13:53:56 -06:00
Serge Hallyn
5b40ec9292 Merge pull request #1300 from stgraber/master
debian: Don't depend on libui-dialog-perl
2016-11-14 11:17:52 -06:00
Stéphane Graber
4fd968818c debian: Don't depend on libui-dialog-perl
This package doesn't exist in stretch anymore, and it's unclear why we
were depending on a library to begin with (as opposed to having it
brought by whatever needs it).

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2016-11-14 11:53:07 -05:00
Serge Hallyn
a3524e9147 Merge pull request #1297 from brauner/2016-11-13/fix_tmpfile_errno
conf: do not use %m format specifier
2016-11-14 00:33:40 -06:00
Christian Brauner
9e4e7b0dad
conf: do not use %m format specifier
This is a GNU extension and some libcs might be missing it.

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-13 17:10:41 +01:00
Christian Brauner
d3c57812b5 Merge pull request #1293 from evgeni/always-stop-lxc-net
also stop lxc-net in runlevels 0 and 6
2016-11-12 11:13:25 -05:00
Christian Brauner
0379b4ffbd Merge pull request #1294 from evgeni/ignore-lxc.egg-info
add lxc.egg-info to gitignore
2016-11-12 11:13:18 -05:00
Christian Brauner
d06df88abb Merge pull request #1295 from evgeni/bash-completion-pkg-config
install bash completion where pkg-config tells us to
2016-11-12 11:13:10 -05:00
Evgeni Golov
23f4c8a01a install bash completion where pkg-config tells us to
Signed-off-by: Evgeni Golov <evgeni@debian.org>
2016-11-12 14:57:34 +01:00
Evgeni Golov
8467eee707 add lxc.egg-info to gitignore
Signed-off-by: Evgeni Golov <evgeni@debian.org>
2016-11-12 14:47:33 +01:00
Evgeni Golov
79c07e4b11 also stop lxc-net in runlevels 0 and 6
there is no reason to not do this :)

Signed-off-by: Evgeni Golov <evgeni@debian.org>
2016-11-12 12:29:26 +01:00
Serge Hallyn
f3d7477c37 Merge pull request #1290 from brauner/2016-11-09/named_controllers
cgroups: skip v2 hierarchy entry
2016-11-10 20:40:23 -06:00
Christian Brauner
ff8d6ee936
cgroups: skip v2 hierarchy entry
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-11 00:31:04 +01:00
Christian Brauner
bedea59739 Merge pull request #1289 from Cypresslin/ubuntu-cloud-squashfs
templates: add squashfs support to lxc-ubuntu-cloud.in
2016-11-10 09:29:23 -05:00
Po-Hsu Lin
5d58fc90a6 templates: add squashfs support to lxc-ubuntu-cloud.in
Add squashfs format file support for lxc-ubuntu-cloud.in

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
2016-11-10 16:48:29 +08:00
Christian Brauner
7a8082f47b Merge pull request #1288 from Cypresslin/known-release-zesty
Update Ubuntu release name: add zesty
2016-11-10 00:22:05 -05:00
Po-Hsu Lin
0815a59287 Update Ubuntu release name: add zesty and remove wily
Add zesty to KNOWN_RELEASES
Remove EOL wily from KNOWN_RELEASES

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
2016-11-10 11:06:09 +08:00
Christian Brauner
a54694f86d
cgroups: remove isolated cpus from cpuset.cpus
In case the system was booted with

    isolcpus=n_i-n_j,n_k,n_m

we cannot simply copy the cpuset.cpus file from our parent cgroup. For example,
in the root cgroup, cpuset.cpus will contain all of the cpus, including the
isolated cpus. Copying the values of the root cgroup into a child cgroup will
lead to a wrong view in /proc/self/status. For the root cgroup
/sys/fs/cgroup/cpuset, /proc/self/status will correctly show

    Cpus_allowed_list:      0-1,3

even though cpuset.cpus will show

    0-3

However, initializing a subcgroup in the cpuset controller by copying the
cpuset.cpus setting from the root cgroup will cause /proc/self/status to
incorrectly show

    Cpus_allowed_list:      0-3

Hence, we need to make sure to remove the isolated cpus from cpuset.cpus. Seth
has argued that this is not a kernel bug but by design. So let us be the smart
guys and fix this in liblxc.

The solution is straightforward: to avoid having to work with raw cpulist
strings, we create cpumasks based on uint32_t bit arrays.

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-09 19:28:02 +01:00
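The idea of converting cpulist strings to bitmasks, clearing the isolated cpus, and converting back can be sketched as follows. This is a Python illustration, not the liblxc code: an arbitrary-precision integer stands in for the uint32_t bit arrays mentioned above, and all function names are hypothetical.

```python
def parse_cpulist(text: str) -> int:
    """Convert a cpulist such as "0-1,3" into a bitmask (bit n = cpu n)."""
    mask = 0
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input such as ""
        low, _, high = part.partition("-")
        first = int(low)
        last = int(high) if high else first
        for cpu in range(first, last + 1):
            mask |= 1 << cpu
    return mask


def format_cpulist(mask: int) -> str:
    """Convert a bitmask back into the compact cpulist notation."""
    ranges = []
    cpu = 0
    while mask >> cpu:
        if (mask >> cpu) & 1:
            start = cpu
            # Extend the range while the next cpu is also set.
            while (mask >> (cpu + 1)) & 1:
                cpu += 1
            ranges.append(str(start) if start == cpu else f"{start}-{cpu}")
        cpu += 1
    return ",".join(ranges)


def remove_isolated(cpus: str, isolated: str) -> str:
    """Return `cpus` with every cpu listed in `isolated` removed."""
    return format_cpulist(parse_cpulist(cpus) & ~parse_cpulist(isolated))
```

With the values from the message, a root cpuset.cpus of "0-3" and an isolated cpu 2 yield "0-1,3", matching the Cpus_allowed_list shown for the root cgroup.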
Christian Brauner
000dfda7f3
utils: add lxc_append_string()
lxc_append_string() appends strings without separator. This is mostly useful
for reading in whole files line-by-line.

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-09 19:27:58 +01:00
Stéphane Graber
5e8b774630 Merge pull request #1286 from mgariepy/patch-1
create symlink for /var/run
2016-11-09 05:18:11 -07:00
mgariepy
1c5a3c5854 create symlink for /var/run
This patch creates a /var/run symlink that points to /run.

This fixes various issues that occur when /var/run is persistent.

Signed-off-by: Marc Gariepy <gariepy.marc@gmail.com>
2016-11-08 12:19:42 -05:00
Serge Hallyn
f79750ace9 Merge pull request #1262 from brauner/2016-10-29/lxc_free_cgroup_sigsegv
cgfs: various fixes
2016-11-07 10:09:06 -07:00
Stéphane Graber
f5795427b1 Merge pull request #1275 from brauner/2016-11-04/unshare_cgroup_after_clone
start: CLONE_NEWCGROUP after we have setup cgroups
2016-11-03 15:27:37 -06:00
Christian Brauner
deefdf8a79
start: CLONE_NEWCGROUP after we have setup cgroups
If we do it earlier we end up with a wrong view of /proc/self/cgroup. For
example, assume we unshare(CLONE_NEWCGROUP) first, and then create the cgroup
for the container, say /sys/fs/cgroup/cpuset/lxc/c, then /proc/self/cgroup
would show us:

     8:cpuset:/lxc/c

whereas it should actually show

     8:cpuset:/

Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-11-03 21:41:46 +01:00
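The two views of /proc/self/cgroup quoted in this message use the format hierarchy-ID:controller-list:path. A small Python sketch for reading that format is shown below; the function names are hypothetical and the code is only meant to make the "/lxc/c" versus "/" difference easy to check.

```python
def parse_cgroup_line(line: str) -> tuple:
    """Split one /proc/<pid>/cgroup line into (id, controllers, path)."""
    hierarchy_id, controllers, path = line.strip().split(":", 2)
    controller_list = controllers.split(",") if controllers else []
    return int(hierarchy_id), controller_list, path


def cgroup_path(text: str, controller: str):
    """Return the cgroup path for `controller`, or None if it is not listed."""
    for line in text.splitlines():
        if not line.strip():
            continue
        _, controllers, path = parse_cgroup_line(line)
        if controller in controllers:
            return path
    return None
```

Inside a container that was placed in its cgroup before unsharing the cgroup namespace, `cgroup_path(open("/proc/self/cgroup").read(), "cpuset")` would be expected to return "/".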
Christian Brauner
8813bb24f8 Merge pull request #1274 from tych0/check-state-before-checkpoint
c/r: check state before doing a checkpoint/restore
2016-11-03 14:38:42 -06:00
Tycho Andersen
7ad13c9123 c/r: check state before doing a checkpoint/restore
This would already fail, but with a not-as-good error message. Let's make
the error better.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-11-03 12:01:09 -06:00
Christian Brauner
293eeeac72 Merge pull request #1273 from Blub/trivial/bin-bash-consistency
cleanup: /usr/bin/bash vs /bin/bash consistency
2016-11-03 06:54:06 -06:00
Wolfgang Bumiller
c5ec44f289 cleanup: /usr/bin/bash vs /bin/bash consistency
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2016-11-03 09:07:05 +01:00
Christian Brauner
b4b43e9e32 Merge pull request #1058 from hallyn/2016-06-24/eric.cgns
container start: clone newcgroup immediately
2016-11-02 19:56:28 -06:00
Christian Brauner
0fa988d1ac Merge pull request #1269 from Blub/phynet-rename-2
conf: merge network namespace move & rename on shutdown
2016-11-02 14:05:33 -06:00
Christian Brauner
59f1c5ca63 Merge pull request #1270 from tych0/save-dump-state-too
c/r: save dump stdout too
2016-11-02 12:05:15 -06:00
Tycho Andersen
2735dfae4c c/r: fix off-by-one error
When we read sizeof(buf) bytes here, we'd write off the end of the array,
which is bad :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-11-02 15:59:00 +00:00
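The usual shape of this kind of off-by-one is reading as many bytes as the buffer holds and then writing a terminator one position past the end. The sketch below assumes that shape (the message does not spell it out) and shows the common fix of reading at most size - 1 bytes; `read_terminated` is a hypothetical name. Python would raise IndexError rather than corrupt memory, so this only illustrates the indexing.

```python
import io


def read_terminated(stream, size: int):
    """Read at most size - 1 bytes into a buffer and NUL-terminate it.

    Reading `size` bytes instead would leave no room for the terminator:
    buf[n] with n == size is one past the last valid index.
    """
    buf = bytearray(size)
    count = stream.readinto(memoryview(buf)[: size - 1])
    buf[count] = 0  # always within bounds because count <= size - 1
    return buf, count
```

For example, `read_terminated(io.BytesIO(b"x" * 100), 16)` fills 15 bytes and keeps the last slot for the terminator.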
Tycho Andersen
9f1f54b0c5 c/r: remove extra \ns from logs
The macros put a \n in for us, so let's not put another one in.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-11-02 15:10:13 +00:00
Tycho Andersen
5af85cb144 c/r: save criu's stdout during dump too
This also allows us to commonize some bits of the dup2 code.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-11-02 15:04:56 +00:00
Wolfgang Bumiller
5610055a11 conf: merge network namespace move & rename on shutdown
On shutdown we move physical network interfaces back to the
host namespace and rename them afterwards as well as in the
later lxc_network_delete() step. However, if the device had
a name which already exists in the host namespace then the
moving fails and so do the subsequent rename attempts. When
the namespace ceases to exist the devices finally end up
in the host namespace named 'dev<ID>' by the kernel.

In order to avoid this, we do the moving and renaming in a
single step (lxc_netdev_move_by_*()'s move & rename happen
in a single netlink transaction).

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2016-11-02 14:59:44 +01:00
Stéphane Graber
52e129450e Merge pull request #1266 from tych0/do-mount-rewriting
Do mount rewriting
2016-10-31 17:34:57 -04:00
Tycho Andersen
ed408e6674 log: bump LXC_LOG_BUFFER_SIZE to 4096
We need to log longer lines due to CRIU arguments.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-10-31 20:45:50 +00:00
Tycho Andersen
5f4e44a22d c/r: explicitly emit bind mounts as criu arguments
We switched to --ext-mount-map auto because of "system" (liblxc) added
mounts like the cgmanager socket that weren't in the config file. This had
the added advantage that we could drop all the mount processing code,
because we no longer needed an --ext-mount-map argument.

The problem here is that mounts can move between hosts. While
--ext-mount-map auto does its best to detect this situation, it explicitly
disallows moves that change the path name. In LXD, we bind mount
/var/lib/lxd/shmounts/$container to /dev/.lxd-mounts for each container,
and so when a container is renamed in a migration, the name changes.
--ext-mount-map auto won't detect this, and so the migration fails.

We *could* implement mount rewriting in CRIU, but my experience with cgroup
and apparmor rewriting is that this is painful and error prone. Instead, it
is much easier to go back to explicitly listing --ext-mount-map arguments
from the config file, and allow the source of the bind to change. We leave
--ext-mount-map auto to catch any stragling (or future) system added
mounts.

I believe this should fix Launchpad Bug 1580765

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-10-31 20:45:50 +00:00
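The approach described here, explicit --ext-mount-map entries for configured bind mounts plus --ext-mount-map auto as a catch-all, can be sketched as below. CRIU's option takes a KEY:VAL pair; how liblxc derives the key and value from the container config is an assumption in this sketch, and `ext_mount_map_args` is a hypothetical helper.

```python
def ext_mount_map_args(bind_mounts) -> list:
    """Build criu arguments for a list of (target, source) bind mounts.

    `target` is the mount point inside the container and `source` is the
    host path to bind there. The pairing used for KEY:VAL is an assumption
    made for illustration. "auto" stays in place so that system-added
    mounts that are not in the config are still handled.
    """
    args = ["--ext-mount-map", "auto"]
    for target, source in bind_mounts:
        args += ["--ext-mount-map", f"{target}:{source}"]
    return args
```

Because the source path is passed explicitly, a renamed container can supply its new /var/lib/lxd/shmounts/ directory on restore while the target /dev/.lxd-mounts stays the same.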
Stéphane Graber
a99e57fe9b Merge pull request #1264 from brauner/2016-10-30/fix_lxc_stop_exit_code
tools: use correct exit code for lxc-stop
2016-10-30 14:26:54 -04:00