57 Commits

Author SHA1 Message Date
Christian Brauner
c56a9652d7 tools: lxc_deslashify() handle special cases
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
2016-09-26 19:41:34 +02:00
Tycho Andersen
a7fb6043b9 c/r: detach from controlling tty on restore
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-09-21 21:46:20 +00:00
Tycho Andersen
09e80d0cc4 c/r: check that cgroup_num_hierarchies > 0
Otherwise in the error case, we end up subtracting two from the
static_args, which would lead to a segfault :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-09-16 20:26:31 -06:00
Tycho Andersen
0ab5703fcf c/r: pass --cgroup-root on checkpoint
CRIU has added support for passing --cgroup-root on dump, which we should
use (see the criu commit 07d259f365f224b32914de26ea0fd59fc6db0001 for
details). Note that we don't have to do any version checking or anything,
because CRIU just ignored --cgroup-root on checkpoint before, so passing it
is safe, and will result in correct behavior when a sufficient version of
CRIU is present.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-09-16 15:19:07 -06:00
Tycho Andersen
5f178bc983 c/r: fix typo in comment
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-09-16 15:17:03 -06:00
Tycho Andersen
f3886023c1 c/r: write status only after trying to parse the pid
Previously, we wrote a "success" status before trying to parse the pid.
This meant that we wouldn't notice the case where the restore succeeded but
parsing the pid failed, which was a little strange.

We still don't know the child pid, so we will end up with a restored
process tree and a running container, but at least in this case the API
will return false indicating that something failed.

We could kill(-1, 9) in this case, but since liblxc runs as root sometimes
(e.g. LXD), that would be a Very Bad Thing.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-08-26 16:29:45 -04:00
Tycho Andersen
1f56665557 remove extra 'ret'
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-08-26 16:13:06 -04:00
Stéphane Graber
3eba9b495e c/r: Fix pid_t on some arches
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2016-08-26 15:43:48 -04:00
Tycho Andersen
75d219f0cc c/r: use /proc/self/tid/children instead of pidfile
All we really needed a unique temp file for was passing the pid. Since CRIU
opened this with O_EXCL | O_CREAT, this was "safe" (users could still
overwrite it afterwards, but the monitor would immediately die since the
only valid number in there was the init process).

In any case, we can just read /proc/self/tid/children, which lists the
child process.

Closes #1150

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-08-26 16:26:50 +00:00
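
Editorial sketch for 75d219f0cc: on Linux the per-thread file is exposed as /proc/<pid>/task/<tid>/children (available only when the kernel is built with CONFIG_PROC_CHILDREN) and holds a space-separated list of pids. The helper below is a minimal illustration of validating such a list, assuming exactly one child is expected; `parse_single_child` is a hypothetical name and is not taken from liblxc.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not liblxc code): parse the contents of a
 * "children" file and succeed only if it holds exactly one valid pid.
 * Returns 0 on success; on failure returns -1 and leaves *child untouched.
 */
static int parse_single_child(const char *buf, pid_t *child)
{
    char *end;
    long val;

    assert(buf != NULL && child != NULL);

    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno != 0 || end == buf || val <= 0 || val > INT_MAX)
        return -1;

    /* Skip trailing whitespace; anything else means a second entry. */
    while (*end == ' ' || *end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *child = (pid_t)val;
    return 0;
}
```

Rejecting empty or multi-entry input keeps the caller from treating an unexpected process as the restored init.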
Christian Brauner
d8e4899290 bdev: add subdirectories to search path
This allows us to avoid using relative includes which is cleaner in the long
run when we create subdirectories for other components of liblxc.

Signed-off-by: Christian Brauner <cbrauner@suse.de>
2016-07-31 12:47:58 +02:00
Christian Brauner
9b1e2e6e2c criu: replace tmpnam() with mkstemp()
Signed-off-by: Christian Brauner <cbrauner@suse.de>
2016-07-29 00:53:53 +02:00
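
Editorial sketch for 9b1e2e6e2c: tmpnam() only returns a name, which leaves a window in which another process can create the file first, whereas mkstemp() creates and opens the file in one step. The example below shows the general pattern; the template path and the function name `make_temp_file` are assumptions made for illustration, not what criu.c uses.

```c
/* Ask for POSIX declarations (mkstemp) even under strict -std modes. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a unique temporary file and return an open fd, or -1 on error.
 * `path` receives the final file name and must be large enough for the
 * (assumed) template below.
 */
static int make_temp_file(char *path, size_t len)
{
    int ret;

    assert(path != NULL);

    ret = snprintf(path, len, "/tmp/example_XXXXXX");
    if (ret < 0 || (size_t)ret >= len)
        return -1;

    /* mkstemp() replaces the XXXXXX suffix in place and opens the file. */
    return mkstemp(path);
}
```

The caller owns both the descriptor and the file, so it should close() the fd and unlink() the path when done.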
Christian Brauner
9b945f1320 c/r: use PRIu64 format specifier
Fixes build failures on arm:

criu.c: In function ‘exec_criu’:
criu.c:310:4: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Werror=format=]
    ret = sprintf(ghost_limit, "%lu", opts->user->ghost_limit);
    ^
In file included from criu.c:42:0:
log.h:285:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Werror=format=]
  struct lxc_log_locinfo locinfo = LXC_LOG_LOCINFO_INIT;  \
         ^
criu.c:312:5: note: in expansion of macro ‘ERROR’
     ERROR("failed to print ghost limit %lu", opts->user->ghost_limit);
     ^

Signed-off-by: Christian Brauner <cbrauner@suse.de>
2016-07-22 11:16:43 +02:00
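
Editorial sketch for 9b945f1320: "%lu" matches uint64_t only on platforms where long is 64 bits wide, which is why the build failed on arm. The PRIu64 macro from <inttypes.h> expands to the correct conversion everywhere. `format_u64` is a hypothetical helper used only to show the pattern.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a uint64_t portably into buf.
 * Returns 0 on success, -1 if the value did not fit.
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    int ret;

    assert(buf != NULL);

    /* PRIu64 is a string literal, so it concatenates with "%". */
    ret = snprintf(buf, len, "%" PRIu64, value);
    if (ret < 0 || (size_t)ret >= len)
        return -1;
    return 0;
}
```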
Tycho Andersen
b2b7b0d223 c/r: add support for ghost-limit in CRIU
This is an old option that we probably should have exposed long ago :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-07-19 14:27:26 -06:00
Adrian Reber
c33b0338fa c/r: make local function static
This is a minimal commit which makes the function 'do_restore()' static
as it is not used anywhere else in the code. This also removes a
trailing space my editor complained about.

Signed-off-by: Adrian Reber <areber@redhat.com>
2016-07-15 10:54:30 +02:00
Adrian Reber
f195450384 c/r: drop in-flight connections during CRIU dump
Shortly after CRIU 2.3 was released, a patch was added to skip in-flight
TCP connections. In-flight connections are connections that are not yet
fully established (SYN, SYN-ACK). Skipping in-flight TCP connections
means that the client has to re-initiate the connection establishment.

This patch stores the CRIU version detected during version check, so
that during dump/checkpoint options can be dynamically enabled depending
on the available CRIU version.

v2:
   * use the newly introduced criu version interface
   * add an option to disable skipping in-flight connections

Signed-off-by: Adrian Reber <areber@redhat.com>
2016-07-12 14:09:17 +02:00
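
Editorial sketch for f195450384: once the detected CRIU version is stored, an option can be enabled only when the installed CRIU is new enough. The helper below compares parsed numbers rather than raw strings; `version_at_least` is a hypothetical name, and any threshold passed to it is an assumption rather than the value liblxc checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: report whether a version string such as "2.4" is
 * at least major.minor. Parsing the numbers avoids the pitfalls of
 * lexical comparison, where "2.10" would sort before "2.4".
 */
static bool version_at_least(const char *version, int major, int minor)
{
    int maj, min;

    assert(version != NULL);

    if (sscanf(version, "%d.%d", &maj, &min) != 2)
        return false;

    return maj > major || (maj == major && min >= minor);
}
```

A caller would append the in-flight option to the CRIU argument list only when this check passes and the user has not disabled it.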
Serge Hallyn
c80de904c9 Merge pull request #1073 from brauner/bugfix_branch
store criu version
2016-07-08 08:16:39 -05:00
Tycho Andersen
b9ee6643cb c/r: add support for CRIU's --action-script
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-07-06 23:45:15 +00:00
Christian Brauner
5407e2abae store criu version
- If version != NULL criu_version_ok() stores the detected criu version in
  version. Allocates memory for version which must be freed by caller.
- If version == NULL criu_version_ok() will return true when the version
  matches, false in all other cases.

Signed-off-by: Christian Brauner <cbrauner@suse.de>
2016-07-06 16:07:34 +02:00
Tycho Andersen
0a5fc6dfa7 c/r: use criu's "full" mode for cgroups
A while ago cgroup modes were introduced to CRIU, which slightly changed
the behavior w.r.t. cgroups under the hood. What we're really after is
criu's --full mode, i.e. even if a particular cgroup directory exists
(in particular /lxc/$container[-$number] will, since we create it), we
should restore perms on that cgroup.

Things worked just fine for actual properties (except "special" properties
as criu refers to them, which I've just sent a patch for) because liblxc
creates no subdirectories, just the TLD.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-06-27 22:24:09 +00:00
Tycho Andersen
19d1509c39 c/r: add an option to use faster inotify support in CRIU
The idea here is that criu can use open_by_handle on a configuration which
will preserve inodes on moves across hosts, but shouldn't do that on
configurations which won't preserve inodes. Before, we forced it to always
be slow, but we don't have to do this.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-05-11 13:59:48 +00:00
Tycho Andersen
b2c3710f74 c/r: rearrange things to pass struct migrate_opts all the way down
If we don't do this, we'll end up changing the function signatures for the
internal __criu_* functions each time we add a new parameter, which will
get very annoying very quickly. Since we already have the user's arguments
struct, let's just pass that all the way down.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-05-10 16:26:20 -06:00
Niklas Eiling
74eb576cef fixed indentation and comments
Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
2016-03-31 20:09:42 +02:00
Niklas Eiling
4c0c0319a5 c/r: support for the criu pageserver
This enables lxc to perform "disk-less migrations", where memory pages are
sent directly to the destination machine instead of being written to the
source's filesystem first.
For this, the strings "pageserver_address" and "pageserver_port" have been
added to the migrate_opts struct so that criu can be told where to look for
a pageserver.

Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
2016-03-31 12:14:50 +02:00
Niklas Eiling
72a30576da use snprintf instead of strncat
Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
2016-03-30 23:34:37 +02:00
Niklas Eiling
a17fa3c081 fix possible buffer overflow
strncat only returns its first argument and not the end of the written string.
Thus "buf-pos" is always 0 and consequently no range check is performed.

Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
2016-03-30 20:10:21 +02:00
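
Editorial sketch for a17fa3c081 and 72a30576da: snprintf() returns the number of characters it wanted to write, so the write offset can be advanced and range-checked, which strncat()'s return value does not allow. `append_str` is a hypothetical helper showing the pattern, not the code from criu.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Append src to buf at offset *pos without writing past buf[len - 1].
 * Returns 0 on success and advances *pos; returns -1 if src would not
 * fit, leaving *pos unchanged.
 */
static int append_str(char *buf, size_t len, size_t *pos, const char *src)
{
    int ret;

    assert(buf != NULL && pos != NULL && src != NULL);

    if (*pos >= len)
        return -1;

    ret = snprintf(buf + *pos, len - *pos, "%s", src);
    /* ret >= remaining space means the output was truncated. */
    if (ret < 0 || (size_t)ret >= len - *pos)
        return -1;

    *pos += (size_t)ret;
    return 0;
}
```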
Tycho Andersen
b7088add70 c/r: rename restore & friends to __criu_restore
Hopefully this will avoid name collisions with any user binaries, since
criu is just an implementation detail.

Closes #907

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-22 09:26:55 -06:00
Tycho Andersen
97e4f1a91f c/r: don't fail if there is no console_fd on restore
If we set lxc.console=none, this fd won't exist, so let's not fail if it
doesn't. We already partially handled this case correctly, so let's
actually handle it correctly :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-21 16:56:03 -06:00
Tycho Andersen
36d2096cf4 c/r: don't pass --ext-mount-map flag when console=none
We don't pass anything on the restore side since we didn't save anything,
but the restore side will expect something if we pass this. Instead, let's
not pass anything.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-21 16:50:39 -06:00
Tycho Andersen
3d9a5c85fd c/r: print criu's stdout when it fails
In particular, when CRIU fails before it has its log completely initialized
(e.g. if the log directory doesn't exist, or if the argument parser fails),
it prints this to stdout. Let's log that.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-18 13:13:17 -06:00
Tycho Andersen
cf4b07a5af c/r: log the exact command we exec
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-18 10:19:36 -06:00
Tycho Andersen
f03280a760 build: fix build on android (and ppc)
The problem here is that dev_t on most platforms is `long unsigned`, but on
android (and ppc?) it's `long long unsigned`. Let's just upcast to `long
long unsigned` and use that format string to keep the compilers happy.

Safety first!

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-15 12:01:36 -06:00
Tycho Andersen
4b54788e85 c/r: drop lxc.console=none config requirement
There are a few things going on in this patch.

1. /dev/console is an external mount since it is bind mounted from the
   host. However, we don't want to use criu's --ext-mount-map auto handling
   here, because that will bind mount exactly the same path from the host
   on restore, but if the pts device is different on the target host, we'll
   bind mount the wrong one, which is obviously wrong.

2. We need to tell CRIU how to restore the TTY. Since we declare the tty as
   --external, we need to provide it via --inherit-fd (even though we've
   already fixed up the environment).

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-15 09:31:15 -06:00
Tycho Andersen
73d467522b criu: hide more stuff in criu.c
Various other functions/structures are now only used in criu.c, so let's
hide stuff there so as not to pollute headers.

This commit also bumps the required CRIU versions to 2.0. While we don't
*require* any features that aren't in 1.8 patchlevel 21 or above, 2.0 is a
vast improvement, and so we should use that instead.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-11 19:15:32 -07:00
Tycho Andersen
7103fe6f08 cgroup: cgroup_escape takes no arguments
cgroup_escape() is a slight abuse of the cgroup code: what we really want
here is to escape the *current* process, whether it happens to be the LXC
monitor or not, into the / cgroups.

In the case of dump, we can't do an lxc_init(), because:

lxc 20160310103501.547 ERROR    lxc_commands - commands.c:lxc_cmd_init:993 - ##
lxc 20160310103501.547 ERROR    lxc_commands - commands.c:lxc_cmd_init:994 - # The container appears to be already running!
lxc 20160310103501.547 ERROR    lxc_commands - commands.c:lxc_cmd_init:995 - ##

We don't want to make this a command to send to the handler, because again,
cgroup_escape() is intended to escape the *current* task to the root
cgroups.

So, let's just have cgroup_escape() build its own handler when required.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-10 12:01:34 -07:00
Tycho Andersen
9451eeffb0 criu: make exec_criu static
This is no longer needed outside of criu.c with the ->migrate API call, so
let's mark it that way.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
2016-03-10 12:01:34 -07:00
Serge Hallyn
ccb4cabe02 cgfsng: next generation filesystem-backed cgroup implementation
This makes simplifying assumptions:  all usable cgroups must be
mounted under /sys/fs/cgroup/controller or /sys/fs/cgroup/contr1,contr2.

Currently this will only work with cgroup namespaces, because
lxc.mount.auto = cgroup is not implemented.  So cgfsng_ops_init()
returns NULL if cgroup namespaces are not enabled.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2016-03-04 18:19:30 -08:00
Wim Coekaerts
a90277dfb5 criu.c: protect from buffer overrun of version in fscanf()
while highly unlikely to happen...
char version[1024];

fscanf(.. %[1024] .., version  );

should leave room for null termination

Signed-off-by: Wim Coekaerts <wim.coekaerts@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2016-01-04 12:52:26 -05:00
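
Editorial sketch for a90277dfb5: a scanf field width counts the characters stored and does not include the terminating NUL, so the width must be one less than the buffer size. The example uses a 16-byte buffer and an assumed "Version:" prefix purely for illustration; `read_version` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

#define VERSION_BUF_LEN 16

/*
 * Read a version token into a fixed-size buffer. The width 15 leaves one
 * byte of the 16-byte buffer for the NUL terminator.
 * Returns 0 on success, -1 if the input does not match.
 */
static int read_version(const char *line, char version[VERSION_BUF_LEN])
{
    assert(line != NULL && version != NULL);

    if (sscanf(line, "Version: %15[^\n]", version) != 1)
        return -1;
    return 0;
}
```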
Tycho Andersen
13389b2963 c/r: use --lsm-profile if provided
Since we can rename a container on a migrate, let's tell CRIU to use the
LSM profile name the user has specified. This change is motivated by LXD,
which sets an LSM profile name based on the container name, so if a user
changes the name of a container during migration, the old profile name
(that criu has saved) won't exist on the new host.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2015-12-20 22:42:28 -05:00
Christian Brauner
4ec31c5224 Adapt #includes for bdev.h to bdev/bdev.h
Signed-off-by: Christian Brauner <christian.brauner@mailbox.org>
2015-12-15 17:03:58 +01:00
Tycho Andersen
f8a41688ec c/r: add more logging when restore fails
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-12-09 23:00:26 -05:00
Tycho Andersen
e9195050b4 c/r: escape cgroups before exec()ing criu
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-12-09 23:00:23 -05:00
Tycho Andersen
fa07124900 c/r: remove random line continuations
No idea how these got there, but let's get rid of them since they're weird.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-12-09 22:56:52 -05:00
Tycho Andersen
aef3d51e61 c/r: add a new ->migrate API call
This patch adds a new ->migrate API call with three commands:

MIGRATE_DUMP: this is basically just ->checkpoint()
MIGRATE_RESTORE: this is just ->restore()
MIGRATE_PRE_DUMP: this can be used to invoke criu's pre-dump command on the
    container.

A small addition to the (pre-)dump commands is the ability to specify a
previous partial dump directory, so that one can use a pre-dump of a
container.

Finally, this new API call uses a structure to pass options so that it can
be easily extended in the future (e.g. to support CRIU's --leave-frozen
option, for potentially smarter failure handling on restore).

v2: remember to flip the return code for legacy ->checkpoint and ->restore
    calls

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-12-09 22:53:59 -05:00
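
Editorial sketch for aef3d51e61: one common way to keep an options struct extensible is to have the caller pass the struct's size, so fields appended later default to zero for callers built against the older layout. Everything below (`demo_cmd`, `demo_opts`, `demo_migrate`, and the size argument itself) is a hypothetical illustration under that assumption, not the definitions from liblxc's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command set, modelled on the three commands listed above. */
enum demo_cmd { DEMO_PRE_DUMP, DEMO_DUMP, DEMO_RESTORE };

/* Hypothetical options struct; new fields would be appended at the end. */
struct demo_opts {
    const char *directory;   /* where the image files live */
    bool verbose;
    const char *predump_dir; /* optional earlier partial dump */
};

/*
 * Copy only `size` bytes of the caller's struct into a zeroed local copy.
 * Returns 0 on success, -1 on invalid input.
 */
static int demo_migrate(enum demo_cmd cmd, const struct demo_opts *opts,
                        size_t size)
{
    struct demo_opts local;

    assert(opts != NULL);

    if (size > sizeof(local))
        return -1; /* caller is newer than this implementation */

    memset(&local, 0, sizeof(local));
    memcpy(&local, opts, size);

    if (!local.directory)
        return -1;

    switch (cmd) {
    case DEMO_PRE_DUMP:
    case DEMO_DUMP:
    case DEMO_RESTORE:
        return 0; /* a real implementation would invoke criu here */
    }
    return -1;
}
```

With this shape, adding an option means appending a field rather than changing a function signature.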
Tycho Andersen
dc259399a4 c/r: use freezer to seize tasks
Instead of relying on the old ptrace loop, we should instead put all the
tasks in the container into the freezer. This will stop them all at the
same time, preventing fork bombs from sending criu into an infinite loop
(and is also simply a lot faster).

Note that this uses --freeze-cgroup which isn't in criu 1.7, so it should
only go into master.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2015-11-06 23:24:31 -05:00
Tycho Andersen
c1fd648dd8 c/r: don't require a veth link to c/r
veths can be unconnected in the container's config, and we should handle
this case.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2015-11-06 15:02:36 -05:00
Tycho Andersen
5b4543292d c/r: enable tracefs
tracefs is a new filesystem that can be mounted by users. Only the options
and fs name need to be passed to restore the state, so we can use criu's
auto fs feature.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2015-08-14 12:29:24 -04:00
Tycho Andersen
ec8449f8dc c/r: get rid of dump_net_info()
This was originally used to propagate the bridge and veth names across
hosts, but now we extract both from the container's config file, and
nothing reads the files that dump_net_info() writes, so let's just get rid
of them.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-08-13 16:26:05 -04:00
Tycho Andersen
65b2022137 c/r: allow empty networks to be checkpointed/restored
Empty networks don't have anything (besides lo) for us to dump and restore,
so we should allow these as well.

Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2015-08-13 16:26:01 -04:00
Tycho Andersen
bd9e78f570 c/r: remove unused variable mnts
Reported-by: Coverity
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2015-06-10 23:04:45 -05:00
Tycho Andersen
3158ab5b9e c/r: use fclose instead of close
We're leaking the FILE* here while closing the underlying fd; let's just
close the file and thus close both.

Reported-by: Coverity
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2015-06-10 23:04:43 -05:00
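
Editorial sketch for 3158ab5b9e: when a FILE* wraps a file descriptor, fclose() releases the stream and closes the descriptor, while close() on the fd alone leaks the stream. `read_line_from_fd` is a hypothetical helper that shows the ownership rule; it is not the code that was fixed.

```c
/* Ask for POSIX declarations (fdopen) even under strict -std modes. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read the first line from fd through a stdio stream. The fd is consumed
 * in all cases. Returns 0 on success, -1 on failure.
 */
static int read_line_from_fd(int fd, char *buf, int len)
{
    FILE *f;
    int ret = 0;

    assert(buf != NULL && len > 0);

    f = fdopen(fd, "r");
    if (!f) {
        close(fd); /* no stream was created, so close the fd directly */
        return -1;
    }

    if (!fgets(buf, len, f))
        ret = -1;

    fclose(f); /* frees the stream and closes fd as well */
    return ret;
}
```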