Otherwise in the error case, we end up subtracting two from the
static_args, which would lead to a segfault :)
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CRIU has added support for passing --cgroup-root on dump, which we should
use (see the criu commit 07d259f365f224b32914de26ea0fd59fc6db0001 for
details). Note that we don't have to do any version checking or anything,
because CRIU just ignored --cgroup-root on checkpoint before, so passing it
is safe, and will result in correct behavior when a sufficient version of
CRIU is present.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Previously, we wrote a "success" status and only then tried to parse the pid. This
meant that we wouldn't notice a successful restore but failure to parse the
pid, which was a little strange.
We still don't know the child pid, so we will end up with a restored
process tree and a running container, but at least in this case the API
will return false indicating that something failed.
We could kill(-1, 9) in this case, but since liblxc runs as root sometimes
(e.g. LXD), that would be a Very Bad Thing.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
All we really needed a unique temp file for was passing the pid. Since CRIU
opened this with O_EXCL | O_CREAT, this was "safe" (users could still
overwrite it afterwards, but the monitor would immediately die since the
only valid number in there was the init process).
In any case, we can just read /proc/self/task/<tid>/children, which lists the
child process.
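
A minimal sketch of the parsing side, assuming the children file holds a
whitespace-separated list of pids and that we expect exactly one entry. The
helper name parse_single_child() is hypothetical and not part of liblxc:

```c
#include <stdlib.h>
#include <sys/types.h>

/*
 * Parse the contents of a "children" file: a whitespace-separated list of
 * pids. Returns 0 and stores the pid in *child only if the buffer holds
 * exactly one pid; returns -1 otherwise.
 */
int parse_single_child(const char *buf, pid_t *child)
{
	char *end;
	long pid;

	pid = strtol(buf, &end, 10);
	if (end == buf || pid <= 0)
		return -1; /* no number at all, or a nonsensical one */

	/* Anything other than trailing whitespace means a second entry. */
	while (*end == ' ' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1;

	*child = (pid_t)pid;
	return 0;
}
```

Rejecting a list with more than one entry keeps the caller from guessing
which pid is the restored init.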
Closes #1150
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
This allows us to avoid using relative includes, which is cleaner in the long
run when we create subdirectories for other components of liblxc.
Signed-off-by: Christian Brauner <cbrauner@suse.de>
Fixes build failures on arm:
criu.c: In function ‘exec_criu’:
criu.c:310:4: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Werror=format=]
ret = sprintf(ghost_limit, "%lu", opts->user->ghost_limit);
^
In file included from criu.c:42:0:
log.h:285:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Werror=format=]
struct lxc_log_locinfo locinfo = LXC_LOG_LOCINFO_INIT; \
^
criu.c:312:5: note: in expansion of macro ‘ERROR’
ERROR("failed to print ghost limit %lu", opts->user->ghost_limit);
^
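
One common way to make such a format string portable is the PRIu64 macro
from <inttypes.h>. The sketch below is illustrative only; format_u64() is a
hypothetical helper, not the code in criu.c:

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a uint64_t portably. PRIu64 expands to the conversion specifier
 * that matches uint64_t on the target platform, so the same source builds
 * cleanly with -Werror=format on both 32-bit and 64-bit targets.
 * Returns 0 on success, -1 on error or truncation.
 */
int format_u64(char *buf, size_t len, uint64_t value)
{
	int ret = snprintf(buf, len, "%" PRIu64, value);

	if (ret < 0 || (size_t)ret >= len)
		return -1;
	return 0;
}
```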
Signed-off-by: Christian Brauner <cbrauner@suse.de>
This is a minimal commit which makes the function 'do_restore()' static
as it is not used anywhere else in the code. This also removes a
trailing space my editor complained about.
Signed-off-by: Adrian Reber <areber@redhat.com>
Shortly after CRIU 2.3 was released, a patch was added to skip
in-flight TCP connections. In-flight connections are connections that are
not yet completely established (SYN, SYN-ACK). Skipping in-flight TCP
connections means that the client has to re-initiate the connection
establishment.
This patch stores the CRIU version detected during version check, so
that during dump/checkpoint options can be dynamically enabled depending
on the available CRIU version.
v2:
* use the newly introduced criu version interface
* add an option to disable skipping in-flight connections
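
A rough sketch of version-dependent option selection. The 2.4 threshold is
an assumption for illustration (the message only says the patch landed
shortly after 2.3), and both function names are hypothetical:

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumption for this sketch: first release with in-flight skipping. */
#define SKETCH_IN_FLIGHT_MAJOR 2
#define SKETCH_IN_FLIGHT_MINOR 4

/*
 * Return true if a "major.minor[.patch]" version string is at least
 * major.minor. Unparsable strings are treated as too old.
 */
bool version_at_least(const char *version, int major, int minor)
{
	int got_major, got_minor;

	if (!version || sscanf(version, "%d.%d", &got_major, &got_minor) != 2)
		return false;
	if (got_major != major)
		return got_major > major;
	return got_minor >= minor;
}

/* Decide whether to ask criu to skip in-flight connections. */
bool want_skip_in_flight(const char *criu_version, bool user_disabled)
{
	if (user_disabled)
		return false;
	return version_at_least(criu_version, SKETCH_IN_FLIGHT_MAJOR,
				SKETCH_IN_FLIGHT_MINOR);
}
```

Comparing the parsed numbers rather than the raw strings keeps "2.10"
ordered after "2.4".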
Signed-off-by: Adrian Reber <areber@redhat.com>
- If version != NULL, criu_version_ok() stores the detected criu version in
version. It allocates memory for version, which must be freed by the caller.
- If version == NULL, criu_version_ok() will return true when the version
matches, false in all other cases.
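
The optional out-parameter contract can be sketched as follows. This is a
stand-in, not the real criu_version_ok(): sketch_version_ok() is a
hypothetical name, the detected version is passed in as a string instead of
being read from criu, and only the major number is compared to keep the
example short:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return true if the detected version is new enough. If version != NULL,
 * also store a heap copy of the detected version there; the caller owns
 * that copy and must free() it. On failure *version is left untouched.
 */
bool sketch_version_ok(const char *detected, int min_major, char **version)
{
	size_t len;
	char *copy;

	if (!detected || atoi(detected) < min_major)
		return false;

	if (!version)
		return true; /* caller only wants the yes/no answer */

	len = strlen(detected) + 1;
	copy = malloc(len);
	if (!copy)
		return false;
	memcpy(copy, detected, len);

	*version = copy;
	return true;
}
```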
Signed-off-by: Christian Brauner <cbrauner@suse.de>
A while ago cgroup modes were introduced to CRIU, which slightly changed
the behavior w.r.t. cgroups under the hood. What we're really after is
criu's --full mode, i.e. even if a particular cgroup directory exists
(in particular /lxc/$container[-$number] will, since we create it), we
should restore perms on that cgroup.
Things worked just fine for actual properties (except "special" properties
as criu refers to them, which I've just sent a patch for) because liblxc
creates no subdirectories, just the TLD.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
The idea here is that criu can use open_by_handle on a configuration which
will preserve inodes on moves across hosts, but shouldn't do that on
configurations which won't preserve inodes. Before, we forced it to always
be slow, but we don't have to do this.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
If we don't do this, we'll end up changing the function signatures for the
internal __criu_* functions each time we add a new parameter, which will
get very annoying very quickly. Since we already have the user's arguments
struct, let's just pass that all the way down.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
This enables lxc to perform "disk-less migrations", where memory pages are sent directly to the destination machine instead of being written to the source's filesystem first.
For this, the strings "pageserver_address" and "pageserver_port" have been added to the migrate_opts struct so that criu can be told where to look for a pageserver.
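
A minimal sketch of how the two strings could be turned into criu
arguments, assuming criu's --page-server, --address and --port options. The
struct and helper below are hypothetical stand-ins, not the liblxc
definitions:

```c
#include <stddef.h>

/* Hypothetical subset of the options struct, for illustration only. */
struct sketch_migrate_opts {
	const char *pageserver_address;
	const char *pageserver_port;
};

/*
 * Append page-server arguments to argv starting at index pos.
 * Returns the new number of used slots, or -1 if argv is too small.
 * Nothing is appended unless both address and port are set.
 */
int add_pageserver_args(const struct sketch_migrate_opts *opts,
			const char **argv, int pos, int max)
{
	if (!opts->pageserver_address || !opts->pageserver_port)
		return pos;
	if (pos + 5 > max)
		return -1;

	argv[pos++] = "--page-server";
	argv[pos++] = "--address";
	argv[pos++] = opts->pageserver_address;
	argv[pos++] = "--port";
	argv[pos++] = opts->pageserver_port;
	return pos;
}
```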
Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
strncat only returns its first argument and not the end of the written string.
Thus "buf-pos" is always 0 and consequently no range check is performed.
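
One way to avoid this class of bug is to track the write position
explicitly instead of deriving it from a return value. A sketch, with
append_checked() as a hypothetical helper:

```c
#include <string.h>

/*
 * Append src to the string in buf (capacity size, current length *pos).
 * Returns 0 on success, -1 if src does not fit; buf and *pos are left
 * unchanged on failure.
 */
int append_checked(char *buf, size_t size, size_t *pos, const char *src)
{
	size_t len = strlen(src);

	/* Need room for src plus the terminating NUL. */
	if (*pos >= size || len >= size - *pos)
		return -1;

	memcpy(buf + *pos, src, len + 1);
	*pos += len;
	return 0;
}
```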
Signed-off-by: Niklas Eiling <niklas.eiling@rwth-aachen.de>
Hopefully this will avoid name collisions with any user binaries, since
criu is just an implementation detail.
Closes #907
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
If we set lxc.console=none, this fd won't exist, so let's not fail if it
doesn't. We already partially handled this case correctly, so let's
actually handle it correctly :)
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
We don't pass anything on the restore side since we didn't save anything,
but the restore side will expect something if we pass this. Instead, let's
not pass anything.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
In particular, when CRIU fails before it has its log completely initialized
(e.g. if the log directory doesn't exist, or if the argument parser fails),
it prints this to stdout. Let's log that.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
The problem here is that dev_t on most platforms is `long unsigned`, but on
android (and ppc?) it's `long long unsigned`. Let's just upcast to `long
long unsigned` and use that format string to keep the compilers happy.
Safety first!
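
The upcast amounts to something like the following sketch (format_dev() is
a hypothetical helper name):

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Print a dev_t regardless of its underlying width by casting to the
 * widest standard unsigned type and using the matching "%llu" specifier.
 * Returns 0 on success, -1 on error or truncation.
 */
int format_dev(char *buf, size_t len, dev_t dev)
{
	int ret = snprintf(buf, len, "%llu", (unsigned long long)dev);

	if (ret < 0 || (size_t)ret >= len)
		return -1;
	return 0;
}
```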
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
There are a few things going on in this patch.
1. /dev/console is an external mount since it is bind mounted from the
host. However, we don't want to use criu's --ext-mount-map auto handling
here, because that will bind mount exactly the same path from the host
on restore, but if the pts device is different on the target host, we'll
bind mount the wrong one, which is obviously wrong.
2. We need to tell CRIU how to restore the TTY. Since we declare the tty as
--external, we need to provide it via --inherit-fd (even though we've
already fixed up the environment).
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Various other functions/structures are now only used in criu.c, so let's
hide stuff there so as not to pollute headers.
This commit also bumps the required CRIU versions to 2.0. While we don't
*require* any features that aren't in 1.8 patchlevel 21 or above, 2.0 is a
vast improvement, and so we should use that instead.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
cgroup_escape() is a slight abuse of the cgroup code: what we really want
here is to escape the *current* process, whether it happens to be the LXC
monitor or not, into the / cgroups.
In the case of dump, we can't do an lxc_init(), because:
lxc 20160310103501.547 ERROR lxc_commands - commands.c:lxc_cmd_init:993 - ##
lxc 20160310103501.547 ERROR lxc_commands - commands.c:lxc_cmd_init:994 - # The container appears to be already running!
lxc 20160310103501.547 ERROR lxc_commands - commands.c:lxc_cmd_init:995 - ##
We don't want to make this a command to send to the handler, because again,
cgroup_escape() is intended to escape the *current* task to the root
cgroups.
So, let's just have cgroup_escape() build its own handler when required.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
This is no longer needed outside of criu.c with the ->migrate API call, so
let's mark it that way.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
This makes simplifying assumptions: all usable cgroups must be
mounted under /sys/fs/cgroup/controller or /sys/fs/cgroup/contr1,contr2.
Currently this will only work with cgroup namespaces, because
lxc.mount.auto = cgroup is not implemented. So cgfsng_ops_init()
returns NULL if cgroup namespaces are not enabled.
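
With comounted controllers, the directory name is a comma-separated list,
so looking up a controller needs an exact match on a list element. A sketch
under that assumption; controller_in_list() is a hypothetical helper and
not the cgfsng code:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if controller is one element of the comma-separated list,
 * e.g. "cpu" in "cpu,cpuacct". A plain strstr() is not enough: it would
 * wrongly find "cpu" inside "cpuacct".
 */
bool controller_in_list(const char *list, const char *controller)
{
	size_t clen = strlen(controller);
	const char *p = list;

	if (clen == 0)
		return false;

	while (*p) {
		size_t n = strcspn(p, ","); /* length of current element */

		if (n == clen && strncmp(p, controller, n) == 0)
			return true;
		p += n;
		if (*p == ',')
			p++;
	}
	return false;
}
```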
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Since we can rename a container on a migrate, let's tell CRIU to use the
LSM profile name the user has specified. This change is motivated by LXD,
which sets an LSM profile name based on the container name, so if a user
changes the name of a container during migration, the old profile name
(that criu has saved) won't exist on the new host.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
No idea how these got there, but let's get rid of them since they're weird.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This patch adds a new ->migrate API call with three commands:
MIGRATE_DUMP: this is basically just ->checkpoint()
MIGRATE_RESTORE: this is just ->restore()
MIGRATE_PRE_DUMP: this can be used to invoke criu's pre-dump command on the
container.
A small addition to the (pre-)dump commands is the ability to specify a
previous partial dump directory, so that one can use a pre-dump of a
container.
Finally, this new API call uses a structure to pass options so that it can
be easily extended in the future (e.g. to support CRIU's --leave-frozen
option, for potentially smarter failure handling on restore).
v2: remember to flip the return code for legacy ->checkpoint and ->restore
calls
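
A rough sketch of the command-plus-options-struct shape. All names here are
hypothetical stand-ins rather than the liblxc definitions, and passing the
struct size alongside the pointer is an assumption of this sketch: it is
one common way to let a struct grow without breaking older callers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_cmd { SKETCH_PRE_DUMP, SKETCH_DUMP, SKETCH_RESTORE };

struct sketch_opts {
	const char *directory;   /* where the images live */
	bool verbose;
	const char *predump_dir; /* previous partial dump, may be NULL */
};

/*
 * Map a command to the criu action it would run. Returns NULL for an
 * unknown command, a missing directory, or a caller whose struct is
 * larger (newer) than the one this code was built with.
 */
const char *sketch_migrate(unsigned int cmd, const struct sketch_opts *opts,
			   size_t size)
{
	struct sketch_opts full;

	if (!opts || size > sizeof(full))
		return NULL;

	/* Fields the caller did not know about default to zero. */
	memset(&full, 0, sizeof(full));
	memcpy(&full, opts, size);

	if (!full.directory)
		return NULL;

	switch (cmd) {
	case SKETCH_PRE_DUMP:
		return "pre-dump";
	case SKETCH_DUMP:
		return "dump";
	case SKETCH_RESTORE:
		return "restore";
	default:
		return NULL;
	}
}
```

Adding a field to the end of the struct then requires no change to the
function signature.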
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Instead of relying on the old ptrace loop, we should put all the
tasks in the container into the freezer. This will stop them all at the
same time, preventing fork bombs from sending criu into an infinite loop
(and is also simply a lot faster).
Note that this uses --freeze-cgroup which isn't in criu 1.7, so it should
only go into master.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
veths can be unconnected in the container's config, and we should handle
this case.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
tracefs is a new filesystem that can be mounted by users. Only the options
and fs name need to be passed to restore the state, so we can use criu's
auto fs feature.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This was originally used to propagate the bridge and veth names across
hosts, but now we extract both from the container's config file, and
nothing reads the files that dump_net_info() writes, so let's just get rid
of them.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Empty networks don't have anything (besides lo) for us to dump and restore,
so we should allow these as well.
Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
We're leaking the FILE* here while closing the underlying fd; let's just
close the file and thus close both.
Reported-by: Coverity
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>