This function safely parses an unsigned integer. On success it returns 0 and
stores the unsigned integer in @converted. On error it returns a negative
errno.
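The contract described above could be met roughly as follows. This is only a sketch: the name parse_uint_sketch(), the choice of base 10, and the exact set of checks are assumptions made for illustration, not the function this commit adds.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (name and details are assumptions).
 * Returns 0 and stores the value in *converted, or a negative errno.
 */
static int parse_uint_sketch(const char *numstr, unsigned int *converted)
{
	char *end = NULL;
	unsigned long val;

	if (!numstr || !converted)
		return -EINVAL;

	/*
	 * strtoul() skips leading whitespace and then accepts a '-' sign,
	 * silently negating the result, so reject negative input up front.
	 */
	while (isspace((unsigned char)*numstr))
		numstr++;
	if (*numstr == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(numstr, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;

	/* No digits at all, or trailing garbage after the number. */
	if (end == numstr || *end != '\0')
		return -EINVAL;

	/* unsigned long may be wider than unsigned int. */
	if (val > UINT_MAX)
		return -ERANGE;

	*converted = (unsigned int)val;
	return 0;
}
```

On failure the sketch leaves *converted untouched, so a caller can keep a default value in place.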
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
If the file "/sys/devices/system/cpu/isolated" doesn't exist, we can't simply
bail. We still need to check whether we need to copy the parent's cpu
settings.
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
- add more logging
- only write to cpuset.cpus if we really have to
- simplify cleanup on error and success
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
Move the user namespace to the first position in the array so that we always
attach to it first when iterating over the struct and using setns() to switch
namespaces. This especially affects lxc_attach(): Suppose you cloned a new user
namespace and mount namespace as an unprivileged user on the host and want to
setns() to the mount namespace. This requires you to attach to the user
namespace first; otherwise the kernel will fail this check:
    if (!ns_capable(mnt_ns->user_ns, CAP_SYS_ADMIN) ||
        !ns_capable(current_user_ns(), CAP_SYS_CHROOT) ||
        !ns_capable(current_user_ns(), CAP_SYS_ADMIN))
            return -EPERM;
in linux/fs/namespace.c:mntns_install().
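The effect of the ordering can be sketched as below. Everything here is hypothetical: the shortened table, the sketch_* names, attach_in_order(), and the record_enter() stand-in are assumptions for illustration. A real caller would pass a wrapper around setns() instead of the recording stub.

```c
#include <stddef.h>

/* Hypothetical, shortened table; the real array has more entries. */
enum sketch_ns_idx {
	SKETCH_NS_USER = 0, /* must stay first so it is attached first */
	SKETCH_NS_MNT,
	SKETCH_NS_PID,
	SKETCH_NS_NET,
	SKETCH_NS_MAX
};

static const char *const sketch_ns_names[SKETCH_NS_MAX] = {
	[SKETCH_NS_USER] = "user",
	[SKETCH_NS_MNT]  = "mnt",
	[SKETCH_NS_PID]  = "pid",
	[SKETCH_NS_NET]  = "net",
};

/*
 * Walk the table front to back and call enter() for every namespace that
 * has an open fd (fds[i] >= 0). Returns the number of namespaces entered,
 * or -1 as soon as enter() fails.
 */
static int attach_in_order(const int fds[SKETCH_NS_MAX],
			   int (*enter)(int fd, const char *name))
{
	int entered = 0;

	for (int i = 0; i < SKETCH_NS_MAX; i++) {
		if (fds[i] < 0)
			continue;
		if (enter(fds[i], sketch_ns_names[i]) < 0)
			return -1;
		entered++;
	}
	return entered;
}

/*
 * Stand-in for a setns() wrapper: it only records the order of the calls,
 * so the sketch can run without privileges.
 */
static const char *sketch_seen[SKETCH_NS_MAX];
static int sketch_seen_len;

static int record_enter(int fd, const char *name)
{
	(void)fd;
	sketch_seen[sketch_seen_len++] = name;
	return 0;
}
```

Because the user namespace sits at index 0, any loop over the table reaches it before the mount namespace without special-casing.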
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
Using custom structs in attach.c risks getting out of sync with the commonly
used ns_info[LXC_NS_MAX] struct and thus attaching to wrong namespaces. Switch
to using ns_info[LXC_NS_MAX].
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
- simply check /proc/self/ns
- improve SYSERROR() report
- use #define to prevent gcc & clang from using a VLA
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
Improve logging and comments in a number of places to make bug reports easier
for us to debug.
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
- Allocating an error message that the caller must free seems pointless. We can
just print the error message in preserve_ns() itself. This also allows us to
avoid using the GNU extension asprintf().
- Improve lxc_preserve_ns(): By passing in NULL or "" as the second argument
the function can now also be used to check whether namespaces are supported
by the kernel.
- Use lxc_preserve_ns() in preserve_ns().
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
- So far we blindly called lxc_delete_network() to make sure that we deleted
all network interfaces. This resulted in pointless netlink calls, especially
when a container had multiple networks defined. Let's be smarter and have
lxc_delete_network() return a boolean that indicates whether *all* configured
networks have been deleted. If so, don't needlessly try to delete them again
in start.c. This also decreases confusing error messages a user might see.
- When we receive -ENODEV from one of our lxc_netdev_delete_*() functions,
let's assume that either the network device already got deleted or that it
got moved to a different network namespace. Inform the user about this but do
not report an error in this case.
- When we have explicitly deleted the host side of a veth pair let's
immediately free(priv.veth_attr.pair) and NULL it, or
memset(priv.veth_attr.pair, ...) the corresponding member, so we don't
needlessly try to destroy it again when we have to call
lxc_delete_network() again in start.c.
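The three points above can be sketched together as follows. The struct, the callback type, and the fake_delete() backend are all hypothetical stand-ins, not the real netlink code or the real lxc_delete_network() signature.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-network record; name == NULL means "already cleaned up". */
struct netdev_sketch {
	const char *name;
};

typedef int (*netdev_delete_fn)(const char *name);

/* Returns true only if every configured device is gone afterwards. */
static bool delete_networks_sketch(struct netdev_sketch *devs, size_t n,
				   netdev_delete_fn del)
{
	bool all_deleted = true;

	for (size_t i = 0; i < n; i++) {
		int ret;

		/* Cleared by an earlier call: skip the pointless delete. */
		if (!devs[i].name)
			continue;

		ret = del(devs[i].name);
		if (ret == -ENODEV) {
			/* Not an error: inform the user and treat it as done. */
			fprintf(stderr, "%s: already deleted or moved to another network namespace\n",
				devs[i].name);
		} else if (ret < 0) {
			fprintf(stderr, "%s: failed to delete: %s\n",
				devs[i].name, strerror(-ret));
			all_deleted = false;
			continue;
		}

		/* Forget the device so a later call does not retry it. */
		devs[i].name = NULL;
	}

	return all_deleted;
}

/* Fake backend for demonstration only; counts how often it is called. */
static int fake_delete_calls;

static int fake_delete(const char *name)
{
	fake_delete_calls++;
	if (strcmp(name, "gone0") == 0)
		return -ENODEV;
	if (strcmp(name, "busy0") == 0)
		return -EBUSY;
	return 0;
}
```

A caller that gets true back can skip the second cleanup pass entirely. If it calls again anyway, the cleared entries ensure no further delete requests are issued.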
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
When we set LXC_DEBUG_CGFSNG=1 we print out info about detected cgroup
hierarchies. When there's no named cgroup mounted we need to make sure that we
don't try to index an unallocated pointer.
Signed-off-by: Christian Brauner <christian.brauner@canonical.com>
With release 2.8, criu deprecated the --veth-pair command-line
option in favor of --external:
f2037e6 veth: Make --external support --veth-pair
git tag --contains f2037e6d3445fc400
v2.8
With this commit lxc-checkpoint will automatically switch between
the new and old command-line option depending on the detected
criu version.
For criu versions older than 2.8 something like this will be used:
--veth-pair eth0=vethYOK6RW@lxcbr0
and starting with criu version 2.8 it will look like this:
--external veth[eth0]:vethCRPEYL@lxcbr0
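The version switch can be sketched like this. The function name, the major/minor parameters, and the single-string output are assumptions for illustration. Real code would more likely pass the option and its value as separate argv entries.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the veth option for the given criu version.
 * Returns 0, or -1 if buf is too small.
 */
static int criu_veth_arg_sketch(char *buf, size_t len, int major, int minor,
				const char *ifname, const char *veth,
				const char *bridge)
{
	/* --external replaces --veth-pair starting with criu 2.8. */
	bool has_external = major > 2 || (major == 2 && minor >= 8);
	int ret;

	if (has_external)
		ret = snprintf(buf, len, "--external veth[%s]:%s@%s",
			       ifname, veth, bridge);
	else
		ret = snprintf(buf, len, "--veth-pair %s=%s@%s",
			       ifname, veth, bridge);

	if (ret < 0 || (size_t)ret >= len)
		return -1;

	return 0;
}
```

Called with version 2.7 and the names from the first example above, the sketch yields the --veth-pair form. With 2.8 or later it yields the --external form.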
Signed-off-by: Adrian Reber <areber@redhat.com>
CRIU supports dirty memory tracking to take incremental checkpoints.
Incremental checkpoints are one way of reducing downtime during
migration. The first checkpoint dumps all the memory pages and the
second (and third, and fourth, ...) only dumps pages which have changed.
Most of the necessary code has already been implemented. This just adds
the existing functionality to lxc-checkpoint:
  -p, --pre-dump            Only pre-dump the memory of the container.
                            Container keeps on running and following
                            checkpoints will only dump the changes.
  --predump-dir=DIR         path to images from previous dump (relative to -D)
The following is an example from a container running CentOS 7 with psql
and tomcat:
# lxc-checkpoint -n c7 -D /tmp/cp -p
Container keeps on running
# du -h /tmp/cp
229M /tmp/cp
Sync initial checkpoint to destination
# rsync -a /tmp/cp host2:/tmp/
Sync file-system
# rsync -a /var/lib/lxc/c7 host2:/var/lib/lxc/
Final dump; container is stopped
# lxc-checkpoint -n c7 -D /tmp/cp2 --predump-dir=../cp -s
# du -h /tmp/cp2
90M /tmp/cp2
After transferring the second (incremental) checkpoint and the changes
to the container's file system the container can be restored on the
second host by pointing lxc-checkpoint to the second checkpoint
directory:
# lxc-checkpoint -n c7 -D /tmp/cp2 -r
Signed-off-by: Adrian Reber <areber@redhat.com>