If a container has created its own cgroups, i.e. by running libvirtd,
then if we don't delete all child cgroups, then the rmdir will fail.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Use the correct path for the container's cgroup task file.
Also exit out early and cleanly if the container is not running,
and bind-mount /proc/$pid/net with '-n' to keep the entry out
of mtab, else the mtab entry will never go away.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
This patch looks for Daniel's kernel patch allowing the lxc monitor
to tell container reboot from shutdown based on the exit signal. If
that patch is not there, utmp monitoring is used. Otherwise, it only
looks for the signal. Note that the 'conf->need_utmp_watch' is
technically not necessary, as there is no harm in watching the utmp
file.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
when --lvname is given, use that for lvcreate instead of using
lxc_name, which is wrong.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
1. Some templates copy the cached pristine rootfs using 'cp a b' where b is
$lxc_path/$name/rootfs. That doesn't do the right thing if rootfs already
exists, as it will when it is an lvm or other mount. So switch to
'rsync a/ b/'. (cp can be made to work too of course).
2. Update lxc-create to support backing stores. For now only lvm is
implemented.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Don't delete a running container. If it's running, abort the delete
unless a new '-f' (force) flag is given, in which case, stop it first.
Handle the case where we can't find $rootfs in config
Fix broken detection of lvm backing store
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
With this patch, I can start a container 'o1' inside another container 'o1'.
(Of course, the containers must be on a different subnet)
Detail:
1. Create cgroups for containers under /lxc.
2. Support nested lxc: respect init's cgroup:
Create cgroups under init's cgroup. So if we start a container c2
inside a container 'c1', we'll use /sys/fs/cgroup/freezer/lxc/c1/lxc/c2
instead of /sys/fs/cgroup/freezer/c2. This allows a container c1
to be created inside container c1 It also allow a container's limits
to be enforced on all a container's children (which a MAC policy could
already enforce, in which case current lxc code would be unable to nest
altogether).
3. Finally, if a container's cgroup already exists, rename it rather than
failing to start the container. Try to WARN the user so they might go
clean the old cgroup up.
Whereas without this patch, container o1's cgroup would be
/sys/fs/cgroup/<subsys>/o1,
it now becomes
/sys/fs/cgroup/<subsys>/<initcgroup>/lxc/o1
so if init is in cgroup '/' then o1's freezer cgroup would be:
/sys/fs/cgroup/freezer/lxc/o1
Changelog:
. make lxc-ps work with separate mtab. If cgroups were mounted with -n,
and mtab is not linked to /proc/self/mounts, then 'mount -t cgroup' won't
show these mounts. So make lxc-ps not use it, but rather use
/proc/self/mounts directly.
. lxc-ls in the past assumed that a container's cgroup was just '/<name>'.
Now it is '/<host-init-cgroup>/lxc/<name>'. Handle that.
. first version of this patch was setting clone_children on
<path-to-cpusets-cgroup>/<init-cgroup>/lxc, not the parent of that dir.
That failed to initialize that cgroup, so tasks could not enter it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Particularly for LTS releases, which many people will want to use in
their containers, it is not wise to not use -security and -updates.
Furthermore the fix allowing ssh to allow the container to shut down
is in lucid-updates only.
With this patch, after debootstrapping a container, we add -updates
and -security to sources.list and do an apt-get upgrade under chroot.
Unfortunately we need to do this because debootstrap doesn't know how
to.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Thanks for Scott Moser for these, which allows qemu to run inside a container.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
mac_admin stops the container from loading LSM policy. Neither
selinux nor apparmor currently will do well with automatic namespacing
of policy (though it's coming in apparmor, after which we can re-enable
this).
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The issue is similar to what was fixed in commit e7eb632c for ARM:
the "configure" script errors out because it is unable to set
LINUX_SRCARCH. Fix is to add MIPS to the list.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
## 0001-Replace-pkglib_PROGRAMS-with-pkglibexec_PROGRAMS.patch [diff]
From 95c566740bba899acc7792c11fcdb3f4d32dcfc9 Mon Sep 17 00:00:00 2001
From: Jon Nordby <jononor@gmail.com>
Date: Fri, 10 Feb 2012 11:38:35 +0100
Subject: [PATCH] Replace pkglib_PROGRAMS with pkglibexec_PROGRAMS
Without this change, autogen.sh fails with automake 1.11.3
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
lxc-attach will now put the process that is attached to the container into
the correct cgroups corresponding to the container, set the correct
personality and drop the privileges.
The information is extracted from entries in /proc of the init process of
the container. Note that this relies on the (reasonable) assumption that the
init process does not in fact drop additional capabilities from its bounding
set.
Additionally, 2 command line options are added to lxc-attach: One to prevent
the capabilities from being dropped and the process from being put into the
cgroup (-e, --elevated-privileges) and a second one to explicitly state the
architecture which the process will see, (-a, --arch) which defaults to the
container's current architecture.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Since lxc-attach helper functions now have an own source file, lxc_attach is
moved from namespace.c to attach.c and is renamed to lxc_attach_to_ns,
because that better reflects what the function does (attaching to a
container can also contain the setting of the process's personality, adding
it to the corresponding cgroups and dropping specific capabilities).
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The following helper functions for lxc-attach are added to a new file
attach.c:
- lxc_proc_get_context_info: Get cgroup memberships, personality and
capability bounding set from /proc for a given process.
- lxc_proc_free_context_info: Free the data structure responsible
- lxc_attach_proc_to_cgroups: Add the process specified by the pid
parameter to the cgroups given by the ctx parameter.
- lxc_attach_drop_privs: Drop capabilities to the capability mask given in
the ctx parameter.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Add the function lxc_config_parse_arch that parses an architecture string
(x86, i686, x86_64, amd64) and returns the corresponding personality. This
is required for lxc-attach, which accepts architectures independently of
lxc.arch. The parsing of lxc.arch now also uses the same function to ensure
consistency.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
lxc-attach needs to be able to attach a process to specific cgroup, so
cgroup_attach is renamed to lxc_cgroup_attach and now also defined in the
header file.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
lxc-attach functionality reads /proc/init_pid/cgroup to determine the cgroup
of the container for a given subsystem. However, since subsystems may be
mounted together, we want to be on the safe side and be sure that we really
find the correct mount point, so we allow get_cgroup_mount to check for
*all* the subsystems; the subsystem parameter may now be a comma-separated
list.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
lxc.cap.drop now also accepts numeric values for capabilities. This allows
the user to specify capabilities LXC doesn't know about yet or capabilities
that were not part of the kernel headers LXC was compiled against.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The function lxc_caps_last_cap() determines CAP_LAST_CAP of the current kernel
dynamically. It first tries to read /proc/sys/kernel/cap_last_cap. If that
fails, because the kernel does not support this interface yet, it loops
through all capabilities and tries to determine whether the current capability
is part of the bounding set. The first capability for which prctl() fails is
considered to be CAP_LAST_CAP.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
This patch is to correct the manipulation of signal masks when
installing signal handlers for lxc-init.
Signed-off-by: Jian Xiao <jian@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
All the signals (except fatal ones) are redirected to signalfd at lxc_init,
so the LXC_TTY_HANDLERs are redundant. This patch removes them.
Signed-off-by: Jian Xiao <jian@linux.vnet.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
This lxc-monitor limitation deserves some lines in the manpage, until
something is done to allow several monitors to run concurrently.
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
A typical usage is to start lxc-monitor in popen() and parse the ouput.
Unfortunately, glibc defaults to block buffering for pipes and you may
have to wait several lines before anything is written to stdout... this
prevent the use of lxc-monitor to implement automatons. Let's go line
buffered !
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Particularly for LTS releases, which many people will want to use in
their containers, it is not wise to not use release-security and
release-updates. Furthermore the fix allowing ssh to allow the container
to shut down is in lucid-updates only.
With this patch, after debootstrapping a container, we add -updates and
-security to sources.list and do an upgrade under chroot. Unfortunately
we need to do this because debootstrap doesn't know how to.
Changelog:
Nov 14: as Stéphane Graber suggested, make sure no daemons start on
the host while doing dist-upgrade from chroot.
Nov 15: use security.ubuntu.com, not mirror. (stgraber)
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
When the cgroup is not mounted, we silently exit without giving
some clues to the user with what is happening.
Give some info and an explicit error.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
When used in conjunction with a bridge, veth devices with random addresses
may change the mac address of the bridge itself if the mac address of the
interface newly added is numerically lower than the previous mac address
of the bridge. This is documented kernel behavior. To avoid changing the
host's mac address back and forth when starting and/or stopping containers,
this patch ensures that the high byte of the mac address of the veth
interface visible from the host side is set to 0xfe.
A similar logic is also implemented in libvirt.
Fixes SF bug #3411497
See also: <http://thread.gmane.org/gmane.linux.kernel.containers.lxc.general/2709>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Allow mknod (fixing udev upgrades) and drop mac_override and mac_admin
from lxc.cap.drop as apparmor has/will have support for namespaces
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
To avoid name collisions between local and system header
files. For example, if you try to include the <pty.h>
system file, you end up including the one from lxc...
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The "" notation is preferrable if the header file is local.
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The hardcoded URL seems to be broken and 404 error was not
checked. Now the mirror is selected from mirrorlist (instead of
hardcoding to funet.fi) and fetch errors are checked.
Also added a retry loop (with 3 tries) to find a working mirror, since
some of the mirrors are not OK.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
There is no i686 variant of Fedora, but Ubuntu seems to return i686
from the arch command.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
The text says that 14 is default, but release=14 was not set anywhere
in the script.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
It prevents containers from getting a good resolv.conf without doing
ifdown eth0; ifup eth0.
(see pad.lv/880020)
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
This patch adds a private argument to extend the struct
lxc_arguments. This is useful to develop custom lxc commands
outside mainline lxc.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>