This adds new functionality to lxc-autostart.
*) The -g / --groups option is multiple cummulative entry.
This may be mixed freely with the previous comma separated
group list convention. Groups are processed in the
order they first appear in the aggregated group list.
*) The NULL group may be specified in the group list using either a
leading comma, a trailing comma, or an embedded comma.
*) Booting proceeds in order of the groups specified on the command line
then ordered by lxc.start.order and name collalating sequence.
*) Default host bootup is now specified as "-g onboot," meaning that first
the "onboot" group is booted and then any remaining enabled
containers in the NULL group are booted.
*) Adds documentation to lxc-autostart for -g processing order and
combinations.
*) Parameterizes bootgroups, options, and shutdown delay in init scripts
and services.
*) Update the various init scripts to use lxc-autostart in a similar way.
Reported-by: CDR <venefax@gmail.com>
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
AC_SEARCH_LIBS always places the library being queried into LIBS. We
don't want that - we were only checking whether a function is
available. Not everything (notably not init.lxc.static) needs to
link against -lcgmanager.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Move choose_init into utils.c so we can re-use it. Make it and on_path
accept an optional rootfs argument to prepend to the paths when checking
whether the file exists.
Also add lxc.init.static to .gitignore
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Changelog:
May 19: put init.lxc.static into container's root dir
rather than under SBINDIR [stgraber].
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
To avoid having to copy all the library dependencies into the container.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
If you attach a file to /dev/nbd0, it may take some time for /dev/nbd0p1
to show up. Allow up to 5 seconds in that case, then bail.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
It is not possible to mount a block device from a non-init user namespace.
Therefore if root on the host is starting a container with a uid
mapping, and the rootfs is a block device, then mount the rootfs before
we spawn the container init task.
This addresses https://github.com/lxc/lxc/issues/221
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Newer kernels optionally disallow reading /proc/$$/personality by
non-root users. We can get the personality through the lxc command
interface, so do so.
Also try to be more consistent about personality being a signed long.
We had it as int, unsigned long, signed long throughout the code.
(This addresses bug
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1322067 :
3.15.0-1.x breaks lxc-attach for unprivileged containers)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Otherwise the name=systemd cgroup isn't changed to one which
the lxc-unpriv user can write to, causing the test to fail.
This allows lxc-test-unpriv and lxc-test-usernic to pass when run in an
unprivileged container with cgmanager.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
When I converted attach and enter to using move_pid_abs, these needed
to use the new get_pid_cgroup_abs method to get an absolute path. But
for some inexplicable reason I also converted the functions which get
and set cgroup properties to use the absolute paths. These are simply
not compatible with the cgmanager set_value and get_value methods.
This breaks for instance lxc-test-cgpath.
So undo that. With this patch lxc-test-cgpath, lxc-test-autotest,
and lxc-test-concurrent once again pass in a nested container.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
For years it has been best practice to use a relative path as
the mount target. But the manpage hasn't reflect that. Fix it.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
backing stores supported by qemu-nbd can be attached to a nbd block
device using qemu-nbd. This user-space process (pair) stays around for
the duration of the device attachment. Obviously we want it to go away
when the container shuts down, but not before the filesystems have been
cleanly unmounted.
The device attachment is done from the task which will become the
container monitor before the container setup+init task is spawned.
That task starts in a new pid namespace to ensure that the qemu-nbd
process will be killed if need be. It sets its parent death signal
to sighup, and, on receiving sighup, attempts to do a clean
qemu-device detach, then exits. This should ensure that the
device is detached if the qemu monitor crashes or exits.
It may be worth adding a delay before the qemu-nbd is detached, but
my brief tests haven't seen any data corruption.
Only the parts required for running a nbd-backed container are
implemented here. Create, destroy, and clone are not. The first
use of this that I imagine is for people to use downloaded nbd-backed
images (like ubuntu cloud images, or anything previously used with
qemu). I imagine people will want to create/clone/destroy out of
band using qemu-img, but if I'm wrong about that we can implement
the rest later.
Because attach_block_device() is done before the bdev is initialized,
and bdev_init needs to know the nbd index so that it can mount the
filesystem, we now need to pass the lxc_conf.
file_exists() is moved to utils.c so we can use it from bdev.c
The nbd attach/detach should lay the groundwork for trivial implementation
of qed and raw images.
changelog (may 12): fix idx check at detach
changelog (may 15): generalize qcow2 to nbd
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
This is a fix to commit 5f2ea8cfcb.
Sorry, not sure how I missed this in testing the original patch.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
when using btrfs backend lxc-create first creates rootfs in /usr/lib/lxc/rootfs
directory before moving it to /var/lib/lxc or other directory supplied by the
command line. Archlinux template relied in $rootfs_path which made containers
created with btrfs backend have lxc.rootfs set to /usr/lib/lxc/rootfs. By using
$path instead of $rootfs_path we make sure that lxc.rootfs is always correct.
Signed-off-by: Edvinas Klovas <edvinas@pnd.io>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Don't spawn a getty on /dev/console when running under libvirt-lxc
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
On older cgmanager the support was broken. So rather than
fail container starts altogether, just keep the old lxc behavior
in this case by not using name= subsystems.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
archlinux is using systemd and systemd's configuration does not have any
services setup to handle sigpwr hook which is sent by lxc-stop command. By
enabling sigpwr service we make sure that lxc-stop will work.
Signed-off-by: Edvinas Klovas <edvinas@pnd.io>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
If an unprivileged user does 'lxc-start -n u1' in one
login session, followed by 'lxc-attach -n u1' in another
session, the attach will fail if the sessions are in different
cgroups. The same is true of lxc-cgroup commands.
Address this by using the GetPidCgroupAbs and MovePidAbs
which work with the containers' cgroup path relative to
the cgproxy.
Since GetPidCgroupAbs is new to api version 3 in cgmanager,
use the old method if we are on an older cgmanager.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: "S.Çağlar Onur" <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Read /proc/self/cgroup instead of /proc/cgroups, so as to catch
named subsystems. Otherwise the contaienrs will not be fully
moved into the container cgroups.
Also free line which was being leaked.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Do this by calling the bdev->destroy() hook from a user namespace
configured as the container's.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
btrfs subvolume ioctls are usable by unprivileged users, so allow
unprivileged containers to reside on btrfs.
This patch does not yet enable destroy.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
If the user specifies cgroup or cgroup-full without a specifier (:ro,
:rw or :mixed), this changes the behavior. Previously, these were
simple aliases for the :mixed variants; now they depend on whether the
container also has CAP_SYS_ADMIN; if it does they resolve to the :rw
variants, if it doesn't to the :mixed variants (as before).
If a container has CAP_SYS_ADMIN privileges, any filesystem can be
remounted read-write from within, so initially mounting the cgroup
filesystems partially read-only as a default creates a false sense of
security. It is better to default to full read-write mounts to show the
administrator what keeping CAP_SYS_ADMIN entails.
If an administrator really wants both CAP_SYS_ADMIN and the :mixed
variant of cgroup or cgroup-full automatic mounts, they can still
specify that explicitly; this commit just changes the default without
specifier.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Currently, setup_caps and dropcaps_except both use the same parsing
logic for parsing capabilities (try to identify by name, but allow
numerical specification). Since this is a common routine, separate it
out to improve maintainability and reuseability.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Ubuntu containers have had trouble with automatic cgroup mounting that
was not read-write (i.e. lxc.mount.auto = cgroup{,-full}:{ro,mixed}) in
containers without CAP_SYS_ADMIN. Ubuntu's mountall program reads
/lib/init/fstab, which contains an entry for /sys/fs/cgroup. Since
there is no ro option specified for that filesystem, mountall will try
to remount it readwrite if it is already mounted. Without
CAP_SYS_ADMIN, that fails and mountall will interrupt boot and wait for
user input on whether to proceed anyway or to manually fix it,
effectively hanging container bootup.
This patch makes sure that /sys/fs/cgroup is always a readwrite tmpfs,
but that the actual cgroup hierarchy paths (/sys/fs/cgroup/$subsystem)
are readonly if :ro or :mixed is used. This still has the desired
effect within the container (no cgroup escalation possible and programs
get errors if they try to do so anyway), while keeping Ubuntu
containers happy.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Set a base class for the network object and set the encoding in the
header. Neither of those changes are required for python3 but they do
make it easier for anyone trying to make a python2 binding.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Corrected a small oversight when locking related code was moved from
src/lxc/utils.c to src/lxc/lxclock.c.
Signed-off-by: Stephen M Bennett <stephen_m_bennett@hotmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When using --nesting, we exec ourselves in the container context, if we
somehow need to dynamically-load modules from there, things break. So
make sure we pre-load everything we may need.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>