1. deeper hierarchy has steep performance costs
2. init may be under /init, but containers should be under /lxc
3. in a nested container we like to bind-mount $cgroup_path/$c/$c.real
into $cgroup_path - but task 1's cgroup is $c/$c.real, so a nested
container would be in $c/$c.real/lxc, which would become
/$c/$c.real/$c/$c.real/lxc when expanded
4. this pulls quite a bit of code (of mine) which is always nice
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
The kernel requires a single atomic write for setting the /proc
idmap files. We were calling write(2) more than once when multiple
ranges were configured so instead build a buffer to pass in one write(2)
call.
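
The single-write approach can be sketched as follows. This is only an
illustration in Python, not the C code from the patch; the helper names
build_idmap_buffer and write_idmap are hypothetical. It assumes the usual
"inside outside count" line format of the /proc idmap files.

```python
import os


def build_idmap_buffer(ranges):
    """Join all (id_inside, id_outside, count) ranges into one buffer."""
    lines = [f"{inside} {outside} {count}\n" for inside, outside, count in ranges]
    return "".join(lines).encode("ascii")


def write_idmap(path, ranges):
    """Write every range with exactly one write(2) call."""
    buf = build_idmap_buffer(ranges)
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.write(fd, buf)  # one call for the whole map
        if written != len(buf):
            raise OSError(f"short write to {path}: {written} of {len(buf)} bytes")
    finally:
        os.close(fd)


# Two ranges end up in a single buffer.
demo = build_idmap_buffer([(0, 100000, 65536), (65536, 200000, 1000)])
```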
Change id types to unsigned long to handle large id mappings gracefully.
Fix max id in example comment.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
I remember a discussion about implementing a proper way to shut down
guests using different signals, so here's a patch proposal.
It allows specific signal numbers to be used to shut down guests
gracefully; for example, SIGRTMIN+4 starts poweroff.target in
systemd.
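
As a rough illustration of how a signal such as SIGRTMIN+4 could be turned
into a number, here is a Python sketch. The string format accepted by
resolve_signal is an assumption made for this example and is not taken from
the patch.

```python
import signal


def resolve_signal(spec):
    """Turn 'SIGTERM', 'SIGRTMIN+4' or '9' into a signal number.

    The accepted spellings are hypothetical; they only illustrate the idea.
    """
    spec = spec.strip()
    if spec.isdigit():
        return int(spec)
    name, plus, offset = spec.partition("+")
    base = getattr(signal, name.strip())  # AttributeError for unknown names
    return int(base) + (int(offset) if plus else 0)


# The resulting number could then be sent to the container's init,
# for example with os.kill(init_pid, resolve_signal("SIGRTMIN+4")).
poweroff_signal = resolve_signal("SIGRTMIN+4")
```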
Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This fixes some issues found by Oracle QA, including several cosmetic
errors seen during container bootup.
The rpm database needs moving on Debian hosts, as it does on Ubuntu.
I took Serge's suggestions: Do the yum install in an unshared
mount namespace so the /proc mount done during OL4 install doesn't
pollute the host. No need to blacklist ipv6 modules.
Make the default release 6.3, unless the host is OL, then default
to the same version as the host (same as Ubuntu template does).
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
The id ordering and case of u,g is also consistent with uidmapshift,
reducing confusion.
doc: Moved example to the EXAMPLES section, and used values
corresponding to the defaults in the pending shadow-utils subuid patch.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Debian 5.0 Lenny went out of support on the 6th of February 2012.
From now on, the only supported Debian template is lxc-debian.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
1. if there's no rootfs, return -2, not 0.
2. don't close pinfd unconditionally in do_start().
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: David Ward <david.ward@ll.mit.edu>
This should eventually make the source releases available on sourceforge
also contain the tests.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
If we're not attaching to the mount ns, then don't enter the
container's apparmor policy. Since we're running binaries from the host
and not the container, that actually seems the sane thing to do (besides
also the lazier thing).
If we don't apply this patch, then we will need to move the apparmor attach
past the procfs remount, will need to also mount securityfs if available,
and for the !remount_proc_sys case we'll want to mount those just long
enough to do the apparmor transition.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
When attaching to a container with a user namespace, try to detect the
user and group ids of init via /proc and attach as that same user. Only
if that is unsuccessful, fall back to (0, 0).
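
A sketch of the lookup with its (0, 0) fallback, in Python rather than the
C of the patch. The message only says "via /proc"; reading the Uid: and
Gid: lines of /proc/<pid>/status is an assumption here, and both function
names are hypothetical.

```python
def parse_ids_from_status(status_text):
    """Return (uid, gid) from the text of a /proc/<pid>/status file.

    Falls back to (0, 0) when either line is missing or malformed.
    """
    uid = gid = None
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            if fields[0] == "Uid:":
                uid = int(fields[1])  # first value is the real uid
            elif fields[0] == "Gid:":
                gid = int(fields[1])  # first value is the real gid
        except ValueError:
            return (0, 0)
    if uid is None or gid is None:
        return (0, 0)
    return (uid, gid)


def init_ids(init_pid):
    """Read the ids of the given init pid, or (0, 0) if /proc is unreadable."""
    try:
        with open(f"/proc/{init_pid}/status") as status:
            return parse_ids_from_status(status.read())
    except OSError:
        return (0, 0)


sample = "Name:\tinit\nUid:\t1000\t1000\t1000\t1000\nGid:\t1001\t1001\t1001\t1001\n"
```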
Signed-off-by: Christian Seiler <christian@iwakd.de>
If getpwuid() fails, the fallback of spawning a 'getent' process also
fails, and the user specified no command to execute, default to
/bin/sh and only fail if even that is not available. This should ensure
that unless the container is *really* weird, no matter what, the user
should always end up with a shell when calling lxc-attach with no
further arguments.
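
The fallback order described above could look roughly like this. It is a
Python sketch with a hypothetical pick_shell helper; the lookups are passed
in as callables so the ordering is visible, which is not how the C code is
structured.

```python
import os


def pick_shell(lookups, fallback="/bin/sh"):
    """Try each lookup in order, then fall back to /bin/sh.

    Each lookup is a callable returning a shell path or None; it stands in
    for getpwuid() or the getent helper. Fail only if even the fallback is
    not executable.
    """
    for lookup in lookups:
        try:
            shell = lookup()
        except (KeyError, OSError):
            shell = None  # treat a failed lookup like "no answer"
        if shell:
            return shell
    if os.access(fallback, os.X_OK):
        return fallback
    raise FileNotFoundError(f"no usable shell, not even {fallback}")


def failing_lookup():
    raise KeyError("uid not found")


chosen = pick_shell([failing_lookup, lambda: None, lambda: "/bin/bash"])
```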
Signed-off-by: Christian Seiler <christian@iwakd.de>
If no command is specified, and using getpwuid() to determine the login
shell fails, try to spawn a process that executes the utility 'getent'.
getpwuid() may fail because of incompatibilities between the NSS
implementations on the host and in the container.
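
For illustration, a Python sketch of asking getent for the login shell. The
helper names are hypothetical. Unlike the patch, which spawns getent for
the container being attached to, this sketch simply runs it wherever it is
called.

```python
import subprocess


def shell_from_passwd_line(line):
    """Return the shell (7th field) of a passwd-format line, or None."""
    fields = line.rstrip("\n").split(":")
    if len(fields) < 7 or not fields[6]:
        return None
    return fields[6]


def shell_via_getent(uid):
    """Run 'getent passwd <uid>' and extract the shell, or return None."""
    try:
        result = subprocess.run(
            ["getent", "passwd", str(uid)],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return None  # getent is not installed or cannot be executed
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return shell_from_passwd_line(lines[0]) if lines else None


example = shell_from_passwd_line("root:x:0:0:root:/root:/bin/bash\n")
```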
Signed-off-by: Christian Seiler <christian@iwakd.de>
Add a monitor command to get the cgroup for a running container. This
allows container r1 started from /var/lib/lxc and container r1 started
from /home/ubuntu/lxcbase to pick unique cgroup directories (which
will be /sys/fs/cgroup/$subsys/lxc/r1 and .../r1-1), and all the lxc-*
tools to get that path over the monitor at lxcpath.
Rework the cgroup code. Before, if /sys/fs/cgroup/$subsys/lxc/r1
already existed, it would be moved to 'deadXXXXX', and a new r1 created.
Instead, if r1 exists, use r1-1, r1-2, etc.
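
The r1, r1-1, r1-2 naming can be sketched like this in Python. The function
name is hypothetical, and using mkdir itself to detect an existing
directory is a choice made for this example, not necessarily what the C
code does.

```python
import os
import tempfile


def create_unique_cgroup_dir(parent, name):
    """Create parent/name, or parent/name-1, name-2, ... if taken."""
    suffix = 0
    while True:
        candidate = name if suffix == 0 else f"{name}-{suffix}"
        path = os.path.join(parent, candidate)
        try:
            os.mkdir(path)  # fails if the directory already exists
            return path
        except FileExistsError:
            suffix += 1


# Three containers named r1 get three distinct directories.
with tempfile.TemporaryDirectory() as parent:
    names = [
        os.path.basename(create_unique_cgroup_dir(parent, "r1"))
        for _ in range(3)
    ]
```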
I ended up removing both the use of cgroup.clone_children and support
for ns cgroup. Presumably we'll want to put support for ns cgroup
back in for older kernels. Instead of guessing whether or not we
have clone_children support, just always explicitly do the only thing
that feature buys us - set cpuset.{cpus,mems} for newly created cgroups.
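
A sketch of setting cpuset.{cpus,mems} explicitly, in Python. Copying the
values from the parent cgroup is an assumption made for this example, and
init_cpuset_from_parent is a hypothetical name; the demo uses plain files
in a temporary directory instead of a real cgroup mount.

```python
import os
import tempfile


def init_cpuset_from_parent(cgroup_dir):
    """Copy cpuset.cpus and cpuset.mems from the parent directory."""
    parent = os.path.dirname(cgroup_dir.rstrip("/"))
    for knob in ("cpuset.cpus", "cpuset.mems"):
        with open(os.path.join(parent, knob)) as src:
            value = src.read().strip()
        with open(os.path.join(cgroup_dir, knob), "w") as dst:
            dst.write(value)


# Demo against ordinary files standing in for the cgroup knobs.
with tempfile.TemporaryDirectory() as root:
    for knob, value in (("cpuset.cpus", "0-3\n"), ("cpuset.mems", "0\n")):
        with open(os.path.join(root, knob), "w") as f:
            f.write(value)
    child = os.path.join(root, "r1")
    os.mkdir(child)
    init_cpuset_from_parent(child)
    with open(os.path.join(child, "cpuset.cpus")) as f:
        child_cpus = f.read()
    with open(os.path.join(child, "cpuset.mems")) as f:
        child_mems = f.read()
```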
Note that upstream kernel is working toward strict hierarchical
limit enforcements, which will be good for us.
NOTE - I am changing the lxc_answer struct size. This means that
upgrading to this version while containers are running will cause
lxc_* commands on already-running containers to fail.
Changelog: (v3)
implement cgroup attach
fix a subtle bug arising when lxc_get_cgpath() returned
STOPPED rather than -1 (STOPPED is 0, and 0 meant success).
Rename some functions and add detailed comments above most.
Drop all my lxc_attach changes in favor of those by Christian
Seiler (which are mostly the same, but improved).
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
As Kees pointed out, write() errors can be delayed and returned as
close() errors. So don't ignore error on close when writing the
userns id mapping.
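
In Python terms the idea looks roughly like the sketch below, where a
failing close surfaces as an OSError instead of being dropped. The helper
name is hypothetical, and the C code checks the return value of close()
rather than catching an exception.

```python
import os


def write_and_close(path, data):
    """Write data and treat an error from close as a failed write."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.write(fd, data)
    except OSError:
        try:
            os.close(fd)
        except OSError:
            pass  # the write error is the one worth reporting
        raise
    os.close(fd)  # not wrapped: a delayed write error propagates from here
    if written != len(data):
        raise OSError(f"short write to {path}")
    return written


mapping = b"0 100000 65536\n"
count = write_and_close(os.devnull, mapping)
```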
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When you clone a new user_ns, the child cannot write to the fds
opened by the parent. Handle this by doing an extra fork. The
grandparent hangs around and waits for its child to tell it the
pid of the grandchild, which will be the one attached to the
container. The grandparent then moves the grandchild into the
right cgroup, then waits for the child who in turn is waiting on
the grandchild to complete.
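
The process layout can be sketched in Python as below. This is only an
illustration with a hypothetical run_in_grandchild helper: it passes the
grandchild pid over a pipe and leaves out the namespace and cgroup steps
entirely. Real code would also need to keep the grandchild from running
ahead before it has been moved into the cgroup.

```python
import os
import struct

PID_FORMAT = "i"


def run_in_grandchild(payload):
    """Run payload() in a grandchild; return (grandchild_pid, exit_code)."""
    pipe_r, pipe_w = os.pipe()
    child = os.fork()
    if child == 0:
        # Intermediate child: fork again, report the pid, wait, exit.
        code = 1
        try:
            os.close(pipe_r)
            grandchild = os.fork()
            if grandchild == 0:
                # Grandchild: this is the process that would be attached.
                os.close(pipe_w)
                try:
                    payload()
                except BaseException:
                    os._exit(1)
                os._exit(0)
            os.write(pipe_w, struct.pack(PID_FORMAT, grandchild))
            os.close(pipe_w)
            _, status = os.waitpid(grandchild, 0)
            code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1
        finally:
            os._exit(code)

    # Grandparent: learn the grandchild pid from the child.
    os.close(pipe_w)
    data = os.read(pipe_r, struct.calcsize(PID_FORMAT))
    os.close(pipe_r)
    if len(data) != struct.calcsize(PID_FORMAT):
        os.waitpid(child, 0)
        raise OSError("child exited without reporting the grandchild pid")
    (grandchild_pid,) = struct.unpack(PID_FORMAT, data)
    # The cgroup move for grandchild_pid would happen at this point.
    _, status = os.waitpid(child, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1
    return grandchild_pid, exit_code


def failing_payload():
    raise RuntimeError("payload failed")


ok_pid, ok_code = run_in_grandchild(lambda: None)
bad_pid, bad_code = run_in_grandchild(failing_payload)
```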
Secondly, when attaching to a new user namespace, your old uid is
not valid, so you are uid -1. This patch simply does setgid+setuid
to 0 if that is the case. We probably want to be smarter, but
for now this allows lxc-attach to work.
Signed-off-by: Christian Seiler <christian@iwakd.de>
This patch enables lxc-attach to join the profile of the container it
is attaching to. Builds/runs fine with apparmor enabled and disabled.
Export new aa_get_profile(), and use it for attach_apparmor, but also
handle profile names longer than 100 chars in lxc_start apparmor
support.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
The python api test script was using @LXCPATH@ for one of its checks.
Now that the lxcpath is exposed by the lxc python module directly, this
can be dropped and api_test.py can now become a simple python file without
needing pre-processing by autoconf.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Add initial support for showing and querying nested containers.
This is done through a new --nesting argument to lxc-ls and uses
lxc-attach to go look for sub-containers.
Known limitations include the dependency on setns support for the PID
and NETWORK namespaces and the assumption that LXCPATH for the sub-containers
matches that of the host.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This is an adaptation for systemd. We also add network configuration support.
Jiri Slaby: cleanups, rebase
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
- create /etc/hostname as symlink to /etc/HOSTNAME
- fix inadequate space in lxc.mount config, preventing lxc-clone from working
Jiri Slaby: some cleanups
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Got link error: liblxc.so: undefined reference to `clock_gettime'.
clock_gettime is used by lxclock.c and is in librt, or bionic libc.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Add a --host parameter to lxc-ps that hides all processes running
in containers.
Signed-off-by: Guido Jäkel <G.Jaekel@dnb.de>
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This adds -P/--lxcpath to the various python scripts.
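
A minimal argparse sketch of such an option. The default path and the
build_parser helper are assumptions for illustration; the actual scripts
may obtain their default differently.

```python
import argparse

# Assumed default for this example only.
DEFAULT_LXCPATH = "/var/lib/lxc"


def build_parser():
    """Build a parser that accepts -P/--lxcpath next to -n/--name."""
    parser = argparse.ArgumentParser(description="example lxc script")
    parser.add_argument(
        "-P",
        "--lxcpath",
        default=DEFAULT_LXCPATH,
        help="use a container path other than the default",
    )
    parser.add_argument("-n", "--name", help="container name")
    return parser


args = build_parser().parse_args(["-n", "r1", "-P", "/home/ubuntu/lxcbase"])
```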
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
We've been shipping those two hooks for a while in Ubuntu.
Yesterday I reworked them to use the new environment variables and
avoid hardcoding any path that we have available as a variable.
I tested both to work on Ubuntu 13.04 but they should work just as well
on any distro shipping with the cgroup hierarchy in /sys/fs/cgroup and
with ecryptfs available.
These are intended as examples and distros are free to drop them. They
should, however, work without any change, at least on Ubuntu.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Ok, took a look, what happened was the clearenv calls used to be
in lxc_start and lxccontainer and lxc_execute (the lxc_start() callers)
themselves. I moved those into do_start(), but the calls in
lxccontainer.c were never removed.
They should simply be removed altogether. Trivial patch follows.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This commit tweaks the layout of the config file for the Ubuntu templates.
With this, we now get a clear network config group, then a path related group,
then a bunch of random config options and the end of the config is apparmor,
capabilities and cgroups.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This is needed for lxc_wait and lxc_monitor to handle lxcpath. However,
the full path name is limited to 108 bytes. Should we use a md5sum of
the lxcpath instead of the path itself?
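
To make the md5sum question concrete, here is a Python sketch of a
fixed-length abstract socket name derived from the lxcpath. The name layout
and the monitor_sock_name helper are hypothetical; this is not what the
patch implements.

```python
import hashlib

SUN_PATH_MAX = 108  # size limit mentioned above


def monitor_sock_name(lxcpath):
    """Build an abstract socket name whose length does not depend on lxcpath."""
    digest = hashlib.md5(lxcpath.encode("utf-8")).hexdigest()
    # A leading NUL byte marks the abstract namespace on Linux.
    name = b"\0lxc/" + digest.encode("ascii") + b"/monitor"
    if len(name) > SUN_PATH_MAX:
        raise ValueError("socket name too long")
    return name


short_name = monitor_sock_name("/var/lib/lxc")
long_name = monitor_sock_name("/home/ubuntu/" + "x" * 200)
```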
In any case, with this patch, lxc-wait and lxc-monitor work right with
respect to multiple lxcpaths.
The lxcpath is added to the lxc_handler to make it available most of the
places we need it.
I also remove function prototypes in monitor.h for two functions which
are not defined or used anywhere.
TODO: make cgroups tolerate multiple same-named containers.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Use AC_SEARCH_LIBS to detect what library provides sem_*.
This allows us to stop hardcoding the ld arguments in the various Makefiles.
Suggested-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
For the lxc-* C binaries, introduce a -P|--lxcpath command line option
to override the system default.
With this, I can run
    lxc-create -t ubuntu -n r1
    lxc-create -t ubuntu -n r1 -P /home/ubuntu/lxcbase
    lxc-start -n r1 -d
    lxc-start -n r1 -d -P /home/ubuntu/lxcbase
    lxc-console -n r1 -d -P /home/ubuntu/lxcbase
    lxc-stop -n r1
all working with the right containers (modulo cgroup stuff).
To do:
* lxc monitor needs to be made to handle cgroups.
This is another very invasive one. I started doing this as
a part of this set, but that gets hairy, so I'm sending this
separately. Note that lxc-wait and lxc-monitor don't work
without this, and there may be niggles in what I said works
above - since start.c is doing lxc_monitor_send_state etc
to the shared abstract unix domain socket.
* Need to handle the cgroup conflicts.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Replace deprecated AM_CONFIG_HEADER with AC_CONFIG_HEADERS.
This is needed for automake-1.13.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
For lxc-ls without --active, only output a directory in lxc_path if it
contains a file named config. This avoids extra directories that may
exist in lxc_path, for example .snapshot if lxc_path is an nfs mount.
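
The filtering rule amounts to something like the Python sketch below. The
function name is hypothetical, and lxc-ls itself may be implemented
differently.

```python
import os
import tempfile


def list_defined_containers(lxc_path):
    """Return the entries of lxc_path that contain a file named config."""
    names = []
    for entry in sorted(os.listdir(lxc_path)):
        if os.path.isfile(os.path.join(lxc_path, entry, "config")):
            names.append(entry)
    return names


# Demo: r1 has a config file, .snapshot does not and is skipped.
with tempfile.TemporaryDirectory() as lxc_path:
    os.mkdir(os.path.join(lxc_path, "r1"))
    with open(os.path.join(lxc_path, "r1", "config"), "w") as f:
        f.write("lxc.utsname = r1\n")
    os.mkdir(os.path.join(lxc_path, ".snapshot"))
    found = list_defined_containers(lxc_path)
```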
For lxc-ls with --active, don't output . if there are no active
containers.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>