If the user specifies cgroup or cgroup-full without a specifier (:ro,
:rw or :mixed), this changes the behavior. Previously, these were
simple aliases for the :mixed variants; now they depend on whether the
container also has CAP_SYS_ADMIN; if it does they resolve to the :rw
variants, if it doesn't to the :mixed variants (as before).
If a container has CAP_SYS_ADMIN privileges, any filesystem can be
remounted read-write from within, so initially mounting the cgroup
filesystems partially read-only as a default creates a false sense of
security. It is better to default to full read-write mounts to show the
administrator what keeping CAP_SYS_ADMIN entails.
If an administrator really wants both CAP_SYS_ADMIN and the :mixed
variant of cgroup or cgroup-full automatic mounts, they can still
specify that explicitly; this commit just changes the default without
specifier.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Currently, setup_caps and dropcaps_except both use the same parsing
logic for parsing capabilities (try to identify by name, but allow
numerical specification). Since this is a common routine, separate it
out to improve maintainability and reuseability.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Ubuntu containers have had trouble with automatic cgroup mounting that
was not read-write (i.e. lxc.mount.auto = cgroup{,-full}:{ro,mixed}) in
containers without CAP_SYS_ADMIN. Ubuntu's mountall program reads
/lib/init/fstab, which contains an entry for /sys/fs/cgroup. Since
there is no ro option specified for that filesystem, mountall will try
to remount it readwrite if it is already mounted. Without
CAP_SYS_ADMIN, that fails and mountall will interrupt boot and wait for
user input on whether to proceed anyway or to manually fix it,
effectively hanging container bootup.
This patch makes sure that /sys/fs/cgroup is always a readwrite tmpfs,
but that the actual cgroup hierarchy paths (/sys/fs/cgroup/$subsystem)
are readonly if :ro or :mixed is used. This still has the desired
effect within the container (no cgroup escalation possible and programs
get errors if they try to do so anyway), while keeping Ubuntu
containers happy.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Set a base class for the network object and set the encoding in the
header. Neither of those changes are required for python3 but they do
make it easier for anyone trying to make a python2 binding.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Corrected a small oversight when locking related code was moved from
src/lxc/utils.c to src/lxc/lxclock.c.
Signed-off-by: Stephen M Bennett <stephen_m_bennett@hotmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When using --nesting, we exec ourselves in the container context, if we
somehow need to dynamically-load modules from there, things break. So
make sure we pre-load everything we may need.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This makes sure we only query lxc.group once and then reuse that list
for filtering, listing groups and autostart.
When a container is auto-started only as part of a group, autostart will
now show by-group instead of yes.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
/sys/fs/cgroup is just a size-limited tmpfs, and making it ro does
nothing to affect our ability alter mount settings of its subdirs.
OTOH making it ro can upset mountall in the container which tries
to remount it rw, which may be refused.
So just don't do it.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Christian Seiler <christian@iwakd.de>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
There wasn't a good reason for that limit, we can simply make the code
slightly slower when --groups is passed and still have the expected
output even without --fancy.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
This introduces a new -g/--group argument to filter containers based on
their groups.
This supports the rather obvious: --group blah
Which will only list containers that are in group blah.
It may also be passed multiple times: --group blah --group bleh
Which will list containers that are in either (or both) blah or bleh.
And it also takes: --group blah,bleh --group doh
Which will list containers that are either in BOTH blah and bleh or in doh.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Michael H. Warfield <mhw@WittsEnd.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
lxc-init got moved into SBINDIR/init.lxc recently.
This broke sshd template because path wasn't updated there.
This patch should fix this issue.
Signed-off-by: Nikolay Martynov <mar.kolya@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
- Some scriptlets expect fstab to exist so create it before doing the
yum install
- Set the rootfs selinux label same as the hosts or else the PREIN script
from initscripts will fail when running groupadd utmp, which prevents
creation of OL4.x containers on hosts > OL6.x.
- Move creation of devices into a separate function
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
/proc/sys/kernel/sem* and /proc/sys/kernel/msg* are ipc sysctls
which are properly namespaced. Allow writes to them from
containers.
Reported-by: Dan Kegel <dank@kegel.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
According to Serge, we no longer need to keep cgmanager connection open.
As long as my tests go it seems to be working fine.
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When outputing the lxc.arch setting, use i686 instead of x86 since the
later is not a valid input to setarch, nor will the kernel output
UTS_MACHINE as x86. The kernel sets utsname.machine to i[3456]86, which
all map to PER_LINUX32.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This change accepts all the same strings for lxc.arch that setarch(8) does.
Note that we continue to parse plain x86 as PER_LINUX32 so as not to break
existing lxc configuration files.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Failures were being ignored, leading up to an eventual segfault.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This only converts punctuation marks from FULLWIDTH COMMA/FULL STOP to
IDEOGRAPHIC COMMA/FULL STOP in Japanese man pages. The contents of man
pages do not change at all.
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
I inadvertently introduced this with commit 8bf1e61e.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Check for symlinks before attempting create.
When attempting to create the compulsory symlinks in /dev,
check for the existence of the link using stat first before
blindly attempting to create the link.
This works around an apparent quirk in the kernel VFS on read-only
file systems where the returned error code might be EEXIST or EROFS
depending on previous access to the /dev directory and its entries.
Reported-by: William Dauchy <william@gandi.net>
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Tested-by: William Dauchy <william@gandi.net>
Originally we kept snapshots under /var/lib/lxcsnaps. If a
separate btrfs is mounted at /var/lib/lxc, then we can't
make btrfs snapshots under /var/lib/lxcsnaps.
This patch moves the default directory to /var/lib/lxc/lxcsnaps.
If /var/lib/lxcsnaps already exists, then use that. Don't allow
any container to be used with the name 'lxcsnaps'.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
If you 'ip netns add x1', this creates /run/netns and /run/netns/x1
as shared mounts. When a container starts, it umounts these after
pivot_root, and the umount is propagated to the host.
Worse, doing mount("", "/", NULL, MS_SLAVE|MS_REC, NULL) does not
suffice to change those, even after binding /proc/mounts onto
/etc/mtab.
So, I give up. Do this manually, walking over /proc/self/mountinfo
and changing the mount propagation on everything marked as shared.
With this patch, lxc-start no longer unmounts /run/netns/* on the
host.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
In the body of the manpage, replace a few errant 'fssize's with the
more appropriate word.
Reported-by: MegaBrutal <megabrutal@megabrutal.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
it actually sets us up to run the nih_mainloop, but we will never run
that.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This makes it so that the host doesn't need to have an old, compat
version of db43_load installed by using the db_load from the just
installed container. Some newer distributions do not even have an old
enough compat-db4 package available.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
With this change, you can install a container from a mounted .iso, or any
yum repo with the necessary packages. Unlike the --url option, the repo
does not need to be a mirror of public-yum, but the arch and release must
be specified. For example to install OL6.5 from an .iso image:
mount -o loop OracleLinux-R6-U5-Server-x86_64-dvd.iso /mnt
lxc-create -n OL6.5 -t oracle -- --baseurl=file:///mnt -a x86_64 -R 6.5
The template will create two yum .repo files within the container such that
additional packages can be installed from local media, or the container can
be updated from public-yum, whichever is available. Local media must be bind
mounted from the host onto the containers' /mnt for the former .repo to work:
mount --bind /mnt $LXCPATH/OL6.5/rootfs/mnt
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Recent fixes in the apparmor kernel code is now making at least the CI
environment and quite possibly some others fail due to an invalid path
in the pivot_root stanza.
So update both lines to allow a more generic pivot_root call for
anything in LXC's work directory.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
In this patch I tried to stick with each file's coding style, however I
think we should probably change that. Every main() should always not
return and only exit; they should always return EXIT_SUCCESS or EXIT_FAILURE
with the only exceptions being cases where we are returning a child's
exit status (lxc_execute, lxc_attach, lxc_init).
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
So that exit status doesn't show up as 255.
Reported-by: Andrey Khozov <avkhozov@googlemail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>