Commit Graph

3097 Commits

Author SHA1 Message Date
Stéphane Graber
6e39e4cbff Enable default seccomp profile for all distros
This updates the common config to include Serge's seccomp profile by
default for privileged containers.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-07-01 23:41:11 -04:00
hallyn
616d626b4e Merge pull request #244 from xose/btrfs
lxc-ubuntu: use btrfs subvolumes and snapshots
2014-06-30 16:18:35 -05:00
Jesse Tane
f2f545857c Apparmor: allow hugetlbfs mounts everywhere
Signed-off-by: Jesse Tane <jesse.tane@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-30 17:06:52 -04:00
Stéphane Graber
b4c1e35d24 Cast to gid_t to fix android build failure
stat.st_gid is unsigned long in bionic instead of the expected gid_t, so
just cast it to gid_t.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-30 15:19:19 -04:00
TAMUKI Shoichi
8b227008f6 Fix to work lxc-destroy with unprivileged containers on recent kernel
Change idmap_add_id() to add both ID_TYPE_UID and ID_TYPE_GID entries
to an existing lxc_conf, not just an ID_TYPE_UID entry, so as to work
lxc-destroy with unprivileged containers on recent kernel.

Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
Acked-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-30 12:25:13 -04:00
TAMUKI Shoichi
7b50c609e4 Fix to work lxc-start with unprivileged containers on recent kernel
Change chown_mapped_root() to map in both the root uid and gid, not
just the uid, so as to work lxc-start with unprivileged containers on
recent kernel.

Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-30 12:24:48 -04:00
Alexander Vladimirov
3fdd0ca89e Don't call sig_name twice, use pointer instead
Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-27 15:23:48 -04:00
Serge Hallyn
1070908147 cgm_get: make sure @value is null-terminated
Previously this was done by strncpy, but now we just read
the len bytes - not including \0 - from a pipe, so pre-fill
@value with 0s to be safe.

This fixes the python3 api_test failure.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-27 13:51:33 -05:00
Serge Hallyn
d4ff964559 cgmanager: have cgm_set and cgm_get use absolute path when possible
This allows users to get/set cgroup settings when logged into a different
session than that from which they started the container.

There is no cgmanager command to do an _abs variant of cgmanager_get_value
and cgmanager_set_value.  So we fork off a new task, which enters the
parent cgroup of the started container, then can get/set the value from
there.  The reason not to go straight into the container's cgroup is that
if we are freezing the container, or the container is already frozen, we'll
freeze as well :)  The reason to fork off a new task is that if we are
in a cgroup which is set to remove-on-empty, we may not be able to return
to our original cgroup after making the change.

This should fix https://github.com/lxc/lxc/issues/246

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-27 13:08:54 -04:00
Alexander Vladimirov
23cc88bae0 lxc-archlinux.in: update securetty when lxc.devttydir is set
Update container's /etc/securetty to allow console logins when lxc.devttydir is not empty.
Also use config entries provided by shared and common configuration files.

Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-27 13:05:39 -04:00
Alexander Vladimirov
99cbd2996b lxc-archlinux.in: Add pacman keyring initialization back
Shuffle around usage text a bit and add missing -d while there.

Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-27 13:03:57 -04:00
Stéphane Graber
0d7cf7e9da attach: Fix querying for the current personality
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-25 15:34:46 -04:00
Stéphane Graber
5b99af0079 Reduce duplication in new style configs
This is a rather massive cleanup of config/templates/*

As new templates were added, I've noticed that we pretty much all share
the tty/pts configs, some capabilities being dropped and most of the
cgroup configuration. All the userns configs were also almost identical.

As a result, this change introduces two new files:
 - common.conf.in
 - userns.conf.in

Each is included by the relevant <template>.<type>.conf.in templates,
this means that the individual per-template configs are now overlays on
top of the default config.

Once we see a specific key becoming popular, we ought to check whether
it should also be applied to the other templates and if more than 50% of
the templates have it set to the same value, that value ought to be
moved to the master config file and then overriden for the templates
that do not use it.

This change while pretty big and scary, shouldn't be very visible from a
user point of view, the actual changes can be summarized as:
 - Extend clonehostname to work with Debian based distros and use it for
   all containers.
 - lxc.pivotdir is now set to lxc_putold for all templates, this means
   that instead of using /mnt in the container, lxc will create and use
   /lxc_putold instead. The reason for this is to avoid failures when the
   user bind-mounts something else on top of /mnt.
 - Some minor cgroup limit changes, the main one I remember is
   /dev/console now being writable by all of the redhat based containers.
   The rest of the set should be identical with additions in the per-distro
   ones.
 - Drop binfmtmisc and efivars bind-mounts for non-mountall based
   unpriivileged containers as I assumed they got those from copy/paste
   from Ubuntu and not because they actually need those entries. (If I'm
   wrong, we probably should move those to userns.conf then).

Additional investigation and changes to reduce the config delta between
distros would be appreciated. In practice, I only expect lxc.cap.drop
and lxc.mount.entry to really vary between distros (depending on the
init system, the rest should be mostly common.

Diff from the RFC:
 - Add archlinux to the mix
 - Drop /etc/hostname from the clone hook

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-24 16:40:48 -04:00
Alexander Vladimirov
fd986e0874 Prevent write_config from corrupting container config
write_config doesn't check the value sig_name function returns,
this causes write_config to produce corrupted container config when
using non-predefined signal names.

Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-24 16:18:33 -04:00
Alexander Vladimirov
c194ffc100 Update Arch Linux template and add common configuration files
Move common container configuration entries into template config.
Remove unnecessary service symlinking and configuration entries, as well as
guest configs and other redundant configuration, fix minor script bugs.
Clean up template command line, add -d option to allow disabling services.
Also enable getty's on all configured ttys to allow logins via lxc-console,
set lxc.tty value corresponding to default Arch /etc/securetty configuration.

This patch simplifies Arch Linux template a bit, while fixing some
longstanding issues. It also provides common configuration based on
files provided for Fedora templates.

Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-24 16:00:31 -04:00
KATOH Yasufumi
f36062dc50 doc: Update Japanese lxc.container.conf(5) for lxc.cap.keep = none
Update for commit 7035407

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-24 16:00:25 -04:00
Dwight Engen
e78884c958 don't build init.lxc.static if libcap.a isn't available
Note that building init.lxc.static still requires a static libutil.a
and libpthread.a, but these are available on most distro's through
glibc-static.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-23 12:06:30 -04:00
Serge Hallyn
513e1502c8 coverity: avoid possible null deref
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-23 08:41:49 -05:00
Stéphane Graber
0cad52a113
Include ubuntu.priv.seccomp in dist tarball
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 18:08:11 -04:00
Serge Hallyn
214a98ef56 ubuntu containers: use a seccomp filter by default (v2)
Blacklist module loading, kexec, and open_by_handle_at (the cause of the
not-docker-specific dockerinit mounts namespace escape).

This should be applied to all arches, but iiuc stgraber will be doing
some reworking of the commonizations which will simplify that, so I'm
not doing it here.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 17:37:06 -04:00
Serge Hallyn
cd75548b25 seccomp: fix 32-bit rules
When calling seccomp_rule_add(), you must pass the native syscall number
even if the context is a 32-bit context.  So use resolve_name rather
than resolve_name_arch.

Enhance the check of /proc/self/status for Seccomp: so that we do not
enable seccomp policies if seccomp is not built into the kernel.  This
is needed before we can enable by-default seccomp policies (which we
want to do next)

Fix wrong return value check from seccomp_arch_exist, and remove
needless abstraction in arch handling.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 16:33:59 -04:00
Serge Hallyn
d58c6ad0a6 seccomp: support 'all' arch sections (plus bugfixes)
seccomp_ctx is already a void*, so don't use 'scmp_filter_ctx *'

Separately track the native arch from the arch a rule is aimed at.

Clearly ignore irrelevant architectures (i.e. arm rules on x86)

Don't try to load seccomp (and don't fail) if we are already
seccomp-confined.  Otherwise nested containers fail.

Make it clear that the extra seccomp ctx is only for compat calls
on 64-bit arch.  (This will be extended to arm64 when libseccomp
supports it).  Power may will complicate this (if ever it is supported)
and require a new rethink and rewrite.

NOTE - currently when starting a 32-bit container on 64-bit host,
rules pertaining to 32-bit syscalls (as opposed to once which have
the same syscall #) appear to be ignored.  I can reproduce that without
lxc, so either there is a bug in seccomp or a fundamental
misunderstanding in how I"m merging the contexts.

Rereading the seccomp_rule_add manpage suggests that keeping the seccond
seccomp context may not be necessary, but this is not something I care
to test right now.  If it's true, then the code could be simplified, and
it may solve my concerns about power.

With this patch I'm able to start nested containers (with seccomp
policies defined) including 32-bit and 32-bit-in-64-bit.

[ this patch does not yet add the default seccomp policy ]

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 14:35:36 -04:00
Dwight Engen
d74b6771c0 fix the expansion of libexecdir when not explicitly passed to configure
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 14:32:25 -04:00
Dwight Engen
e9aeeadec1 split -lcap and -lselinux out of LIBS
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-20 14:09:26 -04:00
Dwight Engen
7035407c96 allow lxc.cap.keep = none
Commit 1fb86a7c introduced a way to drop capabilities without having to
specify them all explicitly. Unfortunately, there is no way to drop them
all, as just specifying an empty keep list, ie:

    lxc.cap.keep =

clears the keep list, causing no capabilities to be dropped.

This change allows a special value "none" to be given, which will clear
all keep capabilities parsed up to this point. If the last parsed value
is none, all capabilities will be dropped.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-20 14:08:00 -04:00
Dwight Engen
58558042dc don't force dropping capabilities in lxc-init
Commit 0af683cf added clearing of capabilities to lxc-init, but only
after lxc_setup_fs() was done, likely so that the mounting done in
that routine wouldn't fail.

However, in my testing lxc_caps_reset() wasn't really effective
anyway since it did not clear the bounding set. Adding prctl
PR_CAPBSET_DROP in a loop from 0 to CAP_LAST_CAP would fix this, but I
don't think its necessary to forcefully clear all capabilities since
users can now specify lxc.cap.keep = none to drop all capabilities.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-20 14:07:56 -04:00
KATOH Yasufumi
99e616a668 doc: Update Japanese lxc-snapshot(1) for adding the description of destroy
Update for commit 18aa217

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 14:07:40 -04:00
Stéphane Graber
7be2c5ef3c
Fix typo in lxc_attach's usage
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-20 14:04:44 -04:00
Serge Hallyn
d021832111 clone: make sure to update the rootfs path in unexpanded conf
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-18 18:02:35 -05:00
Serge Hallyn
761d81cad8 travis warning: call the fn to clear policy alien statements (memleak)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-18 17:19:05 -05:00
Serge Hallyn
e60e630c4a snapshot test: make sure that external snapshot was really created
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 17:35:15 -04:00
Stéphane Graber
ce7aee4d91
lxc-download: Bump compat to 2 after OpenSUSE
OpenSUSE is now ready for the download template in the master branch,
however it's not going to be compatible with older LXC as they lack the
needed config files, so bump the compat level to 2 to indicate that the
current lxc-download can deal with the current OpenSUSE containers.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 17:33:10 -04:00
Serge Hallyn
18aa217bb1 snapshots: move snapshot directory
Originally we kept snapshots under /var/lib/lxcsnaps.  If a
separate btrfs is mounted at /var/lib/lxc, then we can't
make btrfs snapshots under /var/lib/lxcsnaps.

This patch moves the default directory to /var/lib/lxc/c/snaps.
If /var/lib/lxcsnaps already exists, then we continue to use that.

add c->destroy_with_snapshots() and c->snapshot_destroy_all()
API methods.  c->snashot_destroy_all() can be triggered from
lxc-snapshot using '-d ALL'.  There is no command to call
c->destroy_with_snapshots(c) as of yet.

lxclock: use ".$lxcname" for container lock files
that way we can use /run/lock/lxc/$lxcpath/$lxcname/snaps as a
directory when locking snapshots without having to worry about
/run/lock//lxc/$lxcpath/$lxcname being a file.

destroy: split off a container_destroy
container_destroy() doesn't check for snapshots, so snapshot_rename can
use it.  api_destroy() now does check for snapshots (previously it only
checked for fs - i.e. overlayfs/aufs - snapshots).

Add destroy to the manpage, as it was previously undocumented.

Update snapshot testcase accordingly.

[ rebased in the face of commits 840f05df and 7e36f87e. ]

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 16:28:39 -05:00
Serge Hallyn
3dbcf8b27b confile: fix a typo (s/len/str/) in my previous patch
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-06-18 16:27:18 -05:00
Serge Hallyn
4184c3e172 Store alien config lines
Any config lines not starting with 'lxc.*' are ignored by lxc.  That
can be useful for third party tools, however lxc-clone does not copy such
lines.

Fix that by tracking such lines in our unexpanded config file and
printing them out at write_config().  Note two possible shortcomings here:

1. we always print out all includes followed by all aliens.  They are
not kept in order, nor ordered with respect to lxc.* lines.

2. we're still not storing comments. these could easily be added to
the alien lines, but i chose not to in particular since comments are
usually associated with other lines, so destroying the order would
destroy their value.  I could be wrong about that, and if I am it's
a trivial fix.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 16:56:17 -04:00
Serge Hallyn
f979ac1592 Add a unexpanded lxc_conf
Currently when a container's configuration file has lxc.includes,
any future write_config() will expand the lxc.includes.  This
affects container clones (and snapshots) as well as users of the
API who make an update and then c.save_config().

To fix this, separately track the expanded and unexpanded lxc_conf.  The
unexpanded conf does not contain values read from lxc.includes.  The
expanded conf does.  Lxc functions mainly need the expanded conf to
figure out how to configure the container.  The unexpanded conf is used
at write_config().

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 16:56:14 -04:00
Michael H. Warfield
41cf1ac30d Updated lxc-opensuse for common configuration changes.
Updated the lxc-opensuse template for the changes for the common
configuration used by the download template.  Changed the default
network mode in the container to dhcp.

Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 16:56:11 -04:00
Serge Hallyn
52036991a0 seccomp: warn but continue on unresolvable syscalls
If a syscall is listed which is not resolvable, continue.  This allows
us to keep a more complete list of syscalls in a global seccomp policy
without having to worry about older kernels not supporting the newer
syscalls.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-18 16:56:04 -04:00
Leonid Isaev
08182d4452 bdev.c: initialize a pointer to avoid build failures with -Werror=maybe-uninitialized
Signed-off-by: Leonid Isaev <lisaev@umail.iu.edu>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-17 22:57:56 -04:00
José Martínez
654bf1af09 lxc-ubuntu: use btrfs subvolumes and snapshots
Try to create the cache rootfs as a btrfs subvolume, and use btrfs
snapshots to copy the rootfs if btrfs is selected as backing store.

Signed-off-by: José Martínez <xosemp@gmail.com>
2014-06-17 23:01:33 +02:00
Stéphane Graber
f44b73e189 lxc-autostart: Respect -P
-P was only used for log setup and not when retrieving the container list.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-17 11:55:29 -04:00
Stéphane Graber
f69fd24ea3 tests: Avoid the download template when possible
The use of the download template with an hardcoded --arch=amd64 in aa.c
was causing test failures on any platform incapable of running amd64
binaries.

This wasn't noticed in the CI environment as we run the tests within
containers on an amd64 kernel but this caused failures on the Ubuntu CI
environment.

Instead, let's use the busybox template, tweaking the configuration when
needed to match the needs of the testcase.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-14 15:48:55 -04:00
Stéphane Graber
6ebc050477 tests: Don't fail when HOME isn't defined
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-14 15:48:49 -04:00
Stéphane Graber
91e7b27880 tests: apparmor: Always end with a newline
Some error messages in lxc-test-apparmor didn't end with a newline,
leading to slightly difficult to read output.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-14 15:48:40 -04:00
Stéphane Graber
b38b62a6d4 cgfs: Log the whole cgroup path too
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-10 16:12:56 -04:00
Stéphane Graber
b7aa56b85c tests: Wait 5s for init to respond in lxc-test-autostart
lxc-test-autostart occasionaly fails at the restart test in the CI
environment. Looking at the current test case, the most obvious race
there is if lxc-wait exists succesfuly immediately after LXC marked the
container RUNNING (init spawned) but before init had a chance to setup
the signal handlers.

To avoid this potential race period, let's add a 5s delay between the
tests to give a chance for init to finish starting up.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-10 16:12:39 -04:00
Serge Hallyn
1c1c70514f container start: check for start hooks in container rootfs
Do so early enough that we can report a meaningful failure.

(This should fix https://github.com/lxc/lxc/issues/225)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-09 23:42:39 -04:00
Stéphane Graber
4e31246a25 python3: Fix crashes in snapshot()
This makes sure all PyObject structs are always initialized to NULL,
this will fix issues such as (issue #239).

Also add a snapshot/list/restore testcase to the python3 api test code.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-06-09 15:24:10 -04:00
KATOH Yasufumi
0f84d97e6d doc: Fix typo in lxc-autostart(1)
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-05 09:31:19 -04:00
KATOH Yasufumi
f57517ef96 doc: Update Japanese man pages for the description of boot and group handling
Update lxc-autostart(1) and lxc.container.conf(5) for commit 015f0dd.

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-06-05 09:31:17 -04:00