2861 Commits

Author SHA1 Message Date
Serge Hallyn
76a26f559f add support for nbd
Backing stores supported by qemu-nbd can be attached to an nbd block
device using qemu-nbd.  This user-space process (pair) stays around for
the duration of the device attachment.  Obviously we want it to go away
when the container shuts down, but not before the filesystems have been
cleanly unmounted.

The device attachment is done from the task which will become the
container monitor before the container setup+init task is spawned.
That task starts in a new pid namespace to ensure that the qemu-nbd
process will be killed if need be.  It sets its parent death signal
to sighup, and, on receiving sighup, attempts to do a clean
qemu-device detach, then exits.  This should ensure that the
device is detached if the qemu monitor crashes or exits.

It may be worth adding a delay before the qemu-nbd is detached, but
my brief tests haven't seen any data corruption.

Only the parts required for running an nbd-backed container are
implemented here.  Create, destroy, and clone are not.  The first
use of this that I imagine is for people to use downloaded nbd-backed
images (like ubuntu cloud images, or anything previously used with
qemu).  I imagine people will want to create/clone/destroy out of
band using qemu-img, but if I'm wrong about that we can implement
the rest later.

Because attach_block_device() is done before the bdev is initialized,
and bdev_init needs to know the nbd index so that it can mount the
filesystem, we now need to pass the lxc_conf.

file_exists() is moved to utils.c so we can use it from bdev.c

The nbd attach/detach should lay the groundwork for trivial implementation
of qed and raw images.

changelog (may 12): fix idx check at detach
changelog (may 15): generalize qcow2 to nbd

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
2014-05-16 09:58:03 -04:00
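
The parent-death-signal mechanism described in commit 76a26f559f can be sketched as follows. This is only an illustration in Python using ctypes, not the C code from the commit: the helper names and the `detach` callback are hypothetical, and the prctl option numbers are the ones from <linux/prctl.h>. It assumes a Linux host.

```python
import ctypes
import signal
import sys

# Option numbers from <linux/prctl.h>.
PR_SET_PDEATHSIG = 1
PR_GET_PDEATHSIG = 2

_libc = ctypes.CDLL(None, use_errno=True)


def set_parent_death_signal(sig):
    """Ask the kernel to deliver `sig` to this process when its parent dies."""
    zero = ctypes.c_ulong(0)
    rc = _libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(sig)), zero, zero, zero)
    if rc != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")


def get_parent_death_signal():
    """Return the currently configured parent death signal (0 if none)."""
    current = ctypes.c_int(0)
    zero = ctypes.c_ulong(0)
    rc = _libc.prctl(PR_GET_PDEATHSIG, ctypes.byref(current), zero, zero, zero)
    if rc != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_PDEATHSIG) failed")
    return current.value


def install_detach_on_parent_death(detach):
    """On SIGHUP, run the hypothetical `detach` callback, then exit.

    `detach` stands in for whatever performs the clean device detach.
    Must be called from the main thread, as signal.signal requires.
    """
    def handler(signum, frame):
        try:
            detach()
        finally:
            sys.exit(0)

    signal.signal(signal.SIGHUP, handler)
    set_parent_death_signal(signal.SIGHUP)
```

The handler is installed before the death signal is requested, so a parent that dies immediately afterwards still triggers the detach path.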
Dwight Engen
7e4ca1a21d lxc-oracle: export upstart environment variable for maygetty
This is a fix to commit 5f2ea8cfcb.
Sorry, not sure how I missed this in testing the original patch.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-16 09:56:03 -04:00
Edvinas Klovas
44464003ee archlinux template: fix lxc.rootfs for btrfs backend
When using the btrfs backend, lxc-create first creates the rootfs in the
/usr/lib/lxc/rootfs directory before moving it to /var/lib/lxc or another
directory supplied on the command line. The Archlinux template relied on
$rootfs_path, which made containers created with the btrfs backend have
lxc.rootfs set to /usr/lib/lxc/rootfs. By using $path instead of $rootfs_path
we make sure that lxc.rootfs is always correct.

Signed-off-by: Edvinas Klovas <edvinas@pnd.io>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-05-13 15:28:22 -04:00
Dwight Engen
5f2ea8cfcb lxc-oracle: add pts/[1-4] to securetty for libvirt-lxc
Don't spawn a getty on /dev/console when running under libvirt-lxc

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-12 22:59:47 -04:00
S.Çağlar Onur
f1a4a029f6 use same ifndef/define format for all headers
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-09 05:12:22 -05:00
Serge Hallyn
29b0b04b32 cgmanager: detect whether cgmanager supports name= subsystems
On older cgmanager the support was broken.  So rather than
fail container starts altogether, just keep the old lxc behavior
in this case by not using name= subsystems.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-08 12:57:03 -05:00
KATOH Yasufumi
58291e3a43 doc: Fix Japanese lxc.container.conf(5)
commit aafea1f was incomplete.

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-08 10:53:08 -05:00
Dwight Engen
0b648faeca python3: remove assert since hwaddr isn't set by the download template
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-07 08:54:41 -05:00
Dwight Engen
3cd4af2e86 install lxc-patch.py 644 to fix rpmlint warning
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-07 08:48:20 -05:00
Edvinas Klovas
31efc34cff archlinux template: added sigpwr handling to systemd (lxc-stop)
Arch Linux uses systemd, and systemd's configuration does not have any
services set up to handle the sigpwr signal sent by the lxc-stop command. By
enabling the sigpwr service we make sure that lxc-stop will work.

Signed-off-by: Edvinas Klovas <edvinas@pnd.io>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-07 08:38:36 -05:00
Serge Hallyn
25c7531cf0 cgmanager: use absolute cgroup path to switch cgroups at attach
If an unprivileged user does 'lxc-start -n u1' in one
login session, followed by 'lxc-attach -n u1' in another
session, the attach will fail if the sessions are in different
cgroups.  The same is true of lxc-cgroup commands.

Address this by using GetPidCgroupAbs and MovePidAbs, which work
with the containers' cgroup path relative to the cgproxy.

Since GetPidCgroupAbs is new to api version 3 in cgmanager,
use the old method if we are on an older cgmanager.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: "S.Çağlar Onur" <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-06 22:54:39 -05:00
Serge Hallyn
cbf0bae67c cgmanager: also handle named subsystems (like name=systemd)
Read /proc/self/cgroup instead of /proc/cgroups, so as to catch
named subsystems.  Otherwise the containers will not be fully
moved into the container cgroups.

Also free line which was being leaked.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-06 22:54:39 -05:00
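
Why reading /proc/self/cgroup catches named subsystems (commit cbf0bae67c) can be illustrated with a small parser. This is a sketch in Python, not the C code from the commit; the function name and the sample text are made up for illustration. Each line of /proc/self/cgroup has the form hierarchy-ID:controller-list:path, and a named hierarchy such as name=systemd appears in the controller-list field, whereas /proc/cgroups lists only kernel controllers.

```python
def subsystems_from_proc_self_cgroup(text):
    """Return every subsystem named in /proc/self/cgroup-style text.

    Named hierarchies (for example "name=systemd") are returned as-is,
    alongside ordinary controllers such as "memory".
    """
    subsystems = []
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # not a hierarchy-ID:controllers:path line
        for name in parts[1].split(","):
            # An empty controller list yields "", which is skipped.
            if name and name not in subsystems:
                subsystems.append(name)
    return subsystems


# Hypothetical sample content, for illustration only.
SAMPLE = (
    "4:cpu,cpuacct:/lxc/u1\n"
    "3:memory:/lxc/u1\n"
    "1:name=systemd:/user/1000.user\n"
)
```

On a live system the input would come from reading /proc/self/cgroup instead of SAMPLE.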
Serge Hallyn
44a706bdaf btrfs: support unprivileged destroy
Do this by calling the bdev->destroy() hook from a user namespace
configured as the container's.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-06 22:54:39 -05:00
Serge Hallyn
2659c7cbd5 btrfs: support unprivileged create and clone
btrfs subvolume ioctls are usable by unprivileged users, so allow
unprivileged containers to reside on btrfs.

This patch does not yet enable destroy.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-06 22:54:39 -05:00
Dwight Engen
391260dcb2 correct license on file to LGPL vs. GPL and fix address
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 16:52:23 -05:00
KATOH Yasufumi
aafea1f750 doc: Update lxc.container.conf(5) for improving lxc.mount.auto
Update for commit 0769b82

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 16:52:22 -05:00
KATOH Yasufumi
cf5f31286e doc: Update Japanese lxc.container.conf(5) for mounting /sys/fs/cgroup rw
Update for commit b46f055

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 16:52:21 -05:00
Christian Seiler
0769b82a42 lxc.mount.auto: improve defaults for cgroup and cgroup-full
If the user specifies cgroup or cgroup-full without a specifier (:ro,
:rw or :mixed), this changes the behavior. Previously, these were
simple aliases for the :mixed variants; now they depend on whether the
container also has CAP_SYS_ADMIN; if it does they resolve to the :rw
variants, if it doesn't to the :mixed variants (as before).

If a container has CAP_SYS_ADMIN privileges, any filesystem can be
remounted read-write from within, so initially mounting the cgroup
filesystems partially read-only as a default creates a false sense of
security. It is better to default to full read-write mounts to show the
administrator what keeping CAP_SYS_ADMIN entails.

If an administrator really wants both CAP_SYS_ADMIN and the :mixed
variant of cgroup or cgroup-full automatic mounts, they can still
specify that explicitly; this commit just changes the default without
specifier.

Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 10:20:10 -05:00
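
The new default described in commit 0769b82a42 can be summarised as a small decision function. This is a sketch in Python, not the C code from the commit; the function name and the boolean argument are hypothetical stand-ins for however the container's capability set is actually checked.

```python
def resolve_cgroup_auto(entry, has_cap_sys_admin):
    """Resolve an lxc.mount.auto cgroup entry to an explicit variant.

    An entry with a specifier (:ro, :rw, :mixed) is returned unchanged.
    Without one, the result is :rw when the container keeps
    CAP_SYS_ADMIN and :mixed otherwise.
    """
    base, sep, specifier = entry.partition(":")
    if base not in ("cgroup", "cgroup-full"):
        raise ValueError("not a cgroup auto-mount entry: %r" % entry)
    if sep:
        if specifier not in ("ro", "rw", "mixed"):
            raise ValueError("unknown specifier: %r" % specifier)
        return entry
    return base + (":rw" if has_cap_sys_admin else ":mixed")
```

An administrator who wants both CAP_SYS_ADMIN and the mixed layout would write cgroup:mixed explicitly, which this function passes through untouched.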
Christian Seiler
bab88e6894 Factor out capability parsing logic
Currently, setup_caps and dropcaps_except both use the same parsing
logic for parsing capabilities (try to identify by name, but allow
numerical specification). Since this is a common routine, separate it
out to improve maintainability and reusability.

Signed-off-by: Christian Seiler <christian@iwakd.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 10:20:09 -05:00
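
The shared parsing rule from commit bab88e6894 (try the name first, then allow a numerical specification) might look like this. It is a sketch in Python rather than the factored-out C routine; the function name is hypothetical and the table is a small illustrative subset of capability numbers from <linux/capability.h>.

```python
# Illustrative subset only; the numbers match <linux/capability.h>.
KNOWN_CAPS = {
    "chown": 0,
    "dac_override": 1,
    "net_admin": 12,
    "sys_admin": 21,
    "sys_time": 25,
    "mknod": 27,
}


def parse_cap(text):
    """Return the capability number for `text`, or None if unparseable.

    Known names are looked up first; otherwise a plain decimal number
    is accepted.
    """
    if text in KNOWN_CAPS:
        return KNOWN_CAPS[text]
    if text.isdigit():
        return int(text)
    return None
```

Both callers can then share one lookup instead of carrying their own copy of the name table walk.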
Christian Seiler
b46f055358 cgfs: don't mount /sys/fs/cgroup readonly
Ubuntu containers have had trouble with automatic cgroup mounting that
was not read-write (i.e. lxc.mount.auto = cgroup{,-full}:{ro,mixed}) in
containers without CAP_SYS_ADMIN. Ubuntu's mountall program reads
/lib/init/fstab, which contains an entry for /sys/fs/cgroup. Since
there is no ro option specified for that filesystem, mountall will try
to remount it readwrite if it is already mounted. Without
CAP_SYS_ADMIN, that fails and mountall will interrupt boot and wait for
user input on whether to proceed anyway or to manually fix it,
effectively hanging container bootup.

This patch makes sure that /sys/fs/cgroup is always a readwrite tmpfs,
but that the actual cgroup hierarchy paths (/sys/fs/cgroup/$subsystem)
are readonly if :ro or :mixed is used. This still has the desired
effect within the container (no cgroup escalation possible and programs
get errors if they try to do so anyway), while keeping Ubuntu
containers happy.

Signed-off-by: Christian Seiler <christian@iwakd.de>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-06 10:20:08 -05:00
Stéphane Graber
3c597cee88 python-lxc: minor fixes to __init__.py
Set a base class for the network object and set the encoding in the
header. Neither of those changes are required for python3 but they do
make it easier for anyone trying to make a python2 binding.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-05 22:35:52 -05:00
Serge Hallyn
5b28d06381 Add missing MAX_STACKDEPTH define on MUTEX_DEBUGGING build
Corrected a small oversight when locking related code was moved from
src/lxc/utils.c to src/lxc/lxclock.c.

Signed-off-by: Stephen M Bennett <stephen_m_bennett@hotmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-05-05 22:32:31 -05:00
Stéphane Graber
e01a1f4661 lxc-ls: Force running against containers without python
When using --nesting, we exec ourselves in the container context. If we
somehow need to dynamically load modules from there, things break. So
make sure we pre-load everything we may need.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-05-05 11:39:09 -05:00
Stéphane Graber
f8f3c3c071 Revert "cgfs: don't mount /sys/fs/cgroup readonly"
This reverts commit 8d783edcae.
2014-05-02 17:19:55 -04:00
Stéphane Graber
52b0a7d983 lxc-ls: Cache groups and show bygroup in autostart
This makes sure we only query lxc.group once and then reuse that list
for filtering, listing groups and autostart.

When a container is auto-started only as part of a group, autostart will
now show by-group instead of yes.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 13:19:46 -04:00
KATOH Yasufumi
4724cf84f9 doc: Update Japanese lxc-ls(1) for the new -g/--group argument
Update for commit 0f02786

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 13:04:23 -04:00
Serge Hallyn
8d783edcae cgfs: don't mount /sys/fs/cgroup readonly
/sys/fs/cgroup is just a size-limited tmpfs, and making it ro does
nothing to affect our ability to alter mount settings of its subdirs.
OTOH making it ro can upset mountall in the container which tries
to remount it rw, which may be refused.

So just don't do it.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Christian Seiler <christian@iwakd.de>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 13:04:21 -04:00
Stéphane Graber
b9abc183b1 lxc-ls: Allow the use of --groups without --fancy
There wasn't a good reason for that limit, we can simply make the code
slightly slower when --groups is passed and still have the expected
output even without --fancy.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 11:16:51 -04:00
KATOH Yasufumi
a5ab279643 doc: Update Japanese lxc-create(1) for 'none' bdev type
Update for commit 50040b5

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 11:14:27 -04:00
KATOH Yasufumi
63e6a3de81 doc: Update Japanese lxc-clone(1) for fixing typo
Update for commit 0e98b3bd31

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 11:14:24 -04:00
Stéphane Graber
0ceb65ff25 lxc-ls: Typo in manpage
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-02 11:13:08 -04:00
Stéphane Graber
0f027869da lxc-ls: Update lxc.group handling
This introduces a new -g/--group argument to filter containers based on
their groups.

This supports the rather obvious: --group blah
Which will only list containers that are in group blah.

It may also be passed multiple times: --group blah --group bleh
Which will list containers that are in either (or both) blah or bleh.

And it also takes: --group blah,bleh --group doh
Which will list containers that are either in BOTH blah and bleh or in doh.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Michael H. Warfield <mhw@WittsEnd.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-05-02 11:12:21 -04:00
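
The --group semantics described in commit 0f027869da (OR across repeated arguments, AND within a comma-separated argument) can be sketched as below. This is an illustration, not the lxc-ls source; the function name is hypothetical, and treating an empty argument list as "no filtering" is an assumption.

```python
def matches_groups(container_groups, group_args):
    """Return True if a container passes the --group filter.

    `group_args` holds one string per --group occurrence. A container
    matches when, for at least one argument, it belongs to every
    comma-separated group named in that argument.
    """
    if not group_args:
        return True  # assumption: no --group means no filtering
    members = set(container_groups)
    return any(set(arg.split(",")) <= members for arg in group_args)
```

For example, with --group blah,bleh --group doh, a container only in blah is filtered out, while one in both blah and bleh, or one in doh, is listed.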
Serge Hallyn
50040b5e46 lxc-create: make 'none' bdev type work again
This should address https://github.com/lxc/lxc/issues/199

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-01 13:54:16 -04:00
Nikolay Martynov
8a2fdf50ad use correct lxc-init path in sshd template
lxc-init got moved into SBINDIR/init.lxc recently.
This broke sshd template because path wasn't updated there.
This patch should fix this issue.

Signed-off-by: Nikolay Martynov <mar.kolya@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-05-01 10:38:12 -04:00
Carlo Landmeter
91828b0e1f alpinelinux: set correct lxc_arch for x86
Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-04-30 16:28:59 -04:00
S.Çağlar Onur
178af55b1c fix minor typo in .gitignore
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-04-30 15:25:30 -04:00
Stéphane Graber
13aad0ae78 clang: Fix build warnings for 3.4
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2014-04-30 13:02:15 -04:00
Dwight Engen
9e607c2f35 lxc-oracle: fix warnings/errors from some rpm scriptlets
- Some scriptlets expect fstab to exist so create it before doing the
  yum install

- Set the rootfs selinux label the same as the host's or else the PREIN script
  from initscripts will fail when running groupadd utmp, which prevents
  creation of OL4.x containers on hosts > OL6.x.

- Move creation of devices into a separate function

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-30 10:39:09 -05:00
Serge Hallyn
773bd28258 apparmor: allow writes to sem* and msg* sysctls
/proc/sys/kernel/sem* and /proc/sys/kernel/msg* are ipc sysctls
which are properly namespaced.  Allow writes to them from
containers.

Reported-by: Dan Kegel <dank@kegel.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-04-29 16:45:16 -05:00
S.Çağlar Onur
71a606eeb3 revert 1d16785 - fixes #191
According to Serge, we no longer need to keep cgmanager connection open.

As far as my tests go, it seems to be working fine.

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 20:07:04 -05:00
Serge Hallyn
f79b86a344 Revert "snapshots: move snapshot directory"
This reverts commit 276a086264.

It breaks lxc-test-snapshot, and perhaps we should go with
stgraber's suggestion of using $lxcpath/$lxcname/snaps/
2014-04-28 17:33:36 -05:00
Dwight Engen
1462279962 output lxc.arch as i686 for PER_LINUX32
When outputting the lxc.arch setting, use i686 instead of x86 since the
latter is not a valid input to setarch, nor will the kernel output
UTS_MACHINE as x86. The kernel sets utsname.machine to i[3456]86, which
all map to PER_LINUX32.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 16:18:00 -05:00
Dwight Engen
bb8d8207c3 allow all iX86 strings for lxc.arch
This change accepts all the same strings for lxc.arch that setarch(8) does.

Note that we continue to parse plain x86 as PER_LINUX32 so as not to break
existing lxc configuration files.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 16:17:58 -05:00
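
The input and output sides of the lxc.arch handling described in commits bb8d8207c3 and 1462279962 can be sketched together. This is an illustration in Python, not the C code: the function names are hypothetical, only the 32-bit x86 names mentioned in the commit messages are covered (setarch(8) accepts more), and PER_LINUX32 uses the value from <linux/personality.h>.

```python
import re

PER_LINUX32 = 0x0008  # from <linux/personality.h>


def personality_for_arch(arch):
    """Map an lxc.arch string to PER_LINUX32, or None if not handled here.

    Accepts i386 through i686, plus plain "x86" so that existing
    configuration files keep working.
    """
    if arch == "x86" or re.fullmatch(r"i[3456]86", arch):
        return PER_LINUX32
    return None


def arch_for_personality(persona):
    """Return the string to write back out for lxc.arch.

    PER_LINUX32 is written as "i686" rather than "x86", since "x86"
    is not a valid input to setarch.
    """
    return "i686" if persona == PER_LINUX32 else None
```

A configuration that said x86 is therefore still read correctly, but comes back out as i686 when saved.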
Serge Hallyn
a0566914c2 lxc-user-nic: handle failure in create_nic
Failures were being ignored, leading up to an eventual segfault.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2014-04-28 16:16:08 -05:00
KATOH Yasufumi
dc421f3aac Convert punctuation marks in Japanese man pages
This only converts punctuation marks from FULLWIDTH COMMA/FULL STOP to
IDEOGRAPHIC COMMA/FULL STOP in Japanese man pages. The contents of man
pages do not change at all.

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 12:29:04 -05:00
Dwight Engen
92ffb6d8ac coverity: fix fd leak in error case (1011105)
I inadvertently introduced this with commit 8bf1e61e.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 12:28:33 -05:00
Michael H. Warfield
09227be286 Check for symlinks before attempting create.
When attempting to create the compulsory symlinks in /dev,
check for the existence of the link using stat first before
blindly attempting to create the link.

This works around an apparent quirk in the kernel VFS on read-only
file systems where the returned error code might be EEXIST or EROFS
depending on previous access to the /dev directory and its entries.

Reported-by: William Dauchy <william@gandi.net>
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Tested-by: William Dauchy <william@gandi.net>
2014-04-28 10:19:01 -05:00
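
The check-before-create approach from commit 09227be286 can be sketched as follows. This is a Python illustration, not the C code from the commit; the function name is hypothetical. The commit message says stat is used; this sketch uses lstat so that a dangling link also counts as existing, which is a choice made here and not something the commit states.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target unless something already exists there.

    Returns True if the link was created, False if the path already
    existed. Checking first avoids depending on which error (EEXIST
    or EROFS) a blind symlink() call reports on a read-only filesystem.
    """
    try:
        os.lstat(link_path)
        return False  # already present; nothing to do
    except FileNotFoundError:
        pass
    os.symlink(target, link_path)
    return True
```

On a read-only /dev where the links already exist, the function returns False without ever attempting the create.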
Serge Hallyn
276a086264 snapshots: move snapshot directory
Originally we kept snapshots under /var/lib/lxcsnaps.  If a
separate btrfs is mounted at /var/lib/lxc, then we can't
make btrfs snapshots under /var/lib/lxcsnaps.

This patch moves the default directory to /var/lib/lxc/lxcsnaps.
If /var/lib/lxcsnaps already exists, then use that.  Don't allow
any container to be used with the name 'lxcsnaps'.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 10:18:49 -05:00
Serge Hallyn
e995d7a269 lxc startup: manually mark every shared mount entry as slave
If you 'ip netns add x1', this creates /run/netns and /run/netns/x1
as shared mounts.  When a container starts, it umounts these after
pivot_root, and the umount is propagated to the host.

Worse, doing mount("", "/", NULL, MS_SLAVE|MS_REC, NULL) does not
suffice to change those, even after binding /proc/mounts onto
/etc/mtab.

So, I give up.  Do this manually, walking over /proc/self/mountinfo
and changing the mount propagation on everything marked as shared.

With this patch, lxc-start no longer unmounts /run/netns/* on the
host.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 10:18:47 -05:00
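
The mountinfo walk described in commit e995d7a269 starts by finding every mount marked as shared. The sketch below shows only that detection step in Python, on sample text; it is not the C code from the commit, the function name and sample lines are made up, and octal escapes in mount points (such as \040 for a space) are not decoded. In /proc/self/mountinfo the mount point is the fifth field, and the optional fields (where shared:N appears) run from the seventh field up to a lone "-" separator.

```python
def shared_mount_points(mountinfo_text):
    """Return the mount points whose optional fields include shared:N."""
    shared = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields start at index 6 and end at the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed or truncated line
        if any(f.startswith("shared:") for f in fields[6:sep]):
            shared.append(fields[4])
    return shared


# Hypothetical sample content, for illustration only.
SAMPLE = (
    "22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw\n"
    "40 22 0:3 / /run/netns rw,nosuid shared:5 - tmpfs tmpfs rw\n"
    "41 22 0:35 / /proc rw master:2 - proc proc rw\n"
    "42 22 0:36 / /mnt rw - tmpfs tmpfs rw\n"
)
```

Each returned path would then have its propagation changed to slave with a mount(2) call carrying MS_SLAVE, which requires privileges and is left out of this sketch.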
Serge Hallyn
0e98b3bd31 lxc-clone man page: fix typos
In the body of the manpage, replace a few errant 'fssize's with the
more appropriate word.

Reported-by: MegaBrutal <megabrutal@megabrutal.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2014-04-28 08:42:24 -05:00