Explain why we insist that root use newuidmap if it is available.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Call tar with --numeric-owner option to use numbers for user/group
names because the whole uid/gid in rootfs should be consistently
unchanged as in original stage3 tarball and private portage.
Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
We can also narrow the scope of this, since we only need it in the process that
is actually going to use it.
Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
If we just return here, we end up with two processes executing the caller's
code, which is not good.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
criu version 1.3 has been tagged, which has the minimal set of patches to allow
checkpointing and restoring containers. lxc-test-checkpoint-restore is now
skipped on any version of criu lower than 1.3.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This option is required when migrating containers across hosts; it is used to
restore inotify via file paths instead of file handles, which aren't preserved
across hosts.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Regardless of whether "installpkg" command exists or not, install the
command temporarily with static linked tar command into the lxc cache
directory to keep the original uid/gid of files/directories. Also,
use sed command instead of ed command for simplicity.
Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
And add a testcase.
The code to update hwaddrs in a clone was walking through the container
configuration and re-printing all network entries. However network
entries from an include file which should not be printed out were being
added to the unexpanded config. With this patch, at clone we simply
update the hwaddr in-place in the unexpanded configuration file, making
sure to make the same update to the expanded network configuration.
The code to update out lxc.hook statements had the same problem.
We also update it in-place in the unexpanded configuration, though
we mirror the logic we use when updating the expanded configuration.
(Perhaps that should be changed, to simplify future updates)
This code isn't particularly easy to review, so testcases are added
to make sure that (1) extra lxc.network entries are not added (or
removed), even if they are present in an included file, (2) lxc.hook
entries are not added, (3) hwaddr entries are updated, and (4)
the lxc.hook entries are properly updated (only when they should be).
Reported-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Those aren't supported, it's just a lucky coincidence that they weren't
causing problems.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
When managing containers, I need to take action based on container
exit status. For instance, if it exited abnormally (status!=0), I
sometime want to respawn it automatically. Or, when invoking
`lxc-stop` I want to know if it terminated gracefully (ie on `SIGTERM`)
or on `SIGKILL` after a timeout.
This patch adds a new message type `lxc_msg_exit_code,` to preserve
ABI. It sends the raw status code as returned by `waitpid` so that
listening application may want to apply `WEXITSTATUS` before. This is
what `lxc-monitor` does.
Signed-off-by: Jean-Tiare LE BIGOT <jean-tiare.le-bigot@ovh.net>
To ask cgmanager to chown files as an unpriv user, we must send the
request from the container's namespace (with our own userid also
mapped in). However when we create a new namespace then we must
open a new dbus connection, so that our credential and the credential
on the dbus socket match. Otherwise the proxy will refuse the request.
Because we were warning about this failure but not exiting, the failure
was not noticed until the unprivileged container went on to try to
administer its cgroups, i.e. creating a container inside itself.
Fix this by having the do_chown_cgroup create a new cgmanager connection.
In order to reduce the number of connections, since the list of subsystems
is global anyway, don't call do_chown_cgroup once for each controller,
just call it once and have it run over all controllers.
(This patch does not change the fact that we don't fail if the
chown failed. I think we should change that, but let's do it in a
later patch)
Reported-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
With the new hashed command socket names (e85898415c), it's possible to
have something like below;
[caglar@qop:~/go/src/github.com/lxc/go-lxc(master)] cat /proc/net/unix | grep lxc
0000000000000000: 00000002 00000000 00010000 0001 01 53465 @lxc/d086e835c86f4b8d/command
[...]
list_active_containers reads /proc/net/unix to find all running
containers but this new format no longer includes the container name or
its lxcpath.
This patch introduces two new commands (LXC_CMD_GET_NAME and
LXC_CMD_GET_LXCPATH) and starts to use those in list_active_containers
call.
changes since v1:
- added sanity check proposed by Serge
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This patch adds support for checkpointing and restoring containers via CRIU.
It adds two api calls, ->checkpoint and ->restore, which are wrappers around
the CRIU CLI. CRIU has an RPC API, but reasons for preferring exec() are
discussed in [1].
To checkpoint, users specify a directory to dump the container metadata (CRIU
dump files, plus some additional information about veth pairs and which
bridges they are attached to) into this directory. On restore, this
information is read out of the directory, a CRIU command line is constructed,
and CRIU is exec()d. CRIU uses the lxc-restore-net callback (which in turn
inspects the image directory with the NIC data) to properly restore the
network.
This will only work with the current git master of CRIU; anything as of
a152c843 should work. There is a known bug where containers which have been
restored cannot be checkpointed [2].
[1]: http://lists.openvz.org/pipermail/criu/2014-July/015117.html
[2]: http://lists.openvz.org/pipermail/criu/2014-August/015876.html
v2: fixed some problems with the s/int/bool return code form api function
v3: added a testcase, fixed up the man page synopsis
v4: fix a small typo in lxc-test-checkpoint-restore
v5: remove a reference to the old CRIU_PATH, and a bad error about the same
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
distutils can't handle paths to source files containing '..'. It will
try to navigate away from the build directory and fail. To fix that,
before building the python module, transform all the path variables then
cd to the srcdir, and set the build directory manually.
This is hopefully the last needed fix to use separate build and
source diretories.
Signed-off-by: Daniel Miranda <danielkza2@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Now that default.conf is generated/linked during the configuration
phase, it should not longer be removed in the 'clean' stage, or
subsequent builds will fail. Only remove it during 'dist-clean'.
Signed-off-by: Daniel Miranda <danielkza2@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Just setting path isn't enough. Clear the whole environment, and only set
$PATH. It's all we need - ovs-vsctl is running fine this way.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Added check of services in container before start or stop.
Added check of syslog config existence prior changing.
Signed-off-by: Denis Pynkin <dans@altlinux.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
1. don't determine ovs-vsctl path at configure time, do it at runtime
2. lxc-user-nic: set a sane path to protect from unpriv users
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
If statvfs does not exist, then don't recalculate mount flags
at remount.
If someone does need this, they could replace the code (only
if !HAVE_STATVFS) with code parsing /proc/self/mountinfo (which
exists in the recent git history)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Same problem as we had with mount_entry(). lxc_mount_auto_mounts()
sometimes does bind mount followed by remount to change options.
With recent kernels it must pass any preexisting NODEV/NOSUID/etc
flags.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Use statvfs instead of parsing /proc/self/mountinfo to check for the
flags we need to and into the msbind mount flags. This will be faster
and the code is cleaner.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Building LXC in a separate target directory, by running configure from
outside the source tree, failed with multiple errors, mostly in the
Python and Lua extensions, due to assuming the source dir and build dir
are the same in a few places. To fix that:
- Pre-process setup.py with the appropriate directories at configure
time
- Introduce the build dir as an include path in the Lua Makefile
- Link the default container configuration file from the alternatives
in the configure stage, instead of setting a variable and using it
in the Makefile
Signed-off-by: Daniel Miranda <danielkza2@gmail.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This prevents u2 from going into /home/u1/.local/share/lxc/u1/rootfs
and running setuid-root applications to get write access to u1's
container rootfs.
v2: set umask to 002 for the mkdir. Otherwise if umask happens to be,
say, 022, then user does not have write permissions under the container
dir and creation of $containerdir/partial file will fail.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
When we read a lxc.network.hwaddr line, if it contained any 'x's then
those get quitely filled in at config_network_hwaddr. If that happens
then we want to save the autogenerated hwaddr in the unexpanded config
so that when we write it to disk, it is saved.
This patch dumbly re-generates the network configuration in the
unexp configuration every time we load a config file, just as we do
after every clone.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Unprivileged users require "-o user_subvol_rm_allowed" mount option for btrfs.
Make the INFO level message to ERROR to make it clear, which now says following;
[caglar@qop:~] lxc-destroy -n rubik
lxc_container: Is the rootfs mounted with -o user_subvol_rm_allowed?
lxc_container: Error destroying rootfs for rubik
Destroying rubik failed
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
If we didn't find newuidmap, then simply require the caller to be
root and write to /proc/self/uidmap manually. Checking for
newgidmap to exist is bogus.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
- If "installpkg" command does not exist, lxc-plamo temporarily
install the command with static linked tar command into the lxc
cache directory. The tar command does not refer to passwd/group
files, which means that only a few files/directories are extracted
with wrong user/group ownership. To avoid this, the installpkg
command now uses the standard tar command in the system.
- Change mode to 666 for $rootfs/dev/null to allow write access for
all users.
- Small fix in usage message.
Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
This should avoid tests failure when the machine running the tests has
either very slow disks or a lot of data waiting to be flushed.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
See http://lkml.org/lkml/2014/8/13/746 and its history. The kernel now refuses
mounts if we don't add ro,nosuid,nodev,noexec flags if they were already there.
Also use the newly found info to skip remount if unneeded. For background, if
you want to create a read-only bind mount, then you must first mount(2) with
MS_BIND to create the bind mount, then re-mount(2) again to get the new mount
options to apply. So if this wasn't a bind mount, or no new mount options were
introduced, then we don't do the second mount(2).
null_endofword() and get_field() were not changed, only moved up in
the file.
(Note, while I can start containers inside a privileged container with
this patch, most of the lxc tests still fail with the kernel in question;
Andy's patch seems to still be needed - a kernel with which is available
at https://launchpad.net/~serge-hallyn/+archive/ubuntu/userns-natty
ppa:serge-hallyn/userns-natty)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
A long enough lxcpath (and small PATH_MAX through crappy defines) can cause
the creation of the string to be hashed to fail. So just use alloca to
get the size string we need.
More importantly, while I can't explain it, if lxcpath is too long, setting
sockname[sizeof(addr->sun_path)-2] to \0 simply doesn't seem to work. So set
sockname[sizeof(addr->sun_path)-3] to \0, which does work.
With this, and with
lxc.lxcpath = /opt/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789/lxc0123456789
in /etc/lxc/lxc.conf, I can run lxc-wait just fine. Without it, it fails
(as does lxc-start -d, which uses lxc_wait to verify the container started)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
The container command socket is an abstract unix socket containing
the lxcpath and container name. Those can be too long. In that case,
use the hash of the lxcpath and lxcname. Continue to use the path and
name if possible to avoid any back compat issues.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This commit broke the testsuite for unprivileged containers as the
container directory is now 0750 with the owner being the container root
and the group being the user's group, meaning that the parent user can
only enter the directory, not create entries in there.
This reverts commit c86da6a3ac.
This is an hybrid between Micahel's original patch and me making the new
debugging statements look like our existing ones.
Signed-off-by: "Micahel J. Evans" <mjevans1983@gmail.com>
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
When "lxc.autodev = 1", LXC creates automatically a "/dev/.lxc/<name>.<hash>"
folder to put container's devices in so that they are visible from both
the host and the container itself.
On container exit (ne it normal or not), this folder was not cleaned
which made "/dev" folder grow continuously.
We fix this by adding a new `int lxc_delete_autodev(struct lxc_handler
*handler)` called from `static void lxc_fini(const char *name, struct
lxc_handler *handler)`.
Signed-off-by: Jean-Tiare LE BIGOT <jean-tiare.le-bigot@ovh.net>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
This prevents u2 from going into /home/u1/.local/share/lxc/u1/rootfs
and running setuid-root applications to get write access to u1's
container rootfs.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
(Thanks, Dwight, this one look right?)
Make sure we reap our child at cgm_{s,g}et.
Changelog: Fix change in behavior on empty read from the do_cgm_get()
helper that was spotted by Dwight.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Raspberry Pi kernel finally supports all the bits required by LXC [1]
This patch makes "./configure --with-distro=raspbian" to install lxcbr0
based config file and upstart jobs.
Also src/lxc/lxc.net now checks the existence of the lxc-dnsmasq user
(and fallbacks to dnsmasq)
RPI users still need to pass
"MIRROR=http://archive.raspbian.org/raspbian/" parameter to lxc-create
to pick the correct packages
MIRROR=http://archive.raspbian.org/raspbian/ lxc-create -t debian -n rpi
[Could be applied to stable-1.0 if you cherry-pick
7157a508ba3015b830877a5e4d6ca9debb3fd064]
[1] https://github.com/raspberrypi/linux/issues/176
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>