113 Commits

Author SHA1 Message Date
Serge Hallyn
cf3ef16dc4 container creation: support unpriv container creation in user namespaces
1. lxcapi_create: don't try to unshare and mount for dir backed containers

It's unnecessary, and breaks unprivileged lxc-create (since unpriv users
cannot yet unshare(CLONE_NEWNS)).

2. api_create: chown rootfs

chown rootfs to the host uid to which container root will be mapped

3. create: run template in a mapped user ns

4. use (setuid-root) newxidmap to set id_map if we are not root

This is needed to be able to set userns mappings as an unprivileged
user, for unprivileged lxc-start.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-10-24 12:12:35 -05:00
KATOH Yasufumi
9d65a48729 Fix segfault on lxc-create when no template specified
lxc-create segfaults when no template file is specified.
So do not append the header to the config when no template is specified.

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-10-23 19:27:05 -04:00
S.Çağlar Onur
771d96b380 introduce snapshot_destroy
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-10-19 09:49:19 -05:00
Stéphane Graber
0f8f9c8aa4 lxccontainer.c: Replace rindex by strrchr (bionic)
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2013-10-18 18:00:24 -04:00
Sidnei da Silva
f99c386b60 Add a --thinpool argument to lxc-create, to use thin pool backed lvm when creating the container. When cloning a container backed by a thin pool, the clone will default to the same thin pool.
2013-10-18 14:43:03 -05:00
Serge Hallyn
a41f104bfb define list container api (v2)
Two new commands are defined: list_defined_containers() and
list_active_containers().  Both take an lxcpath (NULL means
use the default lxcpath) and return the number of containers
found.  If a lxc_container ** is passed in, then an array of
lxc_container's is returned, one for each container found.
The caller must then lxc_container_put() each container and
free the array, as shown in the new list testcase.
If a char ** is passed in, then an array of container names
is returned, after which the caller must free all the names
and the name array, as shown in the testcase.

Changelog:
	Check for the container config file before trying to
	create an lxc_container *, to save some work. [ per
	stgraber comments]
	Add names ** argument to return only container names.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-10-14 12:42:39 -05:00
Serge Hallyn
9baa57bdd4 coverity: closedir on error path
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-10-07 14:03:20 -05:00
Serge Hallyn
b494d2ddf7 add c->may_control
This is an api function which will return false if the container
is running, and the caller may not talk to its monitor over its
command socket.  Otherwise - if the container is not running, or
the caller may access it - it returns true.

We can use this in several tools early on to prevent the segvs
etc which we currently get.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-09-30 13:21:52 -05:00
Stéphane Graber
fe218ca383 Fix crasher in get_ips
Check that the interface structure is not NULL before trying to access
its members.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2013-09-29 19:41:52 -04:00
Serge Hallyn
566981770e drop now-useless have_tpath bool
(Which will also fix the build failure in the !HAVE_LIBGNUTLS
case)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-26 08:14:50 -05:00
Dwight Engen
85db5535c3 fix segfault on lxc-create with bad template name
- change get_template_path() to only return NULL or non-NULL since one of
  the callers was doing a free(-1) which caused the segfault. Handle the
  NULL template case in the lxcapi_create() caller.

- make sure to free(tpath) in the sha1sum_file() failure case

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-09-26 08:11:59 -05:00
Qiang Huang
89cd779348 utils: move remove_trailing_slashes to utils
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-24 09:37:18 -05:00
Stéphane Graber
948955a2d6 Consistently use <lxc/lxccontainer.h> for the API
The API header was included in a variety of ways before, standardize
those to "#include <lxc/lxccontainer.h>" as this will always work both in
tree and on a system with the headers installed.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-23 15:30:05 -05:00
S.Çağlar Onur
49badbbef6 return the result of the lxcapi_want_close_all_fds call to the caller
Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-23 10:20:29 -05:00
S.Çağlar Onur
130a188840 Expose underlying close_all_fds config value via API
Being able to set close_all_fds via the API would be useful in
situations like running an application (say, a web server)
that controls the lifecycle of the container using the LXC API.
We don't want the forked process to inherit the parent's resources
(files, sockets, ...).

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-09-20 23:48:20 -05:00
S.Çağlar Onur
799f29ab69 Add get_interfaces to the API
get_ips accepts an interface name as a parameter but there was no
way to get the interfaces names from the container. This patch
introduces a new get_interfaces call to the API so that users
can obtain the name of the interfaces.

Support for the Python bindings is also introduced as part of this version.

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-09-18 14:35:49 -05:00
Serge Hallyn
025ed0f391 make heavier use of process_lock (v2)
pthread_mutex_lock() will only return an error if it was set to
PTHREAD_MUTEX_ERRORCHECK and we are recursively calling it (and
would otherwise have deadlocked).  If that's the case then log a
message for future debugging and exit.  Trying to "recover" is
nonsense at that point.

process_lock() was held over too long a time in lxcapi_start()
in the daemonize case.  (note the non-daemonized case still needs a
check to enforce that it must NOT be called while threaded).  Add
process_lock() at least across all open/close/socket() calls.

Anything done after a fork() doesn't need the locks as it is no
longer threaded - so some open/close/dups()s are not locked for
that reason.  However, some common functions are called from both
threaded and non-threaded contexts.  So after doing a fork(), do
a possibly-extraneous process_unlock() to make sure that, if we
were forked while the pthread mutex was held, we aren't deadlocked on a
lock that nobody holds.
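
The locking rule in the first paragraph can be sketched as below. This is
a minimal illustration, not the code from this patch: the name big_lock,
the process_lock_init() helper and the relock_error() demo are
assumptions. It relies only on the POSIX behaviour that re-locking a
PTHREAD_MUTEX_ERRORCHECK mutex from the owning thread returns EDEADLK
instead of hanging. Build with -pthread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide mutex; the name is an assumption. */
static pthread_mutex_t big_lock;

/* Create the mutex as an error-checking one so misuse is reported. */
int process_lock_init(void)
{
    pthread_mutexattr_t attr;
    int ret;

    ret = pthread_mutexattr_init(&attr);
    if (ret != 0)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (ret == 0)
        ret = pthread_mutex_init(&big_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Log and exit on error: there is nothing sensible to recover to. */
void process_lock(void)
{
    int ret = pthread_mutex_lock(&big_lock);

    if (ret != 0) {
        fprintf(stderr, "process_lock: %s\n", strerror(ret));
        exit(1);
    }
}

void process_unlock(void)
{
    int ret = pthread_mutex_unlock(&big_lock);

    if (ret != 0) {
        fprintf(stderr, "process_unlock: %s\n", strerror(ret));
        exit(1);
    }
}

/* Demo only: report what a second lock by the same thread returns. */
int relock_error(void)
{
    int ret;

    process_lock();
    ret = pthread_mutex_lock(&big_lock); /* not acquired a second time */
    process_unlock();
    return ret;
}
```

With a default (non-errorcheck) mutex the second lock in relock_error()
would simply deadlock, which is why the error return is only ever seen
in the errorcheck case.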

Tested that lp:~serge-hallyn/+junk/lxc-test still works with this
patch.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-09-18 13:49:08 -05:00
Serge Hallyn
4575a9f939 Revert "api_create and api_start: work toward making them thread-safe"
This should deadlock with daemonized start due to af_unix changes.

Do this later, but do it more carefully.

This reverts commit 002f3cff4d.
2013-09-14 13:09:53 -05:00
Serge Hallyn
002f3cff4d api_create and api_start: work toward making them thread-safe
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-14 13:02:50 -05:00
Christian Seiler
33ad9f1ab1 cgroup: Major rewrite of cgroup logic
This patch rewrites most of the cgroup logic. It creates a set of data
structures to store the kernel state of the cgroup hierarchies and
their mountpoints.

Mainly, everything is now grouped with respect to the hierarchies of
the system. Multiple controllers may be mounted together or separately
to different hierarchies, the data structures reflect this.

Each hierarchy may have multiple mount points (that were created
previously using the bind mount method) and each of these mount points
may point to a different prefix inside the cgroup tree. The current
code does not make any assumptions regarding the mount points, it just
parses /proc/self/mountinfo to acquire the relevant information.

The only requirement is that the current cgroup of either init (if
cgroup.pattern starts with '/' and the tools are executed as root) or
the current process (otherwise) is accessible. The root cgroup need
not be accessible.

The configuration option cgroup.pattern is introduced. For
root-executed containers, it specifies which format the cgroups should
be in. Example values may include '/lxc/%n', 'lxc/%n', '%n' or
'/machine/%n.lxc'. Any occurrence of '%n' is replaced with the name of
the container (and if clashes occur in any hierarchy, -1, -2, etc. are
appended globally). If the pattern starts with /, new containers'
cgroups will be located relative to init's cgroup; if it doesn't, they
will be located relative to the current process's cgroup.
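
As a rough illustration of the '%n' substitution described above (a
sketch only; expand_pattern() and pattern_is_init_relative() are
hypothetical names, not the functions in this patch, and the -1, -2
clash handling is left out):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace every "%n" in pattern with the container name.
 * Returns a malloc()ed string the caller must free(), or NULL if
 * allocation fails.
 */
char *expand_pattern(const char *pattern, const char *name)
{
    size_t name_len, out_len = 0;
    const char *p;
    char *out, *w;

    assert(pattern && name);
    name_len = strlen(name);

    /* First pass: compute the length of the expanded string. */
    for (p = pattern; *p; p++) {
        if (p[0] == '%' && p[1] == 'n') {
            out_len += name_len;
            p++; /* skip the 'n' */
        } else {
            out_len++;
        }
    }

    out = malloc(out_len + 1);
    if (!out)
        return NULL;

    /* Second pass: copy, substituting the name for each "%n". */
    for (p = pattern, w = out; *p; p++) {
        if (p[0] == '%' && p[1] == 'n') {
            memcpy(w, name, name_len);
            w += name_len;
            p++;
        } else {
            *w++ = *p;
        }
    }
    *w = '\0';
    return out;
}

/* A leading '/' selects init's cgroup as the base; anything else is
 * relative to the current process's cgroup. */
int pattern_is_init_relative(const char *pattern)
{
    assert(pattern);
    return pattern[0] == '/';
}
```

For example, expand_pattern("/machine/%n.lxc", "web") yields
"/machine/web.lxc", and that pattern is init-relative.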

Some changes to the cgroup.h API have been made to make it more
consistent, both with respect to naming and with respect to the
parameters. This causes some changes in other parts of the code that
are included in the patch.

There has been some testing of this functionality, but there are
probably still quite a few bugs in there, especially for people with
different configurations.

Signed-off-by: Christian Seiler <christian@iwakd.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-10 18:19:21 -04:00
Serge Hallyn
f5dd1d532a API support for container snapshots (v2)
The api allows for creating, listing, and restoring of container
snapshots.  Snapshots are created as snapshot clones of the
original container - i.e. btrfs and lvm will be done as snapshot,
a directory-backed container will have overlayfs snapshots.  A
restore is a copy-clone, using the same backing store as the
original container had.

Changelog:

 . remove lxcapi_snap_open, which wasn't defined anyway.
 . rename get_comment to get_commentpath
 . if no newname is specified at restore, use c->name (as we meant to)
   rather than segving.
 . when choosing a snapshot index, use the correct path to check for.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
2013-09-10 18:19:20 -04:00
Serge Hallyn
eee59f9408 clone: don't copy rdepends when not doing a snapshot clone
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-05 18:05:34 -05:00
Serge Hallyn
2a2d36a425 fix typo
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-05 17:59:28 -05:00
Serge Hallyn
59d66af29d bdev: free after bdev_init
(Except in cases where we will immediately exit)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-09-05 17:04:15 -05:00
Serge Hallyn
d75462e4d6 fix wrong license text for parts of liblxc library
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-30 14:43:25 -05:00
Serge Hallyn
dfb31b25e2 Track snapshot dependencies (v2)
(Will push in a bit barring any objections)

lvm, btrfs, and zfs snapshots each do an ok job of handling deletions
for us - a btrfs snapshot does fine after the original is removed,
while zfs and lvm will both refuse to allow the original to be deleted
while the snapshot exists.

Overlayfs doesn't do this for us.  So, for overlayfs snapshots, track
the dependencies.

When c2 is created as an overlayfs snapshot of dir-backed c1, then

1. c2's lxc_rdepends file will contain

	c1_lxcpath
	c1_lxcname

2. c1's lxc_snapshots will contain "1"

c1 cannot be deleted so long as lxc_snapshots exists and contains
a non-zero number.
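
The deletion rule above could be checked along these lines. This is a
sketch under assumptions: may_delete() is a hypothetical helper, the
file is taken to hold a single decimal count, and an unparsable file is
treated as "do not delete" purely to be conservative. The locking via
container_disk_lock() is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Returns 1 if the container may be deleted, 0 if snapshots still
 * depend on it. A missing lxc_snapshots file means no snapshots.
 */
int may_delete(const char *snapshots_path)
{
    FILE *f;
    int count = 0;
    int parsed;

    assert(snapshots_path);
    f = fopen(snapshots_path, "r");
    if (!f)
        return errno == ENOENT; /* no file: nothing depends on us */

    parsed = fscanf(f, "%d", &count);
    fclose(f);
    if (parsed != 1)
        return 0; /* unreadable count: refuse, to be safe */
    return count == 0;
}
```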

The contents of lxc_snapshots and lxc_rdepends are protected by
container_disk_lock() and at lxc_clone by the new container not yet
being accessible.

(Originally I was going to keep them in the container config, but the
problem with using $lxcpath/$name/config is that api users could end up
calling c->save_config() with a cached old value of snapshots/rdepends.)

Changelog:
	aug 21: check for fprintf and fclose failures

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
2013-08-21 16:38:51 -05:00
Serge Hallyn
a09295f841 coverity: don't leak partial_fd
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-20 17:54:19 -05:00
Serge Hallyn
01efd4d3d9 coverity: correctly handle tpath error case.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-20 16:58:24 -05:00
Serge Hallyn
1fd9bd50ab coverity: ftell returns long, not size_t (which is unsigned)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-20 16:50:25 -05:00
Serge Hallyn
b4569e9321 coverity: don't bother getting template path if we're not going to measure it
This should also fix a memory leak, since we were freeing it under ifdef
but always allocating it.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-20 16:30:36 -05:00
Stéphane Graber
e768f9c0f6 Add missing namespace.h include
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-08-19 14:33:33 +02:00
Stéphane Graber
c32981c3fb Replace all calls to rindex by strrchr
The two functions are identical but strrchr also works on Bionic.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-08-19 14:32:55 +02:00
Stéphane Graber
4ba0d9af63 Add a local implementation of ifaddrs.h
This adds a local ifaddrs implementation to be used on Bionic or other C
libraries that don't come with a getifaddrs implementation.

This code was written by Kenneth MacKay and is under a two-clause BSD
license (copyright information in the file headers).

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-08-19 14:32:41 +02:00
Serge Hallyn
d44e88c266 bdev: support -B best and -B lvm,dir
-B best will check whether btrfs, zfs, or lvm can be used,
in that order, and fall back to dir.

-B lvm,btrfs will try lvm first, then btrfs, then fail.
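
The selection order can be sketched as follows. pick_backend() and the
probe callback are hypothetical; the real code probes the backends
itself. probe_dir_only() is an example stand-in that pretends only
"dir" is usable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe: nonzero if the named backend can be used. */
typedef int (*backend_probe)(const char *type);

/*
 * Walk a comma-separated spec such as "lvm,btrfs" and copy the first
 * usable backend into out. "best" expands to btrfs, zfs, lvm, then dir.
 * Returns 0 on success, -1 if nothing in the list is usable.
 */
int pick_backend(const char *spec, backend_probe usable,
                 char *out, size_t outlen)
{
    const char *p;

    assert(spec && usable && out && outlen > 0);
    if (strcmp(spec, "best") == 0)
        spec = "btrfs,zfs,lvm,dir";

    p = spec;
    while (*p) {
        size_t len = strcspn(p, ","); /* length of this entry */

        if (len > 0 && len < outlen) {
            memcpy(out, p, len);
            out[len] = '\0';
            if (usable(out))
                return 0;
        }
        p += len;
        if (*p == ',')
            p++;
    }
    out[0] = '\0';
    return -1;
}

/* Example probe for illustration: only "dir" is available. */
int probe_dir_only(const char *type)
{
    return strcmp(type, "dir") == 0;
}
```

With probe_dir_only, "best" and "lvm,dir" both resolve to "dir", while
"lvm,btrfs" fails because neither listed backend is usable.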

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-15 15:35:47 -05:00
Christian Seiler
a0e93eeb22 Add attach support to container C API
Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2013-08-14 16:51:05 -05:00
Serge Hallyn
96532523ef lxc-clone: don't s/oldname/newname in the config file and hooks
1. container hooks should use lxcpath and lxcname from the environment.
2. the utsname now gets separately updated
3. the rootfs path gets updated by the bdev backend.
4. the fstab mount targets should be relative
5. the fstab source directories could be separately updated if needed.

This leaves one definite bug: the lxc.logfile does not get updated.
This made me wonder why it was in the configuration file to begin with.
Digging deeper, I realized that whatever '-o outfile' you give
lxc-create gets set in log.c and gets used by the lxc_container object
we create at write_config().  So if you say
	lxc-create -t cirros -n c1 -o /tmp/out1
then /var/lib/lxc/c1/config will have lxc.logfile=/tmp/out1 - which is
clearly wrong.  Therefore I leave fixing that for later.

I'm looking for candidates for $p/$n expansion.  Note we can't expand
these at config_utsname() etc, because then lxc-clone would see the
expanded variable.  So we want to read $p/$n verbatim at config_*(),
and expand them only when they are used.  lxc.logfile is an obvious
good use case.  lxc.utsname can do it too, in case you want container
c1 to be called "c1-whatever".  I'm not sure that's worth it though.
Are there any others, or is that it?

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-08-07 08:55:23 -05:00
Dwight Engen
8058be395d clone: only update <rootfs>/etc/hostname if it exists
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-16 17:25:25 -05:00
Serge Hallyn
5202677243 lxccontainer: don't define certain variables if !HAVE_GNUTLS
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-16 08:16:07 -05:00
Serge Hallyn
3ce746862b lxc_create: prepend pretty header to config file (v2)
Define a sha1sum_file() function in utils.c.  Use that in lxcapi_create
to write out the sha1sum of the template being used.  If libgnutls is
not found, then the template sha1sum simply won't be printed into the
container config.

This patch also trivially fixes some cases where SYSERROR is used after
a fclose (masking errno) and missing consts in mkdir_p.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-15 16:34:00 -05:00
Serge Hallyn
dc23c1c817 create: add a quiet flag
If set, then fds 0,1,2 will be redirected while the creation
template is executed.
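
A minimal sketch of such a redirection, assuming the target is
/dev/null (the message above does not say where the fds go) and that it
happens in the forked child before the template is executed. Both
function names are hypothetical; quiet_child_status() exists only to
demonstrate the effect without touching the caller's own stdio.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Point fds 0, 1 and 2 at /dev/null. Returns 0 on success, -1 on error. */
int redirect_stdio_to_devnull(void)
{
    int fd = open("/dev/null", O_RDWR);
    int i;

    if (fd < 0)
        return -1;
    for (i = 0; i <= 2; i++) {
        if (dup2(fd, i) < 0) {
            if (fd > 2)
                close(fd);
            return -1;
        }
    }
    if (fd > 2)
        close(fd); /* keep only the three standard descriptors */
    return 0;
}

/*
 * Demo: fork, redirect in the child, and verify that fd 1 now refers
 * to the same device as /dev/null. Returns the child's exit status
 * (0 on success) or -1 if fork/waitpid failed.
 */
int quiet_child_status(void)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct stat null_st, out_st;

        if (redirect_stdio_to_devnull() < 0)
            _exit(2);
        if (stat("/dev/null", &null_st) < 0 || fstat(1, &out_st) < 0)
            _exit(3);
        _exit(out_st.st_rdev == null_st.st_rdev ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since fd 0 reads from /dev/null here, a template that prompts for input
would see end-of-file immediately, which is the problem noted below.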

Note, as Dwight has pointed out, if fd 0 is redirected, then if
templates ask for input there will be a problem.  We could simply
not redirect fd 0, or we could require that templates work without
interaction.  I'm assuming here that we want to do the latter, but
I'm open to changing that.

Reported-by: "S.Çağlar Onur" <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-12 11:19:54 -05:00
Serge Hallyn
283678ed2c Accommodate stricter devices cgroup rules
3.10 kernel comes with proper hierarchical enforcement of devices
cgroup.  To keep that code somewhat sane, certain things are not
allowed.  Switching from default-allow to default-deny and vice versa
is not allowed when there are child cgroups.  (This *could* be
simplified in the kernel by checking that all child cgroups are
unpopulated, but that has not yet been done and may be rejected)

The mountcgroup hook causes lxc-start to break with 3.10 kernels, because
you cannot write 'a' to devices.deny once you have a child cgroup.  With
this patch, (a) lxcpath is passed to hooks, (b) the cgroup mount hook sets
the container's devices cgroup, and (c) setup_cgroup() during lxc startup
ignores failures to write to devices subsystem if we are already in a
child of the container's new cgroup.

((a) is not really related to this bug, but is definitely needed.
The followup work of making the other hooks use the passed-in lxcpath
is still to be done)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-11 10:26:33 -05:00
Serge Hallyn
cbee8106e3 lxcapi_create: fix template handling
1. If no template is passed in, then do not try to execute it.  The user
just wanted to write the configuration.

2. If template is passed in as a full path, then use that instead of
constructing '$templatedir/lxc-$template'.
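
The two rules above, sketched as a standalone helper.
resolve_template() is a hypothetical name and the buffer-based
interface is an assumption; the actual patch changes lxcapi_create().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Returns 1 and fills buf when there is a template to execute,
 * 0 when template_name is NULL (only the configuration is written),
 * and -1 if the resulting path does not fit in buf.
 */
int resolve_template(const char *templatedir, const char *template_name,
                     char *buf, size_t buflen)
{
    int n;

    assert(templatedir && buf && buflen > 0);
    if (!template_name)
        return 0; /* rule 1: nothing to execute */

    if (template_name[0] == '/')
        /* rule 2: a full path is used as given */
        n = snprintf(buf, buflen, "%s", template_name);
    else
        n = snprintf(buf, buflen, "%s/lxc-%s", templatedir, template_name);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 1;
}
```

For instance, with a template directory of /usr/share/lxc/templates
(an example value), "ubuntu" resolves to
/usr/share/lxc/templates/lxc-ubuntu, while /tmp/mytemplate is left
untouched.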

Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-11 10:25:33 -05:00
Serge Hallyn
96b3cb407c lxcapi_create: split out the template execution
Make it its own function to make both more readable.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-11 10:25:10 -05:00
Dwight Engen
1143ed392d add clonehostname hook
This hook script updates the hostname in various files under /etc in the
cloned container. In order to do so, the old container name is passed in
the LXC_SRC_NAME environment variable.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-10 14:08:43 -05:00
Dwight Engen
3327917f4a fix potential out of bounds pointer deref
I noticed that if find_first_wholeword() is called with word at the very
beginning of p, we will deref *(p - 1) to see if it is a word boundary.
Fix by considering p == p0 to be a word boundary.
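
A sketch of the boundary check, not the patched function itself:
find_whole_word() and the delimiter set in is_boundary() are
assumptions made for illustration. The point is that p[-1] is only
read when p is strictly past the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Delimiter set is an assumption for this sketch. */
static int is_boundary(char c)
{
    return c == '\0' || strchr(" \t\n.,_-/", c) != NULL;
}

/*
 * Return a pointer to the first occurrence of word in p0 that is
 * delimited on both sides, or NULL if there is none.
 */
const char *find_whole_word(const char *p0, const char *word)
{
    size_t len;
    const char *p;

    assert(p0 && word);
    len = strlen(word);
    if (len == 0)
        return NULL;

    for (p = strstr(p0, word); p; p = strstr(p + 1, word)) {
        /* The start of the buffer counts as a boundary, so p[-1] is
         * never dereferenced when p == p0. */
        int starts = (p == p0) || is_boundary(p[-1]);
        int ends = is_boundary(p[len]);

        if (starts && ends)
            return p;
    }
    return NULL;
}
```

Here "c1" matches at the very start of "c1 starts", but not inside
"c10" or "xc1".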

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-10 14:07:03 -05:00
Bogdan Purcareata
ef091cefca lxcapi_set_cgroup_item: remove duplicate == 0
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-07-03 12:49:16 -05:00
Serge Hallyn
176d9acb2e api_clone: don't remove storage if we haven't created it
In the best case we'll get errors about failing to remove it.  In the
worst case we'll be trying to delete the original container's rootfs.

Reported-by: zoolook <nbensa+lxcusers@gmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-06-24 13:56:05 -05:00
Serge Hallyn
ae3f8cf9a4 Accept more word delimiters when updating hooks
When updating container names in hook files during a container clone,
we substitute the new container name for the old any time the old name
shows up as a separate word.  This patch adds the four characters
'.,_-' as additional delimiters.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-06-24 13:56:03 -05:00
Dwight Engen
b515981702 console API improvements
Add a higher level console API that opens a tty/console and runs the
mainloop as well. Rename existing API to console_getfd(). Use these in
the python binding.

Allow attaching a console peer after container bootup, including if the
container was launched with -d. This is made possible by allocation of a
"proxy" pty as the peer when the console is attached to.

Improve handling of SIGWINCH: the pty size will be correctly set at the
beginning of a session and future changes when using the lxc_console() API
will be propagated to it as well.

Refactor some common code between lxc_console.c and console.c. The variable
wait4q (renamed to saw_escape) was static, making the mainloop callback not
safe across threads. This wasn't a problem when the callback was in the
non-threaded lxc-console, but now that it is internal to console.c, we have
to take care of it. This is now contained in a per-tty state structure.

Don't attempt to open /dev/null as the console peer since /dev/null cannot
be added to the mainloop (epoll_ctl() fails with EPERM). This isn't needed
to get the console setup (and the log to work) since the case of not having
a peer at console init time has to be handled to allow for attaching to it
later.
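
The EPERM behaviour can be reproduced on Linux with a few lines.
epoll_add_errno() is a hypothetical helper written for this
illustration; it only uses open(), epoll_create1() and epoll_ctl().

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Try to add the file at path to a fresh epoll set.
 * Returns 0 if it was added, otherwise the errno of the failing call.
 */
int epoll_add_errno(const char *path)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int err = 0;
    int epfd, fd;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return errno;

    epfd = epoll_create1(0);
    if (epfd < 0) {
        err = errno;
        close(fd);
        return err;
    }

    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
        err = errno; /* EPERM for files that do not support polling */

    close(epfd);
    close(fd);
    return err;
}
```

Calling epoll_add_errno("/dev/null") is expected to return EPERM, which
is why the console code no longer uses /dev/null as a placeholder peer.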

Move signalfd libc wrapper/replacement to utils.h.

Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-06-12 15:53:08 -05:00
Dwight Engen
f02abefef9 fix check for lock acquired
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2013-06-10 06:47:30 -05:00