This should also fix a memory leak, since we were freeing it under ifdef
but always allocating it.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
The two functions are identical but strrchr also works on Bionic.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
This adds a local ifaddrs implementation to be used on Bionic or other C
libraries that don't come with a getifaddrs implementation.
This code was written by Kenneth MacKay and is under a two-clause BSD
license (copyright information in the file headers).
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
-B dev will check whether btrfs, zfs, or lvm can be used,
in that order, and fall back to dir.
-B lvm,btrfs will try lvm first, then btrfs, then fail.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
1. container hooks should use lxcpath and lxcname from the environment.
2. the utsname now gets separately updated
3. the rootfs path gets updated by the bdev backend.
4. the fstab mount targets should be relative
5. the fstab source directories could be separately updated if needed.
This leaves one definate bug: the lxc.logfile does not get updated.
This made me wonder why it was in the configuration file to begin with.
Digging deeper, I realized that whatever '-o outfile' you give
lxc-create gets set in log.c and gets used by the lxc_container object
we create at write_config(). So if you say
lxc-create -t cirros -n c1 -o /tmp/out1
then /var/lib/lxc/c1/config will have lxc.logfile=/tmp/out1 - which is
clearly wrong. Therefore I leave fixing that for later.
I'm looking for candidates for $p/$n expansion. Note we can't expand
these at config_utsname() etc, because then lxc-clone would see the
expanded variable. So we want to read $p/$n verbatim at config_*(),
and expand them only when they are used. lxc.logfile is an obvious
good use case. lxc.utsname can do it too, in case you want container
c1 to be called "c1-whatever". I'm not sure that's worth it though.
Are there any others, or is that it?
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Define a sha1sum_file() function in utils.c. Use that in lxcapi_create
to write out the sha1sum of the template being used. If libgnutls is
not found, then the template sha1sum simply won't be printed into the
container config.
This patch also trivially fixes some cases where SYSERROR is used after
a fclose (masking errno) and missing consts in mkdir_p.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
If set, then fds 0,1,2 will be redirected while the creation
template is executed.
Note, as Dwight has pointed out, if fd 0 is redirected, then if
templates ask for input there will be a problem. We could simply
not redirect fd 0, or we could require that templates work without
interaction. I'm assuming here that we want to do the latter, but
I'm open to changing that.
Reported-by: "S.Çağlar Onur" <caglar@10ur.org>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
3.10 kernel comes with proper hierarchical enforcement of devices
cgroup. To keep that code somewhat sane, certain things are not
allowed. Switching from default-allow to default-deny and vice versa
are not allowed when there are children cgroups. (This *could* be
simplified in the kernel by checking that all child cgroups are
unpopulated, but that has not yet been done and may be rejected)
The mountcgroup hook causes lxc-start to break with 3.10 kernels, because
you cannot write 'a' to devices.deny once you have a child cgroup. With
this patch, (a) lxcpath is passed to hooks, (b) the cgroup mount hook sets
the container's devices cgroup, and (c) setup_cgroup() during lxc startup
ignores failures to write to devices subsystem if we are already in a
child of the container's new cgroup.
((a) is not really related to this bug, but is definately needed.
The followup work of making the other hooks use the passed-in lxcpath
is still to be done)
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
1. If no template is passed in, then do not try to execute it. The user
just wanted to write the configuration.
2. If template is passed in as a full path, then use that instead of
constructing '$templatedir/lxc-$template'.
Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This hook script updates the hostname in various files under /etc in the
cloned container. In order to do so, the old container name is passed in
the LXC_SRC_NAME environment variable.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
I noticed that if find_first_wholeword() is called with word at the very
beginning of p, we will deref *(p - 1) to see if it is a word boundary.
Fix by considering p = p0 to be a word boundary.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
In the best case we'll get errors about failing to remove it. In the
worst case we'll be trying to delete the original container's rootfs.
Reported-by: zoolook <nbensa+lxcusers@gmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When updating container names in hook files during a container clone,
we substitute the new container name for the old any time the old name
shows up as a separate word. This patch adds the four characters
'.,_-' as additional delimiters.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Add a higher level console API that opens a tty/console and runs the
mainloop as well. Rename existing API to console_getfd(). Use these in
the python binding.
Allow attaching a console peer after container bootup, including if the
container was launched with -d. This is made possible by allocation of a
"proxy" pty as the peer when the console is attached to.
Improve handling of SIGWINCH, the pty size will be correctly set at the
beginning of a session and future changes when using the lxc_console() API
will be propagated to it as well.
Refactor some common code between lxc_console.c and console.c. The variable
wait4q (renamed to saw_escape) was static, making the mainloop callback not
safe across threads. This wasn't a problem when the callback was in the
non-threaded lxc-console, but now that it is internal to console.c, we have
to take care of it. This is now contained in a per-tty state structure.
Don't attempt to open /dev/null as the console peer since /dev/null cannot
be added to the mainloop (epoll_ctl() fails with EPERM). This isn't needed
to get the console setup (and the log to work) since the case of not having
a peer at console init time has to be handled to allow for attaching to it
later.
Move signalfd libc wrapper/replacement to utils.h.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
flock is not supported on nfs. fcntl is at least supported on newer
(v3 and above) nfs.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: zoolook <nbensa+lxcusers@gmail.com>
Create a loopfile backed container by doing:
lxc-create -B loop -t template -n name
or
lxc-clone -B loop -o dir1 -n loop1
The rootfs in the configuration file will be
loop:/var/lib/lxc/loop1/rootdev
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
And use it in place of the various ways we were deducing /etc/lxc/default.conf.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Update the LOCKING comment.
Take mem_lock in want_daemonize.
convert lxcapi_destroy to not use privlock/slock by hand.
Fix a coverity-found potential dereference of NULL c->lxc_conf.
api_cgroup_get_item() and api_cgroup_set_item(): use disklock,
not memlock, since the values are set through the cgroup fs on
the running container.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Those go through commands.c and are already mutex'ed that way.
Also remove a unmatched container_disk_unlock in lxcapi_create.
Since is_stopped uses getstate which is no longer locked, rename
it to drop the _locked suffix.
And convert save_config to taking the disk lock. This way the
save_ and load_config are mutexing each other, as they should.
Changelog: May 29:
Per Dwight's comment, take the lock before opening the config
FILE *.
Only take disklock at load and save_config when we're using the
container's config file, not when read/writing from/to another
file.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Make lxc_cmd_console() return the fd from the socket connection to the
caller. This fd keeps the tty slot allocated until the caller closes
it. Returning the fd allows for a long lived process to close the fd
and reuse consoles.
Add API function for console allocation.
Create test program for console API.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Technically as Dwight has mentioned we should probably drop the locking
from api_state() altogether, since those are protected through the
lxc command system.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
1. implement bdev->create:
python and lua: send NULL for bdevtype and bdevspecs.
They'll want to be updated to pass those in in a way that makes
sense, but I can't think about that right now.
2. templates: pass --rootfs
If the container is backed by a device which must be mounted (i.e.
lvm) then pass the actual rootfs mount destination to the
templates.
Note that the lxc.rootfs can be a mounted block device. The template
should actually be installing the rootfs under the path where the
lxc.rootfs is *mounted*.
Still, some people like to run templates by hand and assume purely
directory backed containers, so continue to support that use case
(i.e. if no --rootfs is listed).
Make sure the templates don't re-write lxc.rootfs if it is
already in the config. (Most were already checking for that)
3. Replace lxc-create script with lxc_create.c program.
Changelog:
May 24: when creating a container, create $lxcpath/$name/partial,
and flock it. When done, close that file and unlink it. In
lxc_container_new() and lxcapi_start(), check for this file. If
it is locked, create is ongoing. If it exists but is not locked,
create() was killed - remove the container.
May 24: dont disk-lock during lxcapi_create. The partial lock
is sufficient.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This requires implementing bdev->ops->destroy() for each of the backing
store types. Then implementing lxcapi_clone(), writing lxc_destroy.c
using the api, and removing the lxc-destroy.in script.
(this also has a few other cleanups, like marking some functions
static)
Changelog:
fold into destroy: fix zfs destroy
destroy: use correct program name in help
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
implement c->reboot(c) in the api.
Also if the container is not running, return -2. Currently
lxc-stop will return 0, so you cannot tell the difference
between successfull stopping and noop.
Per stgraber's email:
- Remove lxc-shutdown
- Change lxc-stop so that:
* Default behaviour is to call shutdown(), wait 15s for STOPPED, if
not STOPPED, print a message to the user and call stop() [ NOTE:
actually 60 seconds per followup thread]
* We have a -r option to reboot the container (with proper check that
the container indeed rebooted within the next 15s)
* We have a -s option to shutdown the container without the automatic
fallback to stop()
* Add a -k option allowing a user to just kill a container
(equivalent to old lxc-stop, no shutdown() call and no delay).
and update manpages.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Create three pairs of functions:
int process_lock(void);
void process_unlock(void);
int container_mem_lock(struct lxc_container *c)
void container_mem_unlock(struct lxc_container *c)
int container_disk_lock(struct lxc_container *c);
void container_disk_unlock(struct lxc_container *c);
and use those in lxccontainer.c
process_lock() is to protect the process state among multiple threads.
container_mem_lock() is to protect a struct container among multiple
threads. container_disk_lock is to protect a container on disk.
Also remove the lock in lxcapi_init_pid() as Dwight suggested.
Fix a typo (s/container/contain) spotted by Dwight.
More locking fixes are needed, but let's first the the fundamentals
right. How close does this get us?
Changelog: v2:
fix lxclock compile
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
The problem: if a task is killed while holding a posix semaphore,
there appears to be no way to have the semaphore be reliably
autmoatically released. The only trick which seemed promising
is to store the pid of the lock holder in some file and have
later lock seekers check whether that task has died.
Instead of going down that route, this patch switches from a
named posix semaphore to flock. The advantage is that when
the task is killed, its fds are closed and locks are automatically
released.
The disadvantage of flock is that we can't rely on it to exclude
threads. Therefore c->slock must now always be wrapped inside
c->privlock.
This patch survived basic testing with the lxcapi_create patchset,
where now killing lxc-create while it was holding the lock did
not lock up future api commands.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
There were several memory leaks in the cgroup functions, notably in the
success cases.
The cgpath test program was refactored and additional tests added to it.
It was used in various modes under valgrind to test that the leaks were
fixed.
Simplify lxc_cgroup_path_get() and cgroup_path_get by having them return a
char * instead of an int and an output char * argument. The only return
values ever used were -1 and 0, which are now handled with NULL and non-NULL
returns respectively.
Use consistent variable names of cgabspath when refering to an absolute path
to a cgroup subsystem or file, and cgrelpath when refering to a container
"group/name" within the cgroup heirarchy.
Remove unused subsystem argument to lxc_cmd_get_cgroup_path().
Remove unused #define MAXPRIOLEN
Make template arg to lxcapi_create() const
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This fixes the build of lxccontainer.c on systems that have __NR_setns
but not HAVE_SETNS.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Motivation for this change is to have the ability to get the run-time
configuration items from a container, which may differ from its current
on disk configuration, or might not be available any other way (for
example lxc.network.0.veth.pair). In adding this ability it seemed there
was room for refactoring improvements.
Genericize the command infrastructure so that both command requests and
responses can have arbitrary data. Consolidate all commands into command.c
and name them consistently. This allows all the callback routines to be
made static, reducing exposure.
Return the actual allocated tty for the console command. Don't print the
init pid in lxc_info if the container isn't actually running. Command
processing was made more thread safe by removing the static buffer from
receive_answer(). Refactored command response code to a common routine.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This adds a new get_ips call which takes a family (inet, inet6 or NULL),
a network interface (or NULL for all) and a scope (0 for global) and returns
a char** of all the IPs in the container.
This also adds a matching python3 binding (function result is a tuple) and
deprecates the previous pure-python get_ips() implementation.
WARNING: The python get_ips() call is quite different from the previous
implementation. The timeout argument has been removed, the family names are
slightly different (inet/inet6 vs ipv4/ipv6) and an extra scope parameter
has been added.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Add a clone hook called from api_clone. Pass arguments to it from
lxc_clone.c.
The clone update hook is called while the container's bdev is mounted.
Information about the container is passed in through environment
variables LXC_ROOTFS_PATH, LXC_NAME, The LXC_ROOTFS_MOUNT, and
LXC_CONFIG_FILE.
LXC_ROOTFS_MOUNT=/usr/lib/x86_64-linux-gnu/lxc
LXC_CONFIG_FILE=/var/lib/lxc/demo3/config
LXC_ROOTFS_PATH=/var/lib/lxc/demo3/rootfs
LXC_NAME=demo3
So from the hook, updates to the container should be made under
$LXC_ROOTFS_MOUNT/ .
The hook also receives command line arguments as follows:
First argument is container name, second is always 'lxc', third
is the hook name (always clone), then come the arguments which
were passed to lxc-clone. I.e. when I did:
sudo lxc-clone demo2 demo3 -- hey there dude
the arguments passed in were "demo3 lxc clone hey there dude"
I personally would like to drop the first two arguments. The
name is available as $LXC_NAME, and the section argument ('lxc')
is meaningless. However, doing so risks invalidating existing
hooks.
Soon analogous create and destroy hooks will be added as well.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
The problem is that the fd table is shared between threads and if a thread
forks() while another thread has an open fd to the monitor, the duped fd
in the fork()ed child will not get closed, thus causing monitord to stay
around since it thinks it still has a client. This only happened when
calling lxcapi_start() in the daemonized case since that is the only time
we try to get the status from the monitor.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
The check for flen < 0 could never have been true since flen was declared
to be size_t (unsigned). Declare flen to be long since that is what ftell
returns.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
clean up error case in clone, which in particular could cause double
lxc_container_put(c2)
for overlayfs, handle (with error message) all bdev types.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
It's a tiny program (exported through the api) wrapping the util.c
helpers for reading /etc/lxc/lxc.conf variables, and replaces
the kludgy shell duplication in lxc.functions.in
Changelog: Apr 30: address feedback from Dwight
(exit error on failure, and use 'lxcpath' as name, not
'default_path').
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
1. commonize waitpid users to use a single helper. We frequently want
to run something in a clean namespace, or fork off a script. This
lets us keep the function doing fork:(1)exec(2)waitpid simpler.
2. start a blockdev backend implementation. This will be used for
mounting, copying, and snapshotting container filesystems.
3. implement btrfs, lvm, directory, and overlayfs backends.
4. For overlayfs, support a new lxc.rootfs format of
'bdevtype:<extra>'. This means you can now use overlayfs-based
containers without using lxc-start-ephemeral, by using
lxc.rootfs = overlayfs:/readonly-dir:writeable-dir
5. add a set of simple clone testcases
6. Write a new lxc_clone.c based on api clone.
Still to do (there's more, but off top of my head):
1. support zfs, aufs
2. have clone handle other mount entries (right now it only clones
the rootfs)
3. python, lua, and go bindings (not me :)
4. lxc-destroy: if lvm backing store, check for snapshots of it.
(what about directories which have overlayfs clones?)
Changes since v2:
Initialize random generator when picking new macaddr (reported
by caglar@10ur.org)
Fix wrong use of bitmask flags
On copy-clone of btrfs, create a subvolume
lxc_clone.c: respect the command line usage of the old script
lxc-clone(1): update documentation
Refuse to try changing backing stores expect to overlayfs, as
it is not implemented (yet) anyway.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Conflicts:
src/lxc/utils.h
On Fri, 26 Apr 2013 10:18:12 -0500
Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> Quoting Dwight Engen (dwight.engen@oracle.com):
> > On Fri, 26 Apr 2013 09:37:49 -0500
> > Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> >
> > > Quoting Dwight Engen (dwight.engen@oracle.com):
> > > > Using lxc configured with --enable-configpath-log, and
> > > > specifying a path to the lxc commands with -P, the log file
> > > > path is generated with a basename of LOGPATH instead of the
> > > > lxcpath. This means for example if you do
> > > >
> > > > lxc-start -P /tmp/containers -n test01 -l INFO
> > > >
> > > > your log file will be
> > > >
> > > > /var/lib/lxc/test01/test01.log
> > > >
> > > > I was expecting the log to be /tmp/containers/test01/test01.log.
> > > > This is particularly confusing if you also have test01 on the
> > > > regular lxcpath. The patch below changes the log file path to be
> > > > based on lxcpath rather than LOGPATH when lxc is configured with
> > > > --enable-configpath-log.
> > > >
> > > > I think that even in the normal non --enable-configpath-log case
> > > > we should consider using lxcpath as the base and not having
> > > > LOGPATH at all, as attempting to create the log files
> > > > in /var/log is not going to work for regular users on their own
> > > > lxcpath. If we want that, I'll update the patch to do that as
> > > > well.
> > >
> > >
> > > Perhaps we should do:
> > >
> > > 1. If lxcpath == default_lxc_path(), then first choice is
> > > LOGPATH, second is lxcpath/container.log
> > > 2. when opening, if first choice fails, use second choice
> > > if there is any.
> > >
> > > That way 'system' containers will go to /var/log/lxc, as I think
> > > they should. Custom-lxcpath containers should never go
> > > to /var/log/lxc, since their names could be dups of containers in
> > > default_lxc_path(). And if the system is a weird one where
> > > default_lxc_path is set up so that an unprivileged user can use
> > > it, then we should log into $lxcpath.
> >
> > That sounds good to me. So these rules would apply in both the
> > regular and --enable-configpath-log cases.
I updated the patch to try to open the log file according to the
choices given above. Along the way I cleaned up log.c a bit, making
some things static, grouping external interfaces together, etc...
Hopefully that doesn't add too much noise.
> > > (Note this patch will trivially conflict with my new lxc_clone.c
> > > causing it to fail to build - unfortunate result of timing)
> >
> > Yeah unfortunately this touches every lxc_log_init() caller. I can
> > work on the above logic and re-submit after your new lxc_clone
> > stuff goes in.
>
> No no, I'll just need to remember to update mine. Don't hold up on
> mine, this is just the nature of such collaboration :)
>
> > Did you have any thoughts on the XXX what to pass in for lxcpath in
> > lxc_init? Right now it just falls back to LOGPATH.
>
> No - that's a weird one, since lxc_init runs in the container. If
> there were only system containers I'd say always use LOGPATH.
> However there are people (apparently :) who use container sharing the
> host's rootfs...
>
> lxc-execute does know the lxcpath. Perhaps we can simply have
> src/lxc/execute.c:execute_start() look at handler->conf to see if a
> rootfs is set. If rootfs is NOT set, then pass lxcpath along to
> lxc-init. Then lxc-init can mostly do the same as the others? (It
> doesn't use src/lxc/arguments.c, so you'd have to add lxcpath to
> options[] in lxc-init.c)
So I did this, only to realize that lxc-init is passing "none" for the
file anyway, so it currently doesn't intend to log. This makes me
think that passing NULL for lxcpath is the right thing to do in
this patch. If you want me to make it so lxc-init can log, I can do
that but I think it should be in a different change :)
--
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This fixes a long standing issue that there could only be a single
lxc-monitor per container.
With this change, a new lxc-monitord daemon is spawned the first time
lxc-monitor is called against the container and will accept connections
from any subsequent lxc-monitor.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>