to reduce the chances of accidentally handing out privilege modification
privileges. the old default setup of having Permissions.Modify in PVESysAdmin
and PVEAdmin weakened the distinction between those roles and Administrator.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Previously `authentication_verify` just `die`d on error and
would only return a boolean whether `priv/tfa.cfg` needs
updating as a positive result.
Since we want to support locking TOTP as well as a general
TFA lock-out via the config, we also want to be able to tell
when this occurs. Most of it is handled by the TFA rust
crate already, but notifying users needs to be done on this
end instead.
In pve-rs we now have a different API for this:
`authentication_verify2`, which, instead of die()ing on
errors, always returns a hash containing the result as well
as the flags 'tfa-limit-reached' and 'totp-limit-reached'
which, if set, tell us to notify the user.
However, doing so will introduce new fields in the
`priv/tfa.cfg` in a struct marked as `deny_unknown_fields`,
so in a cluster, the limits & notification handling should
only be done once we can be sure that all nodes are up to
date.
These fields are only introduced on login errors, so for
now, handle a failed result early without saving
`priv/tfa.cfg`.
The only case where saving the file was previously required
was when *successfully* logging in with a recovery key, by
which we cannot be reaching a limit, so this should still be
safe.
Once we can validate that all cluster nodes are up to date,
we can implement the notification system.
A commented-out code structure for this is included in this
patch.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
by switching to a tree-based in-memory structure, like we do in PBS.
instead of parsing ACL entries into a hash using the full ACL path as key for
each entry, parse them into a tree-like nested hash. when evaluating ACLs,
iterating over all path prefixes starting at '/' is needed anyway, so this is a
more natural way to store and access the parsed configuration.
some performance data, timing `pveum user permissions $user > /dev/null` for
various amounts of ACL entries in user.cfg
entries | stock | patched | speedup
-------------------------------------
1k | 1.234s | 0.241s | 5.12
2k | 4.480s | 0.262s | 17.09
20k | 7m25s | 0.987s | 450.86
similarly, an /access/ticket request such as the one happening on login goes
down from 4.27s to 109ms with 2k entries (testing with 20k entries fails
because the request times out after 30s, but with the patch it takes 336ms).
the underlying issue is that these two code paths not only iterate over *all
defined ACL paths* to get a complete picture of a user's/token's privileges,
but the fact that that ACL computation for each checked path itself did another
such loop in PVE::AccessControl::roles().
it is enough to iterate over the to-be-checked ACL path in a component-wise
fashion in order to handle role propagation, e.g., when looking at /a/b/c/d,
iterate over
/
/a
/a/b
/a/b/c
/a/b/c/d
in turn instead of all defined ACL paths.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
just above, we check & return if $tfa_challenge is set, so there is no
way that it would be set here. To make it clearer that it must be undef
here pass it as such.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we currently only need to lock the tfa config when we got a recovery key
as a tfa challenge response, since that's the only thing that can
actually change the tfa config (every other method only reads from
there).
so to do that, factor out the code that was inside the lock, and call it
with/without lock depending on the tfa challenge response
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
there is a (hard to trigger) race that can cause a double rotation of
the auth key, with potentially confusing fallout (various processes on
different nodes having an inconsistent view of the current and previous
auth keys, resulting in "random" invalid ticket errors until the next
proper key rotation 24h later).
the underlying cause is that `stat()` calls are excempt from our
otherwise non-cached/direct_io handling of pmxcfs I/O, which allows the
following sequence of events to take place:
LAST: mtime of current auth key
- current epoch advances to LAST + 24h
the following can be arbitrarly interleaved between node A and B:
- LAST+24h node A: pvedaemon/pvestatd on node A calls check_authkey(1)
- LAST+24h node A: it returns 0 (rotation required)
- LAST+24h node A: rotate_key() is called
- LAST+24h node A: cfs_lock_authkey is called
- LAST+24h node B: pvedaemon/pvestatd calls check_authkey(1)
- LAST+24h node B: key is not yet cached in-memory by current process
- LAST+24h node B: key file is opened, stat-ed, read, parsed, and content+mtime
is cached (the kernel will now cache this stat result for 1s unless
the path is opened)
- LAST+24h node B: it returns 0 (rotation required)
- LAST+24h node B: rotate_key() is called
- LAST+24h node B: cfs_lock_authkey is called
the following is mutex-ed via a cfs_lock:
- LAST+24h node A: lock is obtained
- LAST+24h node A: check_authkey() is called
- LAST+24h node A: key is stat-ed, mtime is still (correctly) LAST,
cached mtime and content are returned
- LAST+24h node A: it returns 0 (rotation still required)
- LAST+24h node A: get_pubkey() is called and returns current auth key
- LAST+24h node A: new keypair is generated and persisted
- LAST+24h node A: cfs_lock is released
- LAST+24h node B: changes by node A are processed by pmxcfs
- LAST+24h node B: lock is obtained
- LAST+24h node B: check_authkey() is called
- LAST+24h node B: key is stat-ed, mtime is (incorrectly!) still LAST
since the stat call is handled by the kernel/page cache, not by
pmxcfs, cached mtime and content are returned
- LAST+24h node B: it returns 0 (rotation still required)
- LAST+24h node B: get_pubkey() is called and returns either previous or
key written by node A (depending on whether page cache or pmxcfs
answers stat call)
- LAST+24h node B: new keypair is generated, key returned by last
get_pubkey call is written as old key
the end result is that some nodes and process will treat the key
generated by node A as "current", while others will treat the one
generated by nodoe B as "current". both have the same mtime, so the
in-memory cache hash won't be updated unless the service is restarted or
another rotation happens. depending on who generated the ticket and who
attempts validating it, a ticket might be rejected as invalid even
though the generating party would treat it as valid, and time on all
nodes is properly synced.
there seems to be now way for pmxcfs to pro-actively invalidate the page
cache entry safely (since we'd need to do so while writes to the same
path can happen concurrently), so work around by forcing an open/close
at the (stat) call site which does the work for us. regular reads are
not affected since those already bypass the page cache entirely anyway.
thankfully in almost all cases, the following sequence has enough
synchronization overhead baked in to avoid triggering the issue almost
entirely:
- cfs_lock
- generate key
- create tmp file for old key
- write tmp file
- rename tmp file into proper place
- create tmp file for new pub key
- write tmp file
- rename tmp file into proper place
- create tmp file for new priv key
- write tmp file
- rename tmp file into proper place
- release lock
that being said, there has been at least one report where this was
triggered in the wild from time to time.
it is easy to reproduce by increasing the attr_timeout and entry_timeout
fuse settings inside pmxcfs to increase the time stat results are
treated as valid/retained in the page cache:
-----8<-----
diff --git a/data/src/pmxcfs.c b/data/src/pmxcfs.c
index d78a248..e3e807b 100644
--- a/data/src/pmxcfs.c
+++ b/data/src/pmxcfs.c
@@ -935,7 +935,7 @@ int main(int argc, char *argv[])
mkdir(CFSDIR, 0755);
- char *fa[] = { "-f", "-odefault_permissions", "-oallow_other", NULL};
+ char *fa[] = { "-f", "-odefault_permissions", "-oallow_other", "-oentry_timeout=5", "-oattr_timeout=5", NULL};
struct fuse_args fuse_args = FUSE_ARGS_INIT(sizeof (fa)/sizeof(gpointer) - 1, fa);
----->8-----
in which case it's even easy to trigger more than double rotation in a
bigger test cluster (stopping all PVE services except for pve-cluster
helps avoiding interference):
on a single node:
$ touch --date yesterday /etc/pve/authkey.pub
in parallel (i.e., via tmux synchronized panes):
-----8<-----
use strict;
use warnings;
use PVE::Cluster;
use PVE::AccessControl;
PVE::Cluster::cfs_update();
# ensure page cache entry is there
PVE::AccessControl::check_authkey(1);
PVE::AccessControl::check_authkey(1);
# now attempt rotation
PVE::AccessControl::rotate_authkey();
----->8-----
Thanks to Wolfgang Bumiller for assistance in triaging and exploring
various avenues of fixing.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to detect similar issues to that fixed in the previous commit early on
and without the potential for dangerous side-effects.
root@pam is intentionally still allowed before the check in case such
situations can be triggered by misconfiguration - root@pam can then
still clean up the affected configs via the GUI/API, and not just via
manual editing.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
not that relevant for the user as the daemon auth log already
contains that info, but for token it can be nice.
The API response is always just a plain "401 auth failure" in any
case (expired or wrong creds)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
but don't bail out of the entire auth process, otherwise
not even totp or recovery keys will work anymore in this
case
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
in order to allow subdomains to work, the wa config should
only specify 'id' and 'rp', the 'origin' gets filled in by
the node
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
this happens when the first new tfa entry is added and the
'keys' entry is replaced by "x"
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In PBS we don't support this, so the current TFA API in rust
does not support this either (although the config does know
about its *existence*).
For now, yubico authentication will be done in perl. Adding
it to rust the rust TFA crate would not make much sense
anyway as we'd likely not want to use the same http client
crate in pve and pbs anyway (since pve is all blocking code
and pbs is async...)
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
implements the same api paths as in pbs by forwarding the
api methods to the rust implementation after performing the
product-specific checks
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The main difference here is that we have no separate api
path for verification. Instead, the ticket api call gets an
optional 'tfa-challenge' parameter.
This is opt-in: old pve-manager UI with new
pve-access-control will still work as expected, but users
won't be able to *update* their TFA settings.
Since the new tfa config parser will build a compatible
in-perl tfa config object as well, the old authentication
code is left unchanged for compatibility and will be removed
with pve-8, where the `new-format` parameter in the ticket
call will change its default to `1`.
The `change_tfa` call will simply die in this commit. It is
removed later when adding the pbs-style TFA API calls.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This was missing in commit20c60513b2a6b2d7c7aae0dcc0391889b9cb7ecf,
so user can't assign permisson on a zone currently
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>