linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-08-27 06:50:37 +00:00

Author	SHA1	Message	Date
Steven Chen	9f0ec4b16f	ima: kexec: move IMA log copy from kexec load to execute The IMA log is currently copied to the new kernel during kexec 'load' using ima_dump_measurement_list(). However, the IMA measurement list copied at kexec 'load' may result in loss of IMA measurements records that only occurred after the kexec 'load'. Move the IMA measurement list log copy from kexec 'load' to 'execute' Make the kexec_segment_size variable a local static variable within the file, so it can be accessed during both kexec 'load' and 'execute'. Define kexec_post_load() as a wrapper for calling ima_kexec_post_load() and machine_kexec_post_load(). Replace the existing direct call to machine_kexec_post_load() with kexec_post_load(). When there is insufficient memory to copy all the measurement logs, copy as much of the measurement list as possible. Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-29 15:54:54 -04:00
Steven Chen	f18e502db6	ima: kexec: define functions to copy IMA log at soft boot The IMA log is currently copied to the new kernel during kexec 'load' using ima_dump_measurement_list(). However, the log copied at kexec 'load' may result in loss of IMA measurements that only occurred after kexec "load'. Setup the needed infrastructure to move the IMA log copy from kexec 'load' to 'execute'. Define a new IMA hook ima_update_kexec_buffer() as a stub function. It will be used to call ima_dump_measurement_list() during kexec 'execute'. Implement ima_kexec_post_load() function to be invoked after the new Kernel image has been loaded for kexec. ima_kexec_post_load() maps the IMA buffer to a segment in the newly loaded Kernel. It also registers the reboot notifier_block to trigger ima_update_kexec_buffer() at kexec 'execute'. Set the priority of register_reboot_notifier to INT_MIN to ensure that the IMA log copy operation will happen at the end of the operation chain, so that all the IMA measurement records extended into the TPM are copied Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-29 15:54:54 -04:00
Steven Chen	9ee8888a80	ima: kexec: skip IMA segment validation after kexec soft reboot Currently, the function kexec_calculate_store_digests() calculates and stores the digest of the segment during the kexec_file_load syscall, where the IMA segment is also allocated. Later, the IMA segment will be updated with the measurement log at the kexec execute stage when a kexec reboot is initiated. Therefore, the digests should be updated for the IMA segment in the normal case. The problem is that the content of memory segments carried over to the new kernel during the kexec systemcall can be changed at kexec 'execute' stage, but the size and the location of the memory segments cannot be changed at kexec 'execute' stage. To address this, skip the calculation and storage of the digest for the IMA segment in kexec_calculate_store_digests() so that it is not added to the purgatory_sha_regions. With this change, the IMA segment is not included in the digest calculation, storage, and verification. Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm [zohar@linux.ibm.com: Fixed Signed-off-by tag to match author's email ] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-29 15:54:53 -04:00
Steven Chen	c95e1acb6d	ima: define and call ima_alloc_kexec_file_buf() In the current implementation, the ima_dump_measurement_list() API is called during the kexec "load" phase, where a buffer is allocated and the measurement records are copied. Due to this, new events added after kexec load but before kexec execute are not carried over to the new kernel during kexec operation Carrying the IMA measurement list across kexec requires allocating a buffer and copying the measurement records. Separate allocating the buffer and copying the measurement records into separate functions in order to allocate the buffer at kexec 'load' and copy the measurements at kexec 'execute'. After moving the vfree() here at this stage in the patch set, the IMA measurement list fails to verify when doing two consecutive "kexec -s -l" with/without a "kexec -s -u" in between. Only after "ima: kexec: move IMA log copy from kexec load to execute" the IMA measurement list verifies properly with the vfree() here. Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-29 15:54:53 -04:00
Steven Chen	cb5052282c	ima: rename variable the seq_file "file" to "ima_kexec_file" Before making the function local seq_file "file" variable file static global, rename it to "ima_kexec_file". Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-29 15:54:53 -04:00
Linus Torvalds	30e268185e	Landlock fix for v6.15-rc4 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCaAp+shAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSAukA/1FkHkyqXaiMYy2s4tlwvpjWMBP11gEUzyYM j1HIn3rOAQDUT1V2q+Ff5ZVk4hsCZ8pJMTmrASSL/CHw1yfQHtzQAw== =5RX1 -----END PGP SIGNATURE----- Merge tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fix some Landlock audit issues, add related tests, and updates documentation" * tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Update log documentation landlock: Fix documentation for landlock_restrict_self(2) landlock: Fix documentation for landlock_create_ruleset(2) selftests/landlock: Add PID tests for audit records selftests/landlock: Factor out audit fixture in audit_test landlock: Log the TGID of the domain creator landlock: Remove incorrect warning	2025-04-24 12:59:05 -07:00
Jakub Kicinski	5565acd1e6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.15-rc4). This pull includes wireless and a fix to vxlan which isn't in Linus's tree just yet. The latter creates with a silent conflict / build breakage, so merging it now to avoid causing problems. drivers/net/vxlan/vxlan_vnifilter.c `094adad913` ("vxlan: Use a single lock to protect the FDB table") `087a9eb9e5` ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry") https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com No "normal" conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-24 11:20:52 -07:00
Song Liu	74e5b13a1b	lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORK security_netlink_send() is a networking hook, so it fits better under CONFIG_SECURITY_NETWORK. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-22 15:34:58 -04:00
Frederick Lawler	30d68cb0c3	ima: process_measurement() needlessly takes inode_lock() on MAY_READ On IMA policy update, if a measure rule exists in the policy, IMA_MEASURE is set for ima_policy_flags which makes the violation_check variable always true. Coupled with a no-action on MAY_READ for a FILE_CHECK call, we're always taking the inode_lock(). This becomes a performance problem for extremely heavy read-only workloads. Therefore, prevent this only in the case there's no action to be taken. Signed-off-by: Frederick Lawler <fred@cloudflare.com> Acked-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-04-22 16:39:32 +02:00
Mickaël Salaün	25b1fc1cdc	landlock: Fix documentation for landlock_restrict_self(2) Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s flags documentation. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-04-17 11:09:10 +02:00
Mickaël Salaün	50492f942c	landlock: Fix documentation for landlock_create_ruleset(2) Move and fix the flags documentation, and improve formatting. It makes more sense and it eases maintenance to document syscall flags in landlock.h, where they are defined. This is already the case for landlock_restrict_self(2)'s flags. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-04-17 11:09:07 +02:00
Kees Cook	f5c68a4e84	hardening: Disable GCC randstruct for COMPILE_TEST There is a GCC crash bug in the randstruct for latest GCC versions that is being tickled by landlock[1]. Temporarily disable GCC randstruct for COMPILE_TEST builds to unbreak CI systems for the coming -rc2. This can be restored once the bug is fixed. Suggested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/all/20250407-kbuild-disable-gcc-plugins-v1-1-5d46ae583f5e@kernel.org/ [1] Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250409151154.work.872-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>	2025-04-15 13:50:17 -07:00
Paul Moore	05f1a93922	selinux: fix the kdoc header for task_avdcache_update The kdoc header incorrectly references an older parameter, update it to reference what is currently used in the function. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504122308.Ch8PzJdD-lkp@intel.com/ Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-12 11:37:06 -04:00
Paul Moore	1ec31f14a8	selinux: remove a duplicated include The "linux/parser.h" header was included twice, we only need it once. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504121945.Q0GDD0sG-lkp@intel.com/ Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-12 11:31:46 -04:00
Kuniyuki Iwashima	2a63dd0edf	net: Retire DCCP socket. DCCP was orphaned in 2021 by commit `054c4610bd` ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS"), which noted that the last maintainer had been inactive for five years. In recent years, it has become a playground for syzbot, and most changes to DCCP have been odd bug fixes triggered by syzbot. Apart from that, the only changes have been driven by treewide or networking API updates or adjustments related to TCP. Thus, in 2023, we announced we would remove DCCP in 2025 via commit `b144fcaf46` ("dccp: Print deprecation notice."). Since then, only one individual has contacted the netdev mailing list. [0] There is ongoing research for Multipath DCCP. The repository is hosted on GitHub [1], and development is not taking place through the upstream community. While the repository is published under the GPLv2 license, the scheduling part remains proprietary, with a LICENSE file [2] stating: "This is not Open Source software." The researcher mentioned a plan to address the licensing issue, upstream the patches, and step up as a maintainer, but there has been no further communication since then. Maintaining DCCP for a decade without any real users has become a burden. Therefore, it's time to remove it. Removing DCCP will also provide significant benefits to TCP. It allows us to freely reorganize the layout of struct inet_connection_sock, which is currently shared with DCCP, and optimize it to reduce the number of cachelines accessed in the TCP fast path. Note that we keep DCCP netfilter modules as requested. [3] Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0] Link: https://github.com/telekom/mp-dccp #[1] Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2] Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux) Acked-by: Casey Schaufler <casey@schaufler-ca.com> Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-11 18:58:10 -07:00
Paul Moore	5d7ddc59b3	selinux: reduce path walk overhead Reduce the SELinux performance overhead during path walks through the use of a per-task directory access cache and some minor code optimizations. The directory access cache is per-task because it allows for a lockless cache while also fitting well with a common application pattern of heavily accessing a relatively small number of SELinux directory labels. The cache is inherited by child processes when the child runs with the same SELinux domain as the parent, and invalidated on changes to the task's SELinux domain or the loaded SELinux policy. A cache of four entries was chosen based on testing with the Fedora "targeted" policy, a SELinux Reference Policy variant, and 'make allmodconfig' on Linux v6.14. Code optimizations include better use of inline functions to reduce function calls in the common case, especially in the inode revalidation code paths, and elimination of redundant checks between the LSM and SELinux layers. As mentioned briefly above, aside from general use and regression testing with the selinux-testsuite, performance was measured using 'make allmodconfig' with Linux v6.14 as a base reference. As expected, there were variations from one test run to another, but the measurements below are a good representation of the test results seen on my test system. * Linux v6.14 REF 1.26% [k] __d_lookup_rcu SELINUX (1.31%) 0.58% [k] selinux_inode_permission 0.29% [k] avc_lookup 0.25% [k] avc_has_perm_noaudit 0.19% [k] __inode_security_revalidate * Linux v6.14 + patch REF 1.41% [k] __d_lookup_rcu SELINUX (0.89%) 0.65% [k] selinux_inode_permission 0.15% [k] avc_lookup 0.05% [k] avc_has_perm_noaudit 0.04% [k] avc_policy_seqno X.XX% [k] __inode_security_revalidate (now inline) In both cases the __d_lookup_rcu() function was used as a reference point to establish a context for the SELinux related functions. On a unpatched Linux v6.14 system we see the time spent in the combined SELinux functions exceeded that of __d_lookup_rcu(), 1.31% compared to 1.26%. However, with this patch applied the time spent in the combined SELinux functions dropped to roughly 65% of the time spent in __d_lookup_rcu(), 0.89% compared to 1.41%. Aside from the significant decrease in time spent in the SELinux AVC, it appears that any additional time spent searching and updating the cache is offset by other code improvements, e.g. time spent in selinux_inode_permission() + __inode_security_revalidate() + avc_policy_seqno() is less on the patched kernel than the unpatched kernel. It is worth noting that in this patch the use of the per-task cache is limited to the security_inode_permission() LSM callback, selinux_inode_permission(), but future work could expand the cache into inode_has_perm(), likely through consolidation of the two functions. While this would likely have little to no impact on path walks, it may benefit other operations. Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:41:31 -04:00
Takaya Saeki	8716451a4e	selinux: support wildcard match in genfscon Currently, genfscon only supports string prefix match to label files. Thus, labeling numerous dynamic sysfs entries requires many specific path rules. For example, labeling device paths such as `/sys/devices/pci0000:00/0000:00:03.1/<...>/0000:04:00.1/wakeup` requires listing all specific PCI paths, which is challenging to maintain. While user-space restorecon can handle these paths with regular expression rules, relabeling thousands of paths under sysfs after it is mounted is inefficient compared to using genfscon. This commit adds wildcard matching to genfscon to make rules more efficient and expressive. This new behavior is enabled by genfs_seclabel_wildcard capability. With this capability, genfscon does wildcard matching instead of prefix matching. When multiple wildcard rules match against a path, then the longest rule (determined by the length of the rule string) will be applied. If multiple rules of the same length match, the first matching rule encountered in the given genfscon policy will be applied. Users are encouraged to write longer, more explicit path rules to avoid relying on this behavior. This change resulted in nice real-world performance improvements. For example, boot times on test Android devices were reduced by 15%. This improvement is due to the elimination of the "restorecon -R /sys" step during boot, which takes more than two seconds in the worst case. Signed-off-by: Takaya Saeki <takayas@chromium.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:36:34 -04:00
Christian Göttsche	4926c3fd83	selinux: drop copy-paste comment Port labeling is based on port number and protocol (TCP/UDP/...) but not based on network family (IPv4/IPv6). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:32:07 -04:00
Christian Göttsche	cde3b1b66f	selinux: unify OOM handling in network hashtables For network objects, like interfaces, nodes, port and InfiniBands, the object to SID lookup is cached in hashtables. OOM during such hashtable additions of new objects is considered non-fatal and the computed SID is simply returned without adding the compute result into the hash table. Actually ignore OOM in the InfiniBand code, despite the comment already suggesting to do so. This reverts commit `c350f8bea2` ("selinux: Fix error return code in sel_ib_pkey_sid_slow()"). Add comments in the other places. Use kmalloc() instead of kzalloc(), since all members are initialized on success and the data is only used in internbal hash tables, so no risk of information leakage to userspace. Fixes: `c350f8bea2` ("selinux: Fix error return code in sel_ib_pkey_sid_slow()") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:29:51 -04:00
Christian Göttsche	e6fb56b225	selinux: add likely hints for fast paths In the network hashtable lookup code add likely() compiler hints in the fast path, like already done in sel_netif_sid(). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:29:51 -04:00
Christian Göttsche	9cc034be10	selinux: contify network namespace pointer The network namespace is not modified. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:29:51 -04:00
Christian Göttsche	47a1a15645	selinux: constify network address pointer The network address, either an IPv4 or IPv6 one, is not modified. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-04-11 16:29:50 -04:00
Mickaël Salaün	4767af82a0	landlock: Log the TGID of the domain creator As for other Audit's "pid" fields, Landlock should use the task's TGID instead of its TID. Fix this issue by keeping a reference to the TGID of the domain creator. Existing tests already check for the PID but only with the thread group leader, so always the TGID. A following patch adds dedicated tests for non-leader thread. Remove the current_real_cred() check which does not make sense because we only reference a struct pid, whereas a previous version did reference a struct cred instead. Cc: Christian Brauner <brauner@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250410171725.1265860-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-04-11 12:53:17 +02:00
Mickaël Salaün	fe81536af3	landlock: Remove incorrect warning landlock_put_hierarchy() can be called when an error occurs in landlock_merge_ruleset() due to insufficient memory. In this case, the domain's audit details might not have been allocated yet, which would cause landlock_free_hierarchy_details() to print a warning (but still safely handle this case). We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for this kind of function, so let's remove it entirely. Cc: Paul Moore <paul@paul-moore.com> Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net Reviewed-by: Günther Noack <gnoack@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-04-08 19:18:20 +02:00
NeilBrown	06c567403a	Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS try_lookup_noperm() and d_hash_and_lookup() are nearly identical. The former does some validation of the name where the latter doesn't. Outside of the VFS that validation is likely valuable, and having only one exported function for this task is certainly a good idea. So make d_hash_and_lookup() local to VFS files and change all other callers to try_lookup_noperm(). Note that the arguments are swapped. Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250319031545.2999807-6-neil@brown.name Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-04-08 11:24:41 +02:00
NeilBrown	fa6fe07d15	VFS: rename lookup_one_len family to lookup_noperm and remove permission check The lookup_one_len family of functions is (now) only used internally by a filesystem on itself either - in a context where permission checking is irrelevant such as by a virtual filesystem populating itself, or xfs accessing its ORPHANAGE or dquota accessing the quota file; or - in a context where a permission check (MAY_EXEC on the parent) has just been performed such as a network filesystem finding in "silly-rename" file in the same directory. This is also the context after the _parentat() functions where currently lookup_one_qstr_excl() is used. So the permission check is pointless. The name "one_len" is unhelpful in understanding the purpose of these functions and should be changed. Most of the callers pass the len as "strlen()" so using a qstr and QSTR() can simplify the code. This patch renames these functions (include lookup_positive_unlocked() which is part of the family despite the name) to have a name based on "lookup_noperm". They are changed to receive a 'struct qstr' instead of separate name and len. In a few cases the use of QSTR() results in a new call to strlen(). try_lookup_noperm() takes a pointer to a qstr instead of the whole qstr. This is consistent with d_hash_and_lookup() (which is nearly identical) and useful for lookup_noperm_unlocked(). The new lookup_noperm_common() doesn't take a qstr yet. That will be tidied up in a subsequent patch. Signed-off-by: NeilBrown <neil@brown.name> Link: https://lore.kernel.org/r/20250319031545.2999807-5-neil@brown.name Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-04-08 11:24:36 +02:00
Jeff Xu	5796d3967c	mseal sysmap: kernel config and header change Patch series "mseal system mappings", v9. As discussed during mseal() upstream process [1], mseal() protects the VMAs of a given virtual memory range against modifications, such as the read/write (RW) and no-execute (NX) bits. For complete descriptions of memory sealing, please see mseal.rst [2]. The mseal() is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. The system mappings are readonly only, memory sealing can protect them from ever changing to writable or unmmap/remapped as different attributes. System mappings such as vdso, vvar, vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), are created by the kernel during program initialization, and could be sealed after creation. Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [3]. It could be sealed from creation. The vsyscall on x86-64 uses a special address (0xffffffffff600000), which is outside the mm managed range. This means mprotect, munmap, and mremap won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's security, it is skipped in this patch. If we ever seal the vsyscall, it is probably only for decorative purpose, i.e. showing the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored. It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the system mappings during restore operations. UML(User Mode Linux) and gVisor, rr are also known to change the vdso/vvar mappings. Consequently, this feature cannot be universally enabled across all systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default. To support mseal of system mappings, architectures must define CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special mappings calls to pass mseal flag. Additionally, architectures must confirm they do not unmap/remap system mappings during the process lifetime. The existence of this flag for an architecture implies that it does not require the remapping of thest system mappings during process lifetime, so sealing these mappings is safe from a kernel perspective. This version covers x86-64 and arm64 archiecture as minimum viable feature. While no specific CPU hardware features are required for enable this feature on an archiecture, memory sealing requires a 64-bit kernel. Other architectures can choose whether or not to adopt this feature. Currently, I'm not aware of any instances in the kernel code that actively munmap/mremap a system mapping without a request from userspace. The PPC does call munmap when _install_special_mapping fails for vdso; however, it's uncertain if this will ever fail for PPC - this needs to be investigated by PPC in the future [4]. The UML kernel can add this support when KUnit tests require it [5]. In this version, we've improved the handling of system mapping sealing from previous versions, instead of modifying the _install_special_mapping function itself, which would affect all architectures, we now call _install_special_mapping with a sealing flag only within the specific architecture that requires it. This targeted approach offers two key advantages: 1) It limits the code change's impact to the necessary architectures, and 2) It aligns with the software architecture by keeping the core memory management within the mm layer, while delegating the decision of sealing system mappings to the individual architecture, which is particularly relevant since 32-bit architectures never require sealing. Prior to this patch series, we explored sealing special mappings from userspace using glibc's dynamic linker. This approach revealed several issues: - The PT_LOAD header may report an incorrect length for vdso, (smaller than its actual size). The dynamic linker, which relies on PT_LOAD information to determine mapping size, would then split and partially seal the vdso mapping. Since each architecture has its own vdso/vvar code, fixing this in the kernel would require going through each archiecture. Our initial goal was to enable sealing readonly mappings, e.g. .text, across all architectures, sealing vdso from kernel since creation appears to be simpler than sealing vdso at glibc. - The [vvar] mapping header only contains address information, not length information. Similar issues might exist for other special mappings. - Mappings like uprobe are not covered by the dynamic linker, and there is no effective solution for them. This feature's security enhancements will benefit ChromeOS, Android, and other high security systems. Testing: This feature was tested on ChromeOS and Android for both x86-64 and ARM64. - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, i.e. "sl" shown in the smaps for those mappings, and mremap is blocked. - Passing various automation tests (e.g. pre-checkin) on ChromeOS and Android to ensure the sealing doesn't affect the functionality of Chromebook and Android phone. I also tested the feature on Ubuntu on x86-64: - With config disabled, vdso/vvar is not sealed, - with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, normal operations such as browsing the web, open/edit doc are OK. Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1] Link: Documentation/userspace-api/mseal.rst [2] Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3] Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4] Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5] This patch (of 7): Provide infrastructure to mseal system mappings. Establish two kernel configs (CONFIG_MSEAL_SYSTEM_MAPPINGS, ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS) and VM_SEALED_SYSMAP macro for future patches. Link: https://lkml.kernel.org/r/20250305021711.3867874-1-jeffxu@google.com Link: https://lkml.kernel.org/r/20250305021711.3867874-2-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-04-01 15:17:14 -07:00
Linus Torvalds	2cd5769fb0	Driver core updates for 6.15-rc1 Here is the big set of driver core updates for 6.15-rc1. Lots of stuff happened this development cycle, including: - kernfs scaling changes to make it even faster thanks to rcu - bin_attribute constify work in many subsystems - faux bus minor tweaks for the rust bindings - rust binding updates for driver core, pci, and platform busses, making more functionaliy available to rust drivers. These are all due to people actually trying to use the bindings that were in 6.14. - make Rafael and Danilo full co-maintainers of the driver core codebase - other minor fixes and updates. This has been in linux-next for a while now, with the only reported issue being some merge conflicts with the rust tree. Depending on which tree you pull first, you will have conflicts in one of them. The merge resolution has been in linux-next as an example of what to do, or can be found here: https://lore.kernel.org/r/CANiq72n3Xe8JcnEjirDhCwQgvWoE65dddWecXnfdnbrmuah-RQ@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ+mMrg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ylRgwCdH58OE3BgL0uoFY5vFImStpmPtqUAoL5HpVWI jtbJ+UuXGsnmO+JVNBEv =gy6W -----END PGP SIGNATURE----- Merge tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updatesk from Greg KH: "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff happened this development cycle, including: - kernfs scaling changes to make it even faster thanks to rcu - bin_attribute constify work in many subsystems - faux bus minor tweaks for the rust bindings - rust binding updates for driver core, pci, and platform busses, making more functionaliy available to rust drivers. These are all due to people actually trying to use the bindings that were in 6.14. - make Rafael and Danilo full co-maintainers of the driver core codebase - other minor fixes and updates" * tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits) rust: platform: require Send for Driver trait implementers rust: pci: require Send for Driver trait implementers rust: platform: impl Send + Sync for platform::Device rust: pci: impl Send + Sync for pci::Device rust: platform: fix unrestricted &mut platform::Device rust: pci: fix unrestricted &mut pci::Device rust: device: implement device context marker rust: pci: use to_result() in enable_device_mem() MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers rust/kernel/faux: mark Registration methods inline driver core: faux: only create the device if probe() succeeds rust/faux: Add missing parent argument to Registration::new() rust/faux: Drop #[repr(transparent)] from faux::Registration rust: io: fix devres test with new io accessor functions rust: io: rename `io::Io` accessors kernfs: Move dput() outside of the RCU section. efi: rci2: mark bin_attribute as __ro_after_init rapidio: constify 'struct bin_attribute' firmware: qemu_fw_cfg: constify 'struct bin_attribute' powerpc/perf/hv-24x7: Constify 'struct bin_attribute' ...	2025-04-01 11:02:03 -07:00
Linus Torvalds	fa593d0f96	bpf-next-6.15 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmfi6ZAACgkQ6rmadz2v bTpLOg/+J7xUddPMhlpFAUlifQEadE5hmw6v1tXpM3zyKHzUWJiv/qsx3j8/ckgD D+d4P8bqIbI9SSuIS4oZ0+D9pr/g7GYztnoYZmPiYJ7v2AijPuof5dsagFQE8E2y rhfbt9KHTMzzkdkTvaAZaITS/HWAoJ2YVRB6gfLex2ghcXYHcgmtKRZniQrbBiFZ MIXBN8Rg6HP+pUdIVllSXFcQCb3XIgjPONRAos4hr5tIm+3Ku7Jvkgk2H/9vUcoF bdXAcg8xygyH7eY+1l3e7nEPQlG0jUZEsL+tq+vpdoLRLqlIpAUYmwUvqcmq4dPS QGFjiUcpDbXlxsUFpzjXHIFto7fXCfND7HEICQPwAncdflIIfYaATSQUfkEexn0a wBCFlAChrEzAmg2vFl4EeEr0fdSe/3jswrgKx0m6ctKieMjgloBUeeH4fXOpfkhS 9tvhuduVFuronlebM8ew4w9T/mBgbyxkE5KkvP4hNeB3ni3N0K6Mary5/u2HyN1e lqTlnZxRA4p6lrvxce/mDrR4VSwlKLcSeQVjxAL1afD5KRkuZJnUv7bUhS361vkG IjNrQX30EisDAz+X7tMn3ndBf9vVatwFT4+c3yaxlQRor1WofhDfT88HPiyB4QqQ Kdx2EHgbQxJp4vkzhp4/OXlTfkihsMEn8egzZuphdPEQ9Y+Jdwg= =aN/V -----END PGP SIGNATURE----- Merge tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "For this merge window we're splitting BPF pull request into three for higher visibility: main changes, res_spin_lock, try_alloc_pages. These are the main BPF changes: - Add DFA-based live registers analysis to improve verification of programs with loops (Eduard Zingerman) - Introduce load_acquire and store_release BPF instructions and add x86, arm64 JIT support (Peilin Ye) - Fix loop detection logic in the verifier (Eduard Zingerman) - Drop unnecesary lock in bpf_map_inc_not_zero() (Eric Dumazet) - Add kfunc for populating cpumask bits (Emil Tsalapatis) - Convert various shell based tests to selftests/bpf/test_progs format (Bastien Curutchet) - Allow passing referenced kptrs into struct_ops callbacks (Amery Hung) - Add a flag to LSM bpf hook to facilitate bpf program signing (Blaise Boscaccy) - Track arena arguments in kfuncs (Ihor Solodrai) - Add copy_remote_vm_str() helper for reading strings from remote VM and bpf_copy_from_user_task_str() kfunc (Jordan Rome) - Add support for timed may_goto instruction (Kumar Kartikeya Dwivedi) - Allow bpf_get_netns_cookie() int cgroup_skb programs (Mahe Tardy) - Reduce bpf_cgrp_storage_busy false positives when accessing cgroup local storage (Martin KaFai Lau) - Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko) - Allow retrieving BTF data with BTF token (Mykyta Yatsenko) - Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix (Song Liu) - Reject attaching programs to noreturn functions (Yafang Shao) - Introduce pre-order traversal of cgroup bpf programs (Yonghong Song)" * tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits) selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid bpf: Fix out-of-bounds read in check_atomic_load/store() libbpf: Add namespace for errstr making it libbpf_errstr bpf: Add struct_ops context information to struct bpf_prog_aux selftests/bpf: Sanitize pointer prior fclose() selftests/bpf: Migrate test_xdp_vlan.sh into test_progs selftests/bpf: test_xdp_vlan: Rename BPF sections bpf: clarify a misleading verifier error message selftests/bpf: Add selftest for attaching fexit to __noreturn functions bpf: Reject attaching fexit/fmod_ret to __noreturn functions bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage bpf: Make perf_event_read_output accessible in all program types. bpftool: Using the right format specifiers bpftool: Add -Wformat-signedness flag to detect format errors selftests/bpf: Test freplace from user namespace libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID bpf: Return prog btf_id without capable check bpf: BPF token support for BPF_BTF_GET_FD_BY_ID bpf, x86: Fix objtool warning for timed may_goto bpf: Check map->record at the beginning of check_and_free_fields() ...	2025-03-30 12:43:03 -07:00
Linus Torvalds	e5e0e6bebe	This update includes the following changes: API: - Remove legacy compression interface. - Improve scatterwalk API. - Add request chaining to ahash and acomp. - Add virtual address support to ahash and acomp. - Add folio support to acomp. - Remove NULL dst support from acomp. Algorithms: - Library options are fuly hidden (selected by kernel users only). - Add Kerberos5 algorithms. - Add VAES-based ctr(aes) on x86. - Ensure LZO respects output buffer length on compression. - Remove obsolete SIMD fallback code path from arm/ghash-ce. Drivers: - Add support for PCI device 0x1134 in ccp. - Add support for rk3588's standalone TRNG in rockchip. - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93. - Fix bugs in tegra uncovered by multi-threaded self-test. - Fix corner cases in hisilicon/sec2. Others: - Add SG_MITER_LOCAL to sg miter. - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmfiQ9kACgkQxycdCkmx i6fFZg/9GWjC1FLEV66vNlYAIzFGwzwWdFGyQzXyP235Cphhm4qt9gx7P91N6Lvc pplVjNEeZHoP8lMw+AIeGc2cRhIwsvn8C+HA3tCBOoC1qSe8T9t7KHAgiRGd/0iz UrzVBFLYlR9i4tc0T5peyQwSctv8DfjWzduTmI3Ts8i7OQcfeVVgj3sGfWam7kjF 1GJWIQH7aPzT8cwFtk8gAK1insuPPZelT1Ppl9kUeZe0XUibrP7Gb5G9simxXAyi B+nLCaJYS6Hc1f47cfR/qyZSeYQN35KTVrEoKb1pTYXfEtMv6W9fIvQVLJRYsqpH RUBdDJUseE+WckR6glX9USrh+Fv9d+HfsTXh1fhpApKU5sQJ7pDbUm4ge8p6htNG MIszbJPdqajYveRLuPUjFlUXaqomos8eT6BZA+RLHm1cogzEOm+5bjspbfRNAVPj x9KiDu5lXNiFj02v/MkLKUe3bnGIyVQnZNi7Rn0Rpxjv95tIjVpksZWMPJarxUC6 5zdyM2I5X0Z9+teBpbfWyqfzSbAs/KpzV8S/xNvWDUT6NlpYGBeNXrCDTXcwJLAh PRW0w1EJUwsZbPi8GEh5jNzo/YK1cGsUKrihKv7YgqSSopMLI8e/WVr8nKZMVDFA O+6F6ec5lR7KsOIMGUqrBGFU1ccAeaLLvLK3H5J8//gMMg82Uik= =aQNt -----END PGP SIGNATURE----- Merge tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Remove legacy compression interface - Improve scatterwalk API - Add request chaining to ahash and acomp - Add virtual address support to ahash and acomp - Add folio support to acomp - Remove NULL dst support from acomp Algorithms: - Library options are fuly hidden (selected by kernel users only) - Add Kerberos5 algorithms - Add VAES-based ctr(aes) on x86 - Ensure LZO respects output buffer length on compression - Remove obsolete SIMD fallback code path from arm/ghash-ce Drivers: - Add support for PCI device 0x1134 in ccp - Add support for rk3588's standalone TRNG in rockchip - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93 - Fix bugs in tegra uncovered by multi-threaded self-test - Fix corner cases in hisilicon/sec2 Others: - Add SG_MITER_LOCAL to sg miter - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp" * tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits) crypto: testmgr - Add multibuffer acomp testing crypto: acomp - Fix synchronous acomp chaining fallback crypto: testmgr - Add multibuffer hash testing crypto: hash - Fix synchronous ahash chaining fallback crypto: arm/ghash-ce - Remove SIMD fallback code path crypto: essiv - Replace memcpy() + NUL-termination with strscpy() crypto: api - Call crypto_alg_put in crypto_unregister_alg crypto: scompress - Fix incorrect stream freeing crypto: lib/chacha - remove unused arch-specific init support crypto: remove obsolete 'comp' compression API crypto: compress_null - drop obsolete 'comp' implementation crypto: cavium/zip - drop obsolete 'comp' implementation crypto: zstd - drop obsolete 'comp' implementation crypto: lzo - drop obsolete 'comp' implementation crypto: lzo-rle - drop obsolete 'comp' implementation crypto: lz4hc - drop obsolete 'comp' implementation crypto: lz4 - drop obsolete 'comp' implementation crypto: deflate - drop obsolete 'comp' implementation crypto: 842 - drop obsolete 'comp' implementation crypto: nx - Migrate to scomp API ...	2025-03-29 10:01:55 -07:00
Linus Torvalds	7288511606	Landlock update for v6.15-rc1 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ+bGgBAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSKmgBAICZsmQTuKMHIXdB7kwA+BX5k++SZcyA+qHN 0hrJTSMsAP0Uv6NpiPT4CTduqBMRbuMwNhujBczRiok6yaHDbC8eCw== =K8XL -----END PGP SIGNATURE----- Merge tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This brings two main changes to Landlock: - A signal scoping fix with a new interface for user space to know if it is compatible with the running kernel. - Audit support to give visibility on why access requests are denied, including the origin of the security policy, missing access rights, and description of object(s). This was designed to limit log spam as much as possible while still alerting about unexpected blocked access. With these changes come new and improved documentation, and a lot of new tests" * tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits) landlock: Add audit documentation selftests/landlock: Add audit tests for network selftests/landlock: Add audit tests for filesystem selftests/landlock: Add audit tests for abstract UNIX socket scoping selftests/landlock: Add audit tests for ptrace selftests/landlock: Test audit with restrict flags selftests/landlock: Add tests for audit flags and domain IDs selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags selftests/landlock: Add test for invalid ruleset file descriptor samples/landlock: Enable users to log sandbox denials landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF landlock: Add LANDLOCK_RESTRICT_SELF_LOG__EXEC_ flags landlock: Log scoped denials landlock: Log TCP bind and connect denials landlock: Log truncate and IOCTL denials landlock: Factor out IOCTL hooks landlock: Log file-related denials landlock: Log mount-related denials landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials ...	2025-03-28 12:37:13 -07:00
Linus Torvalds	78fb88eca6	Capabilities update for 6.15 This branch contains one patch: capability: Remove unused has_capability This removes a helper function whose last user (smack) stopped using it in 2018. This has been in linux-next for most of the the last cycle with no apparent issues. It is available at: git@git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git #caps-20250327 on top of commit `2014c95afe` (tag: v6.14-rc1) -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmfmHIcACgkQNXDaFycK ziRQFggAl1fQCS/LNhFY7SAmkv8S6yJemYt8hNvRjlOwwMB2aZbFFfZeVGdO73od mkQAoZA0ppPT8aXH2pbfcCsKI2UFQQ/cm6Jpl4wCEZmKr2ZTxcH5zCRbUQeiR/7t QvcWVXCC47BnI9ab+OiCCaCw5HtAuweax6IGjFTregjUAEcBk8aoF9PJMvlT0e6j WQqWegPVS1GSiQqWGxPzdfnUOSkGuak9setg53pCTQE6UjMIK7uVxfSEJpe3pKvH ZvRvkif5UJrdUAr9MKMvf/gkKEl5xpAoTRdoaiWdzZ/arZ77HD32qoi/o3s4P+7a 7ZHvAc+Xjm0AIO/4JrjzFw0zH/svSg== =rbE0 -----END PGP SIGNATURE----- Merge tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux Pull capabilities update from Serge Hallyn: "This contains just one patch that removes a helper function whose last user (smack) stopped using it in 2018" * tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux: capability: Remove unused has_capability	2025-03-28 12:09:33 -07:00
Linus Torvalds	a2d4f473df	integrity-v6.15 -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ+WAJBQcem9oYXJAbGlu dXguaWJtLmNvbQAKCRDLwZzRsCrn5QxeAP4/XqKc8OX5tKnySy5xe6r5ghYLIQLR zcw5lUbKEFrAewEAvvdciA5fLBRHE4xbMRtu30Y1Rxiqq1XY+3ilcAKjpws= =B+UO -----END PGP SIGNATURE----- Merge tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull ima updates from Mimi Zohar: "Two performance improvements, which minimize the number of integrity violations" * tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: limit the number of ToMToU integrity violations ima: limit the number of open-writers integrity violations	2025-03-28 12:06:58 -07:00
Linus Torvalds	f174ac5ba2	ipe/stable-6.15 PR 20250324 -----BEGIN PGP SIGNATURE----- iIcEABYIAC8WIQQzmBmZPBN6m/hUJmnyomI6a/yO7QUCZ+HCcBEcd3VmYW5Aa2Vy bmVsLm9yZwAKCRDyomI6a/yO7SwKAP9AOYDTVMThFW+lWAuEbfATW+Tf+QRxhjfq IIJ5F8b2bgD+PrMveqeKRwLnho/rshgw7NQLwVLOFM/JgnDvP0JavQg= =2Vtz -----END PGP SIGNATURE----- Merge tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull ipe update from Fan Wu: "This contains just one commit from Randy Dunlap, which fixes kernel-doc warnings in the IPE subsystem" * tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: ipe: policy_fs: fix kernel-doc warnings	2025-03-28 12:00:40 -07:00
Mimi Zohar	a414016218	ima: limit the number of ToMToU integrity violations Each time a file in policy, that is already opened for read, is opened for write, a Time-of-Measure-Time-of-Use (ToMToU) integrity violation audit message is emitted and a violation record is added to the IMA measurement list. This occurs even if a ToMToU violation has already been recorded. Limit the number of ToMToU integrity violations per file open for read. Note: The IMA_MAY_EMIT_TOMTOU atomic flag must be set from the reader side based on policy. This may result in a per file open for read ToMToU violation. Since IMA_MUST_MEASURE is only used for violations, rename the atomic IMA_MUST_MEASURE flag to IMA_MAY_EMIT_TOMTOU. Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6 Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Tested-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-03-27 12:40:12 -04:00
Mimi Zohar	5b3cd80115	ima: limit the number of open-writers integrity violations Each time a file in policy, that is already opened for write, is opened for read, an open-writers integrity violation audit message is emitted and a violation record is added to the IMA measurement list. This occurs even if an open-writers violation has already been recorded. Limit the number of open-writers integrity violations for an existing file open for write to one. After the existing file open for write closes (__fput), subsequent open-writers integrity violations may be emitted. Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6 Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Tested-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-03-27 12:35:51 -04:00
Linus Torvalds	592329e5e9	Summary * Move vm_table members out of kernel/sysctl.c All vm_table array members have moved to their respective subsystems leading to the removal of vm_table from kernel/sysctl.c. This increases modularity by placing the ctl_tables closer to where they are actually used and at the same time reducing the chances of merge conflicts in kernel/sysctl.c. * ctl_table range fixes Replace the proc_handler function that checks variable ranges in coredump_sysctls and vdso_table with the one that actually uses the extra{1,2} pointers as min/max values. This tightens the range of the values that users can pass into the kernel effectively preventing {under,over}flows. * Misc fixes Correct grammar errors and typos in test messages. Update sysctl files in MAINTAINERS. Constified and removed array size in declaration for alignment_tbl * Testing - These have all been in linux-next for at least 1 month - They have gone through 0-day - Ran all these through sysctl selftests in x86_64 -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmfhV8EACgkQupfNUreW QU/udAv/VCXGkndQsJ5biXpXYFnokX0gIEaYzzHiqrFycZqr8ys0/wWzc+ar1LjF Jvanl2uKB0mUviLKt7Gk0+Hri+PJlYIrbx+5K5eo2wsKUUxFykqLLm59y/orPODl gyPQjKNpHJb7COsnEc3Lrq/fvol4NPHlcBPXG8NwehccTeBHZ1ninfo+pSnxh3o8 kI3GSLLxD4K9AgBl5QuVWH4gU7o//u7lUkKzy03NW+2jmuRv3dRcYF7IdgMINNee AeXnygdSBxLzECBvmkfNdyg+AmL8hdsmzbsIh7UuJDvxLlQOInVLZa+sXBotCOIc TImCrr1Ws1OuGrD0kpH+21tJvc8pNFWt61QlulObQdrLndWHdZEGyGOusLpXTwbn jIWZmMvzk1foSwdgzwPFzUqPEpW3FrBVDo4Z4kenBDrCp56QTX7hGRvkNYJNKvot Ue+i8BeHR/Gm/p+UMqgsSTOaNJXTqZhFqwJQVzxU/9LN/vkS0On6fbjgBd5X6Pn+ a5dlc9gy =0bcX -----END PGP SIGNATURE----- Merge tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move vm_table members out of kernel/sysctl.c All vm_table array members have moved to their respective subsystems leading to the removal of vm_table from kernel/sysctl.c. This increases modularity by placing the ctl_tables closer to where they are actually used and at the same time reducing the chances of merge conflicts in kernel/sysctl.c. - ctl_table range fixes Replace the proc_handler function that checks variable ranges in coredump_sysctls and vdso_table with the one that actually uses the extra{1,2} pointers as min/max values. This tightens the range of the values that users can pass into the kernel effectively preventing {under,over}flows. - Misc fixes Correct grammar errors and typos in test messages. Update sysctl files in MAINTAINERS. Constified and removed array size in declaration for alignment_tbl * tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits) selftests/sysctl: fix wording of help messages selftests: fix spelling/grammar errors in sysctl/sysctl.sh MAINTAINERS: Update sysctl file list in MAINTAINERS sysctl: Fix underflow value setting risk in vm_table coredump: Fixes core_pipe_limit sysctl proc_handler sysctl: remove unneeded include sysctl: remove the vm_table sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c fs: dcache: move the sysctl to fs/dcache.c sunrpc: simplify rpcauth_cache_shrink_count() fs: drop_caches: move sysctl to fs/drop_caches.c fs: fs-writeback: move sysctl to fs/fs-writeback.c mm: nommu: move sysctl to mm/nommu.c security: min_addr: move sysctl to security/min_addr.c mm: mmap: move sysctl to mm/mmap.c mm: util: move sysctls to mm/util.c mm: vmscan: move vmscan sysctls to mm/vmscan.c mm: swap: move sysctl to mm/swap.c mm: filemap: move sysctl to mm/filemap.c ...	2025-03-26 21:02:05 -07:00
Mickaël Salaün	ead9079f75	landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF for the case of sandboxer tools, init systems, or runtime containers launching programs sandboxing themselves in an inconsistent way. Setting this flag should only depends on runtime configuration (i.e. not hardcoded). We don't create a new ruleset's option because this should not be part of the security policy: only the task that enforces the policy (not the one that create it) knows if itself or its children may request denied actions. This is the first and only flag that can be set without actually restricting the caller (i.e. without providing a ruleset). Extend struct landlock_cred_security with a u8 log_subdomains_off. struct landlock_file_security is still 16 bytes. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Closes: https://github.com/landlock-lsm/linux/issues/3 Link: https://lore.kernel.org/r/20250320190717.2287696-19-mic@digikod.net [mic: Fix comment] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:43 +01:00
Mickaël Salaün	12bfcda73a	landlock: Add LANDLOCK_RESTRICT_SELF_LOG__EXEC_ flags Most of the time we want to log denied access because they should not happen and such information helps diagnose issues. However, when sandboxing processes that we know will try to access denied resources (e.g. unknown, bogus, or malicious binary), we might want to not log related access requests that might fill up logs. By default, denied requests are logged until the task call execve(2). If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied requests will not be logged for the same executed file. If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied requests from after an execve(2) call will be logged. The rationale is that a program should know its own behavior, but not necessarily the behavior of other programs. Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific Landlock domain, it makes it possible to selectively mask some access requests that would be logged by a parent domain, which might be handy for unprivileged processes to limit logs. However, system administrators should still use the audit filtering mechanism. There is intentionally no audit nor sysctl configuration to re-enable these logs. This is delegated to the user space program. Increment the Landlock ABI version to reflect this interface change. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-18-mic@digikod.net [mic: Rename variables and fix __maybe_unused] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:42 +01:00
Mickaël Salaün	1176a15b5e	landlock: Log scoped denials Add audit support for unix_stream_connect, unix_may_send, task_kill, and file_send_sigiotask hooks. The related blockers are: - scope.abstract_unix_socket - scope.signal Audit event sample for abstract unix socket: type=LANDLOCK_DENY msg=audit(1729738800.268:30): domain=195ba459b blockers=scope.abstract_unix_socket path=00666F6F Audit event sample for signal: type=LANDLOCK_DENY msg=audit(1729738800.291:31): domain=195ba459b blockers=scope.signal opid=1 ocomm="systemd" Refactor and simplify error handling in LSM hooks. Extend struct landlock_file_security with fown_layer and use it to log the blocking domain. The struct aligned size is still 16 bytes. Cc: Günther Noack <gnoack@google.com> Cc: Tahera Fahimi <fahimitahera@gmail.com> Link: https://lore.kernel.org/r/20250320190717.2287696-17-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:42 +01:00
Mickaël Salaün	9f74411a40	landlock: Log TCP bind and connect denials Add audit support to socket_bind and socket_connect hooks. The related blockers are: - net.bind_tcp - net.connect_tcp Audit event sample: type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=net.connect_tcp daddr=127.0.0.1 dest=80 Cc: Günther Noack <gnoack@google.com> Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com> Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Link: https://lore.kernel.org/r/20250320190717.2287696-16-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:41 +01:00
Mickaël Salaün	20fd295494	landlock: Log truncate and IOCTL denials Add audit support to the file_truncate and file_ioctl hooks. Add a deny_masks_t type and related helpers to store the domain's layer level per optional access rights (i.e. LANDLOCK_ACCESS_FS_TRUNCATE and LANDLOCK_ACCESS_FS_IOCTL_DEV) when opening a file, which cannot be inferred later. In practice, the landlock_file_security aligned blob size is still 16 bytes because this new one-byte deny_masks field follows the existing two-bytes allowed_access field and precede the packed fown_subject. Implementing deny_masks_t with a bitfield instead of a struct enables a generic implementation to store and extract layer levels. Add KUnit tests to check the identification of a layer level from a deny_masks_t, and the computation of a deny_masks_t from an access right with its layer level or a layer_mask_t array. Audit event sample: type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.ioctl_dev path="/dev/tty" dev="devtmpfs" ino=9 ioctlcmd=0x5401 Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-15-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:41 +01:00
Mickaël Salaün	e120b3c293	landlock: Factor out IOCTL hooks Compat and non-compat IOCTL hooks are almost the same, except to compare the IOCTL command. Factor out these two IOCTL hooks to highlight the difference and minimize audit changes (see next commit). Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-14-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:40 +01:00
Mickaël Salaün	2fc80c69df	landlock: Log file-related denials Add audit support for path_mkdir, path_mknod, path_symlink, path_unlink, path_rmdir, path_truncate, path_link, path_rename, and file_open hooks. The dedicated blockers are: - fs.execute - fs.write_file - fs.read_file - fs.read_dir - fs.remove_dir - fs.remove_file - fs.make_char - fs.make_dir - fs.make_reg - fs.make_sock - fs.make_fifo - fs.make_block - fs.make_sym - fs.refer - fs.truncate - fs.ioctl_dev Audit event sample for a denied link action: type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.refer path="/usr/bin" dev="vda2" ino=351 type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.make_reg,fs.refer path="/usr/local" dev="vda2" ino=365 We could pack blocker names (e.g. "fs:make_reg,refer") but that would increase complexity for the kernel and log parsers. Moreover, this could not handle blockers of different classes (e.g. fs and net). Make it simple and flexible instead. Add KUnit tests to check the identification from a layer_mask_t array of the first layer level denying such request. Cc: Günther Noack <gnoack@google.com> Depends-on: `058518c209` ("landlock: Align partial refer access checks with final ones") Depends-on: `d617f0d72d` ("landlock: Optimize file path walks and prepare for audit support") Link: https://lore.kernel.org/r/20250320190717.2287696-13-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:39 +01:00
Mickaël Salaün	c56f649646	landlock: Log mount-related denials Add audit support for sb_mount, move_mount, sb_umount, sb_remount, and sb_pivot_root hooks. The new related blocker is "fs.change_topology". Audit event sample: type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.change_topology name="/" dev="tmpfs" ino=1 Remove landlock_get_applicable_domain() and get_current_fs_domain() which are now fully replaced with landlock_get_applicable_subject(). Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-12-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:39 +01:00
Mickaël Salaün	1d636984e0	landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status Asynchronously log domain information when it first denies an access. This minimize the amount of generated logs, which makes it possible to always log denials for the current execution since they should not happen. These records are identified with the new AUDIT_LANDLOCK_DOMAIN type. The AUDIT_LANDLOCK_DOMAIN message contains: - the "domain" ID which is described; - the "status" which can either be "allocated" or "deallocated"; - the "mode" which is for now only "enforcing"; - for the "allocated" status, a minimal set of properties to easily identify the task that loaded the domain's policy with landlock_restrict_self(2): "pid", "uid", executable path ("exe"), and command line ("comm"); - for the "deallocated" state, the number of "denials" accounted to this domain, which is at least 1. This requires each domain to save these task properties at creation time in the new struct landlock_details. A reference to the PID is kept for the lifetime of the domain to avoid race conditions when investigating the related task. The executable path is resolved and stored to not keep a reference to the filesystem and block related actions. All these metadata are stored for the lifetime of the related domain and should then be minimal. The required memory is not accounted to the task calling landlock_restrict_self(2) contrary to most other Landlock allocations (see related comment). The AUDIT_LANDLOCK_DOMAIN record follows the first AUDIT_LANDLOCK_ACCESS record for the same domain, which is always followed by AUDIT_SYSCALL and AUDIT_PROCTITLE. This is in line with the audit logic to first record the cause of an event, and then add context with other types of record. Audit event sample for a first denial: type=LANDLOCK_ACCESS msg=audit(1732186800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd" type=LANDLOCK_DOMAIN msg=audit(1732186800.349:44): domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer" type=SYSCALL msg=audit(1732186800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0 Audit event sample for a following denial: type=LANDLOCK_ACCESS msg=audit(1732186800.372:45): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd" type=SYSCALL msg=audit(1732186800.372:45): arch=c000003e syscall=101 success=no [...] pid=300 auid=0 Log domain deletion with the "deallocated" state when a domain was previously logged. This makes it possible for log parsers to free potential resources when a domain ID will never show again. The number of denied access requests is useful to easily check how many access requests a domain blocked and potentially if some of them are missing in logs because of audit rate limiting, audit rules, or Landlock log configuration flags (see following commit). Audit event sample for a deletion of a domain that denied something: type=LANDLOCK_DOMAIN msg=audit(1732186800.393:46): domain=195ba459b status=deallocated denials=2 Cc: Günther Noack <gnoack@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-11-mic@digikod.net [mic: Update comment and GFP flag for landlock_log_drop_domain()] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:38 +01:00
Mickaël Salaün	33e65b0d3a	landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials Add a new AUDIT_LANDLOCK_ACCESS record type dedicated to an access request denied by a Landlock domain. AUDIT_LANDLOCK_ACCESS indicates that something unexpected happened. For now, only denied access are logged, which means that any AUDIT_LANDLOCK_ACCESS record is always followed by a SYSCALL record with "success=no". However, log parsers should check this syscall property because this is the only sign that a request was denied. Indeed, we could have "success=yes" if Landlock would support a "permissive" mode. We could also add a new field to AUDIT_LANDLOCK_DOMAIN for this mode (see following commit). By default, the only logged access requests are those coming from the same executed program that enforced the Landlock restriction on itself. In other words, no audit record are created for a task after it called execve(2). This is required to avoid log spam because programs may only be aware of their own restrictions, but not the inherited ones. Following commits will allow to conditionally generate AUDIT_LANDLOCK_ACCESS records according to dedicated landlock_restrict_self(2)'s flags. The AUDIT_LANDLOCK_ACCESS message contains: - the "domain" ID restricting the action on an object, - the "blockers" that are missing to allow the requested access, - a set of fields identifying the related object (e.g. task identified with "opid" and "ocomm"). The blockers are implicit restrictions (e.g. ptrace), or explicit access rights (e.g. filesystem), or explicit scopes (e.g. signal). This field contains a list of at least one element, each separated with a comma. The initial blocker is "ptrace", which describe all implicit Landlock restrictions related to ptrace (e.g. deny tracing of tasks outside a sandbox). Add audit support to ptrace_access_check and ptrace_traceme hooks. For the ptrace_access_check case, we log the current/parent domain and the child task. For the ptrace_traceme case, we log the parent domain and the current/child task. Indeed, the requester and the target are the current task, but the action would be performed by the parent task. Audit event sample: type=LANDLOCK_ACCESS msg=audit(1729738800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd" type=SYSCALL msg=audit(1729738800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0 A following commit adds user documentation. Add KUnit tests to check reading of domain ID relative to layer level. The quick return for non-landlocked tasks is moved from task_ptrace() to each LSM hooks. It is not useful to inline the audit_enabled check because other computation are performed by landlock_log_denial(). Use scoped guards for RCU read-side critical sections. Cc: Günther Noack <gnoack@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-10-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:38 +01:00
Mickaël Salaün	14f6c14e9f	landlock: Identify domain execution crossing Extend struct landlock_cred_security with a domain_exec bitmask to identify which Landlock domain were created by the current task's bprm. The whole bitmask is reset on each execve(2) call. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-9-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:37 +01:00
Mickaël Salaün	79625f1b3a	landlock: Prepare to use credential instead of domain for fowner This cosmetic change is needed for audit support, specifically to be able to filter according to cross-execution boundaries. struct landlock_file_security's size stay the same for now but it will increase with struct landlock_cred_security's size. Only save Landlock domain in hook_file_set_fowner() if the current domain has LANDLOCK_SCOPE_SIGNAL, which was previously done for each hook_file_send_sigiotask() calls. This should improve a bit performance. Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope variable. Use scoped guards for RCU read-side critical sections. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-8-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:37 +01:00
Mickaël Salaün	8d20efa9dc	landlock: Prepare to use credential instead of domain for scope This cosmetic change that is needed for audit support, specifically to be able to filter according to cross-execution boundaries. Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope variable. Use scoped guards for RCU read-side critical sections. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-7-mic@digikod.net [mic: Update headers] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:36 +01:00
Mickaël Salaün	93f33f0cb2	landlock: Prepare to use credential instead of domain for network This cosmetic change that is needed for audit support, specifically to be able to filter according to cross-execution boundaries. Optimize current_check_access_socket() to only handle the access request. Remove explicit domain->num_layers check which is now part of the landlock_get_applicable_subject() call. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-6-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:35 +01:00
Mickaël Salaün	ae2483a260	landlock: Prepare to use credential instead of domain for filesystem This cosmetic change is needed for audit support, specifically to be able to filter according to cross-execution boundaries. Add landlock_get_applicable_subject(), mainly a copy of landlock_get_applicable_domain(), which will fully replace it in a following commit. Optimize current_check_access_path() to only handle the access request. Partially replace get_current_fs_domain() with explicit calls to landlock_get_applicable_subject(). The remaining ones will follow with more changes. Remove explicit domain->num_layers check which is now part of the landlock_get_applicable_subject() call. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-5-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:35 +01:00
Mickaël Salaün	5b95b329be	landlock: Move domain hierarchy management Create a new domain.h file containing the struct landlock_hierarchy definition and helpers. This type will grow with audit support. This also prepares for a new domain type. Cc: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250320190717.2287696-4-mic@digikod.net Reviewed-by: Günther Noack <gnoack3000@gmail.com> Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:34 +01:00
Mickaël Salaün	d9d2a68ed4	landlock: Add unique ID generator Landlock IDs can be generated to uniquely identify Landlock objects. For now, only Landlock domains get an ID at creation time. These IDs map to immutable domain hierarchies. Landlock IDs have important properties: - They are unique during the lifetime of the running system thanks to the 64-bit values: at worse, 2^60 - 2*2^32 useful IDs. - They are always greater than 2^32 and must then be stored in 64-bit integer types. - The initial ID (at boot time) is randomly picked between 2^32 and 2^33, which limits collisions in logs across different boots. - IDs are sequential, which enables users to order them. - IDs may not be consecutive but increase with a random 2^4 step, which limits side channels. Such IDs can be exposed to unprivileged processes, even if it is not the case with this audit patch series. The domain IDs will be useful for user space to identify sandboxes and get their properties. These Landlock IDs are more secure that other absolute kernel IDs such as pipe's inodes which rely on a shared global counter. For checkpoint/restore features (i.e. CRIU), we could easily implement a privileged interface (e.g. sysfs) to set the next ID counter. IDR/IDA are not used because we only need a bijection from Landlock objects to Landlock IDs, and we must not recycle IDs. This enables us to identify all Landlock objects during the lifetime of the system (e.g. in logs), but not to access an object from an ID nor know if an ID is assigned. Using a counter is simpler, it scales (i.e. avoids growing memory footprint), and it does not require locking. We'll use proper file descriptors (with IDs used as inode numbers) to access Landlock objects. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:34 +01:00
Mickaël Salaün	9b08a16637	lsm: Add audit_log_lsm_data() helper Extract code from dump_common_audit_data() into the audit_log_lsm_data() helper. This helps reuse common LSM audit data while not abusing AUDIT_AVC records because of the common_lsm_audit() helper. Depends-on: `7ccbe076d9` ("lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set") Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: James Morris <jmorris@namei.org> Cc: Serge E. Hallyn <serge@hallyn.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250320190717.2287696-2-mic@digikod.net Reviewed-by: Günther Noack <gnoack3000@gmail.com> Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:33 +01:00
Mickaël Salaün	18eb75f3af	landlock: Always allow signals between threads of the same process Because Linux credentials are managed per thread, user space relies on some hack to synchronize credential update across threads from the same process. This is required by the Native POSIX Threads Library and implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to synchronize threads. See nptl(7) and libpsx(3). Furthermore, some runtimes like Go do not enable developers to have control over threads [1]. To avoid potential issues, and because threads are not security boundaries, let's relax the Landlock (optional) signal scoping to always allow signals sent between threads of the same process. This exception is similar to the __ptrace_may_access() one. hook_file_set_fowner() now checks if the target task is part of the same process as the caller. If this is the case, then the related signal triggered by the socket will always be allowed. Scoping of abstract UNIX sockets is not changed because kernel objects (e.g. sockets) should be tied to their creator's domain at creation time. Note that creating one Landlock domain per thread puts each of these threads (and their future children) in their own scope, which is probably not what users expect, especially in Go where we do not control threads. However, being able to drop permissions on all threads should not be restricted by signal scoping. We are working on a way to make it possible to atomically restrict all threads of a process with the same domain [2]. Add erratum for signal scoping. Closes: https://github.com/landlock-lsm/go-landlock/issues/36 Fixes: `54a6e6bbf3` ("landlock: Add signal scoping") Fixes: `c899496501` ("selftests/landlock: Test signal scoping for threads") Depends-on: `26f204380a` ("fs: Fix file_set_fowner LSM hook inconsistencies") Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1] Link: https://github.com/landlock-lsm/linux/issues/2 [2] Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Tahera Fahimi <fahimitahera@gmail.com> Cc: stable@vger.kernel.org Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net [mic: Add extra pointer check and RCU guard, and ease backport] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-26 13:59:29 +01:00
Linus Torvalds	61af143fbe	Smack patches for v6.15 -----BEGIN PGP SIGNATURE----- iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmfi1Z4XHGNhc2V5QHNj aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGPlRAAua8w+rpw06Njdi8LNDytiMil Z8a1dbF2yupLclaydLuwxOvJbdjNcCzEVDDFNYEfh6iBiXsaoqr1FO13mmdcjon7 NXPscvhBWbb0dXvPEk74upWk2HRUIS0BI3tofY1YaGJLdWZRM2z6qvHcb4jNkM89 o+wzzxme7HNwez4SpB+vmm8WZ4AH3rX7q0ihf3XOpplvxeKwob3ZCu+nNQe0AXMI FjMXPn148O96UKJgYJFMV1rLzWmYXzrrDBvzYS79walyti9ct1SYKRY4Zs80MiJK 0mQGDpg60PFUldcsnBD9Pp5nifcLKg5nNUWb15AhRw1sR4wnKl4DxCA4032NYB/x wuVW2rsRjPDuzD6XGS16FZDv86qSyxPtJmvk3qKJmiIQVRc1xhJWAH57A+UzgnLx UGvIcIoJjifTX4EUZXdCdH44RBBs/tjCbrBKROPnKBPbkLHNxAzW/Z65hLvxHgg8 f0vFuSNAL5dCNRdspWgE5BayM0RoW/HZbY15/O+oc7hDgs2ORgKPtayt4uJZr1km PhxG3/9BWRjga44RRwIGCsxzqVdO0l5FYYqh+AYnzTOz8S+kncmcPHbnWOYVU0aH 9u5jE46TCmRxVnEaco+1dhGBUZaFgjXpPv+PxT+LZgTVHLbhBuONCEIot8ejFOKc TomnIzxqMacRAlURNXM= =SUpV -----END PGP SIGNATURE----- Merge tag 'Smack-for-6.15' of https://github.com/cschaufler/smack-next Pull smack updates from Casey Schaufler: "This is a larger set of patches than usual, consisting of a set of build clean-ups, a rework of error handling in setting up CIPSO label specification and a bug fix in network labeling" * tag 'Smack-for-6.15' of https://github.com/cschaufler/smack-next: smack: recognize ipv4 CIPSO w/o categories smack: Revert "smackfs: Added check catlen" smack: remove /smack/logging if audit is not configured smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label smack: dont compile ipv6 code unless ipv6 is configured Smack: fix typos and spelling errors	2025-03-25 16:04:11 -07:00
Linus Torvalds	59c017ce9e	selinux/stable-6.15 PR 20250323 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmfgWewUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNTXA/9F7Fo5ov6mP15jChSSZWuVPBdi1gD y8Q8sCbu/KeCRO1Qb4QTv8ZCVGkP+EDK47IIvLXj27Aa19y1m3E4r1mddCSBQ3eu jSqR/kOXf3j8AWPP2m4qYK/EJvNqNd/V67PkktFal+95crcmz3IDV68qWuNafdSc r8VuprrEw+NSuKhPh4e2tM0hvOmAzePuvI6gGPb9z7Fj807/qfSOteAkvYpJ1y+d vZzHLeu3FRExxu4wKZZymGpT2+5Xl/MrjRJUtKuJdxXW8FphPUr5cfHDIP0Ae97w J70RGr0Oy02dQnCtAMkOGi7lpS1S1r0Qnhr+eloQQvG7J2eRRPZqGrmaU69qopAo JY/Xc7/r29pGwGnXtiHKZ4ej65mTIN9bmPsHIjjr01hiB/gEUnX2vdVSwVYLxOsF dzCnXb1VBc4mSIJ1Sjst0a6CRNPVA3U/bCfCbvfeyhn6A0XHmJI1PDRbxEXavnki sQIAtLv5M0Pyzyjij+6qHfd8TsUgiH/rtR6st31SnL5iqIWkE9wPMFldg064vHgS 8dECnF7G9ZU/OErJjTQVshJE3fDEJvbQj8YIq7u1gQOZV02G7U3q4R3Aoj3GoSKJ dMjoeG18+yuIevW/OHWtbjp4QMpp2R4xuXaJJlfsB2OaOX6jSS4S5KpYO3eKQ/Jd kNQxuG8VD3tK8jc= =QD7q -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add additional SELinux access controls for kernel file reads/loads The SELinux kernel file read/load access controls were never updated beyond the initial kernel module support, this pull request adds support for firmware, kexec, policies, and x.509 certificates. - Add support for wildcards in network interface names There are a number of userspace tools which auto-generate network interface names using some pattern of <XXXX>-<NN> where <XXXX> is a fixed string, e.g. "podman", and <NN> is a increasing counter. Supporting wildcards in the SELinux policy for network interfaces simplifies the policy associted with these interfaces. - Fix a potential problem in the kernel read file SELinux code SELinux should always check the file label in the security_kernel_read_file() LSM hook, regardless of if the file is being read in chunks. Unfortunately, the existing code only considered the file label on the first chunk; this pull request fixes this problem. There is more detail in the individual commit, but thankfully the existing code didn't expose a bug due to multi-stage reads only taking place in one driver, and that driver loading a file type that isn't targeted by the SELinux policy. - Fix the subshell error handling in the example policy loader Minor fix to SELinux example policy loader in scripts/selinux due to an undesired interaction with subshells and errexit. * tag 'selinux-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: get netif_wildcard policycap from policy instead of cache selinux: support wildcard network interface names selinux: Chain up tool resolving errors in install_policy.sh selinux: add permission checks for loading other kinds of kernel files selinux: always check the file label in selinux_kernel_read_file() selinux: fix spelling error	2025-03-25 15:52:32 -07:00
Linus Torvalds	054570267d	lsm/stable-6.15 PR 20250323 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmfgWgMUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNW5RAAvCDq5gBtY0aTNlULe637EVLSh+t8 PkSzHzu/NlzU6BfjtwSm2fuML8welTGxSwUPxUzMCI91gPdkGeFktefavT3xa+QI BHWROn7fEJ/KmRZvngPeIkgLr5xhF5nBJmc/Jw71qem20zRzNgJnpzMX16d10Phx dxd2xOO1qM3bv6Z9RcIssZRGaN+PHngpWWg+0B69XuaBUso87S6NDyKNn1XPmvoz as96k+Wk/xAZGVEeCbs/+H5rBx6DLg+FfTRa06Oh4BFsqedpkDPxLrTgCJGJkA0H dsK6O/993zvjx0Jn4ZPoJ9n35S82BmkCsz4bGq1xVl6FYUiMcm3/8yO41wllS+w4 j+RlTU/RIdB7n8EKyMMl1hj1stTvt3Bi9F5Cbf7ZEv0snfR00K4KVpi17jnFjUHv kpOiEtXZb/NGQip7UAuUq0PisfqbiO4jJurYHRetDgv1WCy6+C8ufM5t6I+cnvmG VG+dlxcW+rDIn6bLRVuGi9TJRsQ6eox9ipa+qEKNNiOXgftELcgT7m74nAS5m0uv n5rDa221nPXecEB0X7d6YUFk711lly90dbelNeLrmv1w6jl8L1PpS1oBaW+UzGu9 46eGBd6pzu9otvK9WVyDEdotDOCrgH0sd7pTetqDhLJZ7KrGwyyqO2gD/JroUKcC lnxBQwPnat86iI8= =oxfV -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Various minor updates to the LSM Rust bindings Changes include marking trivial Rust bindings as inlines and comment tweaks to better reflect the LSM hooks. - Add LSM/SELinux access controls to io_uring_allowed() Similar to the io_uring_disabled sysctl, add a LSM hook to io_uring_allowed() to enable LSMs a simple way to enforce security policy on the use of io_uring. This pull request includes SELinux support for this new control using the io_uring/allowed permission. - Remove an unused parameter from the security_perf_event_open() hook The perf_event_attr struct parameter was not used by any currently supported LSMs, remove it from the hook. - Add an explicit MAINTAINERS entry for the credentials code We've seen problems in the past where patches to the credentials code sent by non-maintainers would often languish on the lists for multiple months as there was no one explicitly tasked with the responsibility of reviewing and/or merging credentials related code. Considering that most of the code under security/ has a vested interest in ensuring that the credentials code is well maintained, I'm volunteering to look after the credentials code and Serge Hallyn has also volunteered to step up as an official reviewer. I posted the MAINTAINERS update as a RFC to LKML in hopes that someone else would jump up with an "I'll do it!", but beyond Serge it was all crickets. - Update Stephen Smalley's old email address to prevent confusion This includes a corresponding update to the mailmap file. * tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: mailmap: map Stephen Smalley's old email addresses lsm: remove old email address for Stephen Smalley MAINTAINERS: add Serge Hallyn as a credentials reviewer MAINTAINERS: add an explicit credentials entry cred,rust: mark Credential methods inline lsm,rust: reword "destroy" -> "release" in SecurityCtx lsm,rust: mark SecurityCtx methods inline perf: Remove unnecessary parameter of security check lsm: fix a missing security_uring_allowed() prototype io_uring,lsm,selinux: add LSM hooks for io_uring_setup() io_uring: refactor io_uring_allowed()	2025-03-25 15:44:19 -07:00
Linus Torvalds	fc13a78e1f	hardening updates for v6.15-rc1 - loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan Vadivel) - samples/check-exec: Fix script name (Mickaël Salaün) - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov) - lib/string_choices: Sort by function name (R Sundar) - hardening: Allow default HARDENED_USERCOPY to be set at compile time (Mel Gorman) - uaccess: Split out compile-time checks into ucopysize.h - kbuild: clang: Support building UM with SUBARCH=i386 - x86: Enable i386 FORTIFY_SOURCE on Clang 16+ - ubsan/overflow: Rework integer overflow sanitizer option - Add missing __nonstring annotations for callers of memtostr()/strtomem() - Add __must_be_noncstr() and have memtostr()/strtomem() check for it - Introduce __nonstring_array for silencing future GCC 15 warnings -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hGrgAKCRA2KwveOeQk u1WvAQC3ZxFu3b0Omfmht2pPqCltf2UOQNvUx3egjoeXpUaNSgD+Lxr/T4xksy7E jHh7rCYDkruOWs3DHA5JjRQcf0BBLQo= =FTQp -----END PGP SIGNATURE----- Merge tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "As usual, it's scattered changes all over. Patches touching things outside of our traditional areas in the tree have been Acked by maintainers or were trivial changes: - loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan Vadivel) - samples/check-exec: Fix script name (Mickaël Salaün) - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov) - lib/string_choices: Sort by function name (R Sundar) - hardening: Allow default HARDENED_USERCOPY to be set at compile time (Mel Gorman) - uaccess: Split out compile-time checks into ucopysize.h - kbuild: clang: Support building UM with SUBARCH=i386 - x86: Enable i386 FORTIFY_SOURCE on Clang 16+ - ubsan/overflow: Rework integer overflow sanitizer option - Add missing __nonstring annotations for callers of memtostr()/strtomem() - Add __must_be_noncstr() and have memtostr()/strtomem() check for it - Introduce __nonstring_array for silencing future GCC 15 warnings" * tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits) compiler_types: Introduce __nonstring_array hardening: Enable i386 FORTIFY_SOURCE on Clang 16+ x86/build: Remove -ffreestanding on i386 with GCC ubsan/overflow: Enable ignorelist parsing and add type filter ubsan/overflow: Enable pattern exclusions ubsan/overflow: Rework integer overflow sanitizer option to turn on everything samples/check-exec: Fix script name yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl() kbuild: clang: Support building UM with SUBARCH=i386 loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported lib/string_choices: Rearrange functions in sorted order string.h: Validate memtostr()/strtomem() arguments more carefully compiler.h: Introduce __must_be_noncstr() nilfs2: Mark on-disk strings as nonstring uapi: stddef.h: Introduce __kernel_nonstring x86/tdx: Mark message.bytes as nonstring string: kunit: Mark nonstring test strings as __nonstring scsi: qla2xxx: Mark device strings as nonstring scsi: mpt3sas: Mark device strings as nonstring scsi: mpi3mr: Mark device strings as nonstring ...	2025-03-24 15:18:08 -07:00
Randy Dunlap	6df401a2ee	ipe: policy_fs: fix kernel-doc warnings Use the "struct" keyword in kernel-doc when describing struct ipefs_file. Add kernel-doc for the struct members also. Don't use kernel-doc notation for 'policy_subdir'. kernel-doc does not support documentation comments for data definitions. This eliminates multiple kernel-doc warnings: security/ipe/policy_fs.c:21: warning: cannot understand function prototype: 'struct ipefs_file ' security/ipe/policy_fs.c:407: warning: cannot understand function prototype: 'const struct ipefs_file policy_subdir[] = ' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Fan Wu <wufan@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: linux-security-module@vger.kernel.org Signed-off-by: Fan Wu <wufan@kernel.org>	2025-03-24 13:36:00 -07:00
Linus Torvalds	26d8e43079	vfs-6.15-rc1.async.dir -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90rNwAKCRCRxhvAZXjc onBJAP9Z8Ywmlb5KQ1E3HvDmkwyY6yOSyZ9/CmbzrkCJ8ywYkQD/d9/xt0EP/O/q N8YtzXArHWt7u0YbcVpy9WK3F72BdwU= =VJgY -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs async dir updates from Christian Brauner: "This contains cleanups that fell out of the work from async directory handling: - Change kern_path_locked() and user_path_locked_at() to never return a negative dentry. This simplifies the usability of these helpers in various places - Drop d_exact_alias() from the remaining place in NFS where it is still used. This also allows us to drop the d_exact_alias() helper completely - Drop an unnecessary call to fh_update() from nfsd_create_locked() - Change i_op->mkdir() to return a struct dentry Change vfs_mkdir() to return a dentry provided by the filesystems which is hashed and positive. This allows us to reduce the number of cases where the resulting dentry is not positive to very few cases. The code in these places becomes simpler and easier to understand. - Repack DENTRY_* and LOOKUP_* flags" * tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: doc: fix inline emphasis warning VFS: Change vfs_mkdir() to return the dentry. nfs: change mkdir inode_operation to return alternate dentry if needed. fuse: return correct dentry for ->mkdir ceph: return the correct dentry on mkdir hostfs: store inode in dentry after mkdir if possible. Change inode_operations.mkdir to return struct dentry * nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked() nfs/vfs: discard d_exact_alias() VFS: add common error checks to lookup_one_qstr_excl() VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry VFS: repack LOOKUP_ bit flags. VFS: repack DENTRY_ flags.	2025-03-24 10:47:14 -07:00
Linus Torvalds	fd101da676	vfs-6.15-rc1.mount -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90qAwAKCRCRxhvAZXjc on7lAP0akpIsJMWREg9tLwTNTySI1b82uKec0EAgM6T7n/PYhAD/T4zoY8UYU0Pr qCxwTXHUVT6bkNhjREBkfqq9OkPP8w8= =GxeN -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - Mount notifications The day has come where we finally provide a new api to listen for mount topology changes outside of /proc/<pid>/mountinfo. A mount namespace file descriptor can be supplied and registered with fanotify to listen for mount topology changes. Currently notifications for mount, umount and moving mounts are generated. The generated notification record contains the unique mount id of the mount. The listmount() and statmount() api can be used to query detailed information about the mount using the received unique mount id. This allows userspace to figure out exactly how the mount topology changed without having to generating diffs of /proc/<pid>/mountinfo in userspace. - Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount api - Support detached mounts in overlayfs Since last cycle we support specifying overlayfs layers via file descriptors. However, we don't allow detached mounts which means userspace cannot user file descriptors received via open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to attach them to a mount namespace via move_mount() first. This is cumbersome and means they have to undo mounts via umount(). Allow them to directly use detached mounts. - Allow to retrieve idmappings with statmount Currently it isn't possible to figure out what idmapping has been attached to an idmapped mount. Add an extension to statmount() which allows to read the idmapping from the mount. - Allow creating idmapped mounts from mounts that are already idmapped So far it isn't possible to allow the creation of idmapped mounts from already idmapped mounts as this has significant lifetime implications. Make the creation of idmapped mounts atomic by allow to pass struct mount_attr together with the open_tree_attr() system call allowing to solve these issues without complicating VFS lookup in any way. The system call has in general the benefit that creating a detached mount and applying mount attributes to it becomes an atomic operation for userspace. - Add a way to query statmount() for supported options Allow userspace to query which mount information can be retrieved through statmount(). - Allow superblock owners to force unmount * tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits) umount: Allow superblock owners to force umount selftests: add tests for mount notification selinux: add FILE__WATCH_MOUNTNS samples/vfs: fix printf format string for size_t fs: allow changing idmappings fs: add kflags member to struct mount_kattr fs: add open_tree_attr() fs: add copy_mount_setattr() helper fs: add vfs_open_tree() helper statmount: add a new supported_mask field samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP selftests: add tests for using detached mount with overlayfs samples/vfs: check whether flag was raised statmount: allow to retrieve idmappings uidgid: add map_id_range_up() fs: allow detached mounts in clone_private_mount() selftests/overlayfs: test specifying layers as O_PATH file descriptors fs: support O_PATH fds with FSCONFIG_SET_FD vfs: add notifications for mount attach and detach fanotify: notify on mount attach and detach ...	2025-03-24 09:34:10 -07:00
Linus Torvalds	99c21beaab	vfs-6.15-rc1.misc -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90p4AAKCRCRxhvAZXjc ojMIAP9atkG3u7+490+NGWLdulQlaHnD51Owa9MiW87UfKpsTQEArwi/NrJqXJNT PFQ2xIa5TxG+9haChR89w3kjZ6b/hgs= =iDkx -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Add CONFIG_DEBUG_VFS infrastucture: - Catch invalid modes in open - Use the new debug macros in inode_set_cached_link() - Use debug-only asserts around fd allocation and install - Place f_ref to 3rd cache line in struct file to resolve false sharing Cleanups: - Start using anon_inode_getfile_fmode() helper in various places - Don't take f_lock during SEEK_CUR if exclusion is guaranteed by f_pos_lock - Add unlikely() to kcmp() - Remove legacy ->remount_fs method from ecryptfs after port to the new mount api - Remove invalidate_inodes() in favour of evict_inodes() - Simplify ep_busy_loopER by removing unused argument - Avoid mmap sem relocks when coredumping with many missing pages - Inline getname() - Inline new_inode_pseudo() and de-staticize alloc_inode() - Dodge an atomic in putname if ref == 1 - Consistently deref the files table with rcu_dereference_raw() - Dedup handling of struct filename init and refcounts bumps - Use wq_has_sleeper() in end_dir_add() - Drop the lock trip around I_NEW wake up in evict() - Load the ->i_sb pointer once in inode_sb_list_{add,del} - Predict not reaching the limit in alloc_empty_file() - Tidy up do_sys_openat2() with likely/unlikely - Call inode_sb_list_add() outside of inode hash lock - Sort out fd allocation vs dup2 race commentary - Turn page_offset() into a wrapper around folio_pos() - Remove locking in exportfs around ->get_parent() call - try_lookup_one_len() does not need any locks in autofs - Fix return type of several functions from long to int in open - Fix return type of several functions from long to int in ioctls Fixes: - Fix watch queue accounting mismatch" * tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits) fs: sort out fd allocation vs dup2 race commentary, take 2 fs: call inode_sb_list_add() outside of inode hash lock fs: tidy up do_sys_openat2() with likely/unlikely fs: predict not reaching the limit in alloc_empty_file() fs: load the ->i_sb pointer once in inode_sb_list_{add,del} fs: drop the lock trip around I_NEW wake up in evict() fs: use wq_has_sleeper() in end_dir_add() VFS/autofs: try_lookup_one_len() does not need any locks fs: dedup handling of struct filename init and refcounts bumps fs: consistently deref the files table with rcu_dereference_raw() exportfs: remove locking around ->get_parent() call. fs: use debug-only asserts around fd allocation and install fs: dodge an atomic in putname if ref == 1 vfs: Remove invalidate_inodes() ecryptfs: remove NULL remount_fs from super_operations watch_queue: fix pipe accounting mismatch fs: place f_ref to 3rd cache line in struct file to resolve false sharing epoll: simplify ep_busy_loop by removing always 0 argument fs: Turn page_offset() into a wrapper around folio_pos() kcmp: improve performance adding an unlikely hint to task comparisons ...	2025-03-24 09:13:50 -07:00
David Howells	75845c6c1a	keys: Fix UAF in key_put() Once a key's reference count has been reduced to 0, the garbage collector thread may destroy it at any time and so key_put() is not allowed to touch the key after that point. The most key_put() is normally allowed to do is to touch key_gc_work as that's a static global variable. However, in an effort to speed up the reclamation of quota, this is now done in key_put() once the key's usage is reduced to 0 - but now the code is looking at the key after the deadline, which is forbidden. Fix this by using a flag to indicate that a key can be gc'd now rather than looking at the key's refcount in the garbage collector. Fixes: `9578e327b2` ("keys: update key quotas in key_put()") Reported-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/673b6aec.050a0220.87769.004a.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2025-03-22 15:36:49 +02:00
Mickaël Salaün	6d9ac5e4d7	landlock: Prepare to add second errata Potentially include errata for Landlock ABI v5 (Linux 6.10) and v6 (Linux 6.12). That will be useful for the following signal scoping erratum. As explained in errata.h, this commit should be backportable without conflict down to ABI v5. It must then not include the errata/abi-6.h file. Fixes: `54a6e6bbf3` ("landlock: Add signal scoping") Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-5-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-21 12:12:21 +01:00
Mickaël Salaün	48fce74fe2	landlock: Add erratum for TCP fix Add erratum for the TCP socket identification fixed with commit `854277e2cc` ("landlock: Fix non-TCP sockets restriction"). Fixes: `854277e2cc` ("landlock: Fix non-TCP sockets restriction") Cc: Günther Noack <gnoack@google.com> Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-4-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-21 12:12:20 +01:00
Mickaël Salaün	15383a0d63	landlock: Add the errata interface Some fixes may require user space to check if they are applied on the running kernel before using a specific feature. For instance, this applies when a restriction was previously too restrictive and is now getting relaxed (e.g. for compatibility reasons). However, non-visible changes for legitimate use (e.g. security fixes) do not require an erratum. Because fixes are backported down to a specific Landlock ABI, we need a way to avoid cherry-pick conflicts. The solution is to only update a file related to the lower ABI impacted by this issue. All the ABI files are then used to create a bitmask of fixes. The new errata interface is similar to the one used to get the supported Landlock ABI version, but it returns a bitmask instead because the order of fixes may not match the order of versions, and not all fixes may apply to all versions. The actual errata will come with dedicated commits. The description is not actually used in the code but serves as documentation. Create the landlock_abi_version symbol and use its value to check errata consistency. Update test_base's create_ruleset_checks_ordering tests and add errata tests. This commit is backportable down to the first version of Landlock. Fixes: `3532b0b435` ("landlock: Enable user space to infer supported features") Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-21 12:12:19 +01:00
Mickaël Salaün	624f177d8f	landlock: Move code to ease future backports To ease backports in setup.c, let's group changes from __lsm_ro_after_init to __ro_after_init with commit `f22f9aaf6c` ("selinux: remove the runtime disable functionality"), and the landlock_lsmid addition with commit `f3b8788cde` ("LSM: Identify modules by more than name"). That will help to backport the following errata. Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318161443.279194-2-mic@digikod.net Fixes: `f3b8788cde` ("LSM: Identify modules by more than name") Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-03-21 12:12:16 +01:00
Arnd Bergmann	edc8e80bf8	crypto: lib/Kconfig - hide library options Any driver that needs these library functions should already be selecting the corresponding Kconfig symbols, so there is no real point in making these visible. The original patch that made these user selectable described problems with drivers failing to select the code they use, but for consistency it's better to always use 'select' on a symbol than to mix it with 'depends on'. Fixes: `e56e189855` ("lib/crypto: add prompts back to crypto libraries") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-03-21 17:33:39 +08:00
Christian Göttsche	a3d3043ef2	selinux: get netif_wildcard policycap from policy instead of cache Retrieve the netif_wildcard policy capability in security_netif_sid() from the locked active policy instead of the cached value in selinux_state. Fixes: `8af43b61c1` ("selinux: support wildcard network interface names") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: /netlabel/netif/ due to a typo in the description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-03-17 16:22:04 -04:00
Blaise Boscaccy	082f1db02c	security: Propagate caller information in bpf hooks Certain bpf syscall subcommands are available for usage from both userspace and the kernel. LSM modules or eBPF gatekeeper programs may need to take a different course of action depending on whether or not a BPF syscall originated from the kernel or userspace. Additionally, some of the bpf_attr struct fields contain pointers to arbitrary memory. Currently the functionality to determine whether or not a pointer refers to kernel memory or userspace memory is exposed to the bpf verifier, but that information is missing from various LSM hooks. Here we augment the LSM hooks to provide this data, by simply passing a boolean flag indicating whether or not the call originated in the kernel, in any hook that contains a bpf_attr struct that corresponds to a subcommand that may be called from the kernel. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250310221737.821889-2-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Stephen Smalley	9da4f4f987	lsm: remove old email address for Stephen Smalley Remove my old, no longer functioning, email address from comments. Could alternatively replace with my current email but seems redundant with MAINTAINERS and prone to being out of date. Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: subject tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-03-10 15:58:43 -04:00
Greg Kroah-Hartman	993a47bd7b	Linux 6.14-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfOKBUeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1aQH/iC+Oyij4VxAjBek BOXIT/p6CwlIXb8ObiWWcRjDPizlcxb3RaV8J2RO+IqaQ2wltxpFANq2G7Re2FPm SNcEpIURAOVcxHGedcfFA91srO5F4FzNTO8LVp7MIbcgMYy3pdk+dbZmi6A691R+ t9pb74m+MAnF1o/MUx7pUlhAT/4ymuuR0F7WCSg4h0Xwe5m0nlJY89kJBC7PCjyd n3mdhsz3rDSLmt/z/T7HGD89r8sYSvm9cOKtL3ELgGTrm7boQV8ii9Y9w04DI8PQ JmIernugcCxmhH36mVUAHgJf2+/T388xFUh/D5+skeUOUZpaJZG866rnb32WpsHc eWLFUeg= =Wypt -----END PGP SIGNATURE----- Merge 6.14-rc6 into driver-core-next We need the driver core fix in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-03-10 17:37:25 +01:00
Kees Cook	d70da12453	hardening: Enable i386 FORTIFY_SOURCE on Clang 16+ The i386 regparm bug exposed with FORTIFY_SOURCE with Clang was fixed in Clang 16[1]. Link: `c167c0a4dc` [1] Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250308042929.1753543-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-08 09:16:42 -08:00
Jan Kara	93fd0d46cb	vfs: Remove invalidate_inodes() The function can be replaced by evict_inodes. The only difference is that evict_inodes() skips the inodes with positive refcount without touching ->i_lock, but they are equivalent as evict_inodes() repeats the refcount check after having grabbed ->i_lock. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250307144318.28120-2-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-08 12:19:22 +01:00
Dr. David Alan Gilbert	4ae89b1fe7	capability: Remove unused has_capability The vanilla has_capability() function has been unused since 2018's commit `dcb569cf6a` ("Smack: ptrace capability use fixes") Remove it. Fixup a comment in security/commoncap.c that referenced it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Serge Hallyn <sergeh@kernel.org>	2025-03-07 22:03:09 -06:00
Oleg Nesterov	71c769231f	yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl() current->group_leader is stable, no need to take rcu_read_lock() and do get/put_task_struct(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250219161417.GA20851@redhat.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-07 19:58:05 -08:00
Christian Göttsche	8af43b61c1	selinux: support wildcard network interface names Add support for wildcard matching of network interface names. This is useful for auto-generated interfaces, for example podman creates network interfaces for containers with the naming scheme podman0, podman1, podman2, ... To maintain backward compatibility guard this feature with a new policy capability 'netif_wildcard'. Netifcon definitions are compared against in the order given by the policy, so userspace tools should sort them in a reasonable order. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-03-07 15:11:10 -05:00
Arulpandiyan Vadivel	d73ef9ec87	loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported Updated the MODULE_COMPRESS_NONE with MODULE_COMPRESS as it was no longer available from kernel modules. As MODULE_COMPRESS and MODULE_DECOMPRESS depends on MODULES removing MODULES as well. Fixes: `c7ff693fa2` ("module: Split modules_install compression and in-kernel decompression") Signed-off-by: Arulpandiyan Vadivel <arulpandiyan.vadivel@siemens.com> Link: https://lore.kernel.org/r/20250302103831.285381-1-arulpandiyan.vadivel@siemens.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-03 09:35:50 -08:00
Mel Gorman	ca758b147e	fortify: Move FORTIFY_SOURCE under 'Kernel hardening options' FORTIFY_SOURCE is a hardening option both at build and runtime. Move it under 'Kernel hardening options'. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250123221115.19722-5-mgorman@techsingularity.net Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-28 11:51:31 -08:00
Mel Gorman	d2132f453e	mm: security: Allow default HARDENED_USERCOPY to be set at compile time HARDENED_USERCOPY defaults to on if enabled at compile time. Allow hardened_usercopy= default to be set at compile time similar to init_on_alloc= and init_on_free=. The intent is that hardening options that can be disabled at runtime can set their default at build time. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Link: https://lore.kernel.org/r/20250123221115.19722-3-mgorman@techsingularity.net Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-28 11:51:31 -08:00
Mel Gorman	f4d4e8b9d6	mm: security: Move hardened usercopy under 'Kernel hardening options' There is a submenu for 'Kernel hardening options' under "Security". Move HARDENED_USERCOPY under the hardening options as it is clearly related. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250123221115.19722-2-mgorman@techsingularity.net Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-28 11:51:31 -08:00
NeilBrown	88d5baf690	Change inode_operations.mkdir to return struct dentry * Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns. This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed. This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir. To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. Possible returns are: NULL - the directory was created and no other dentry was used ERR_PTR() - an error occurred non-NULL - this other dentry was spliced in This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry. Not all filesystems reliably result in a positive hashed dentry: - NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option. - kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry. The recommendation to use d_drop();d_splice_alias() is ugly but fits with current practice. A planned future patch will change this. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-27 20:00:17 +01:00
Miklos Szeredi	7d90fb5253	selinux: add FILE__WATCH_MOUNTNS Watching mount namespaces for changes (mount, umount, move mount) was added by previous patches. This patch adds the file/watch_mountns permission that can be applied to nsfs files (/proc/$$/ns/mnt), making it possible to allow or deny watching a particular namespace for changes. Suggested-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/all/CAHC9VhTOmCjCSE2H0zwPOmpFopheexVb6jyovz92ZtpKtoVv6A@mail.gmail.com/ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250224154836.958915-1-mszeredi@redhat.com Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-27 09:16:04 +01:00
"Kipp N. Davis"	2c2b1e0597	selinux: add permission checks for loading other kinds of kernel files Although the LSM hooks for loading kernel modules were later generalized to cover loading other kinds of files, SELinux didn't implement corresponding permission checks, leaving only the module case covered. Define and add new permission checks for these other cases. Signed-off-by: Cameron K. Williams <ckwilliams.work@gmail.com> Signed-off-by: Kipp N. Davis <kippndavis.work@gmx.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: merge fuzz, line length, and spacing fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-02-26 15:14:43 -05:00
Linus Torvalds	c0d35086a2	Landlock fix for v6.14-rc5 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ785FhAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSILQBAMFwpFClzjVeWyLFNd/gaTlPWeedvnag+yZu CK9q39jOAP9tj1unQBFpsI7jsTk6ZxPVxb4DymPIirZMb/5FuxUnDQ== =QrSM -----END PGP SIGNATURE----- Merge tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fixes to TCP socket identification, documentation, and tests" * tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add binaries to .gitignore selftests/landlock: Test that MPTCP actions are not restricted selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP landlock: Fix non-TCP sockets restriction landlock: Minor typo and grammar fixes in IPC scoping documentation landlock: Fix grammar error selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB	2025-02-26 11:55:44 -08:00
Linus Torvalds	d62fdaf51b	integrity-v6.14-fix -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ78LSBQcem9oYXJAbGlu dXguaWJtLmNvbQAKCRDLwZzRsCrn5X1kAQCsB9LO0NX+DhZILASeSJfzjAkY0cGY jzQ3eq3keLkXrQD/RjeKD9e/qjBrXX0C4qbr9e8+JYuzY22o911bEiK67wg= =lOQV -----END PGP SIGNATURE----- Merge tag 'integrity-v6.14-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fixes from Mimi Zohar: "One bugfix and one spelling cleanup. The bug fix restores a performance improvement" * tag 'integrity-v6.14-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: Reset IMA_NONACTION_RULE_FLAGS after post_setattr integrity: fix typos and spelling errors	2025-02-26 11:47:19 -08:00
Luo Gengkun	9ec84f79c5	perf: Remove unnecessary parameter of security check It seems that the attr parameter was never been used in security checks since it was first introduced by: commit `da97e18458` ("perf_event: Add support for LSM and SELinux checks") so remove it. Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-02-26 14:13:58 -05:00
Greg Kroah-Hartman	2ce177e9b3	Merge 6.14-rc3 into driver-core-next We need the faux_device changes in here for future work. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 07:24:33 +01:00
Konstantin Andreev	a158a937d8	smack: recognize ipv4 CIPSO w/o categories If SMACK label has CIPSO representation w/o categories, e.g.: \| # cat /smack/cipso2 \| foo 10 \| @ 250/2 \| ... then SMACK does not recognize such CIPSO in input ipv4 packets and substitues '' label instead. Audit records may look like \| lsm=SMACK fn=smack_socket_sock_rcv_skb action=denied \| subject="" object="_" requested=w pid=0 comm="swapper/1" ... This happens in two steps: 1) security/smack/smackfs.c`smk_set_cipso does not clear NETLBL_SECATTR_MLS_CAT from (struct smack_known )skp->smk_netlabel.flags on assigning CIPSO w/o categories: \| rcu_assign_pointer(skp->smk_netlabel.attr.mls.cat, ncats.attr.mls.cat); \| skp->smk_netlabel.attr.mls.lvl = ncats.attr.mls.lvl; 2) security/smack/smack_lsm.c`smack_from_secattr can not match skp->smk_netlabel with input packet's struct netlbl_lsm_secattr sap because sap->flags have not NETLBL_SECATTR_MLS_CAT (what is correct) but skp->smk_netlabel.flags have (what is incorrect): \| if ((sap->flags & NETLBL_SECATTR_MLS_CAT) == 0) { \| if ((skp->smk_netlabel.flags & \| NETLBL_SECATTR_MLS_CAT) == 0) \| found = 1; \| break; \| } This commit sets/clears NETLBL_SECATTR_MLS_CAT in skp->smk_netlabel.flags according to the presense of CIPSO categories. The update of smk_netlabel is not atomic, so input packets processing still may be incorrect during short time while update proceeds. Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-16 14:17:55 -08:00
Konstantin Andreev	c7fb50cecf	smack: Revert "smackfs: Added check catlen" This reverts commit `ccfd889acb` The indicated commit * does not describe the problem that change tries to solve * has programming issues * introduces a bug: forever clears NETLBL_SECATTR_MLS_CAT in (struct smack_known *)skp->smk_netlabel.flags Reverting the commit to reapproach original problem Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-16 10:30:23 -08:00
Sebastian Andrzej Siewior	741c10b096	kernfs: Use RCU to access kernfs_node::name. Using RCU lifetime rules to access kernfs_node::name can avoid the trouble with kernfs_rename_lock in kernfs_name() and kernfs_path_from_node() if the fs was created with KERNFS_ROOT_INVARIANT_PARENT. This is usefull as it allows to implement kernfs_path_from_node() only with RCU protection and avoiding kernfs_rename_lock. The lock is only required if the __parent node can be changed and the function requires an unchanged hierarchy while it iterates from the node to its parent. The change is needed to allow the lookup of the node's path (kernfs_path_from_node()) from context which runs always with disabled preemption and or interrutps even on PREEMPT_RT. The problem is that kernfs_rename_lock becomes a sleeping lock on PREEMPT_RT. I went through all ::name users and added the required access for the lookup with a few extensions: - rdtgroup_pseudo_lock_create() drops all locks and then uses the name later on. resctrl supports rename with different parents. Here I made a temporal copy of the name while it is used outside of the lock. - kernfs_rename_ns() accepts NULL as new_parent. This simplifies sysfs_move_dir_ns() where it can set NULL in order to reuse the current name. - kernfs_rename_ns() is only using kernfs_rename_lock if the parents are different. All users use either kernfs_rwsem (for stable path view) or just RCU for the lookup. The ::name uses always RCU free. Use RCU lifetime guarantees to access kernfs_node::name. Suggested-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/67251dc6.050a0220.529b6.015e.GAE@google.com/ Reported-by: Hillf Danton <hdanton@sina.com> Closes: https://lore.kernel.org/20241102001224.2789-1-hdanton@sina.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-7-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-15 17:46:32 +01:00
Mikhail Ivanov	854277e2cc	landlock: Fix non-TCP sockets restriction Use sk_is_tcp() to check if socket is TCP in bind(2) and connect(2) hooks. SMC, MPTCP, SCTP protocols are currently restricted by TCP access rights. The purpose of TCP access rights is to provide control over ports that can be used by userland to establish a TCP connection. Therefore, it is incorrect to deny bind(2) and connect(2) requests for a socket of another protocol. However, SMC, MPTCP and RDS implementations use TCP internal sockets to establish communication or even to exchange packets over a TCP connection [1]. Landlock rules that configure bind(2) and connect(2) usage for TCP sockets should not cover requests for sockets of such protocols. These protocols have different set of security issues and security properties, therefore, it is necessary to provide the userland with the ability to distinguish between them (eg. [2]). Control over TCP connection used by other protocols can be achieved with upcoming support of socket creation control [3]. [1] https://lore.kernel.org/all/62336067-18c2-3493-d0ec-6dd6a6d3a1b5@huawei-partners.com/ [2] https://lore.kernel.org/all/20241204.fahVio7eicim@digikod.net/ [3] https://lore.kernel.org/all/20240904104824.1844082-1-ivanov.mikhail1@huawei-partners.com/ Closes: https://github.com/landlock-lsm/linux/issues/40 Fixes: `fff69fb03d` ("landlock: Support network rules with TCP bind and connect") Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com> Link: https://lore.kernel.org/r/20250205093651.1424339-2-ivanov.mikhail1@huawei-partners.com [mic: Format commit message to 72 columns] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:09 +01:00
Tanya Agarwal	143c9aae04	landlock: Fix grammar error Fix grammar error in comments that were identified using the codespell tool. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20250123194208.2660-1-tanyaagarwal25699@gmail.com [mic: Simplify commit message] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:08 +01:00
Konstantin Andreev	bf9f14c91a	smack: remove /smack/logging if audit is not configured If CONFIG_AUDIT is not set then SMACK does not generate audit messages, however, keeps audit control file, /smack/logging, while there is no entity to control. This change removes audit control file /smack/logging when audit is not configured in the kernel Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-13 18:33:55 -08:00
Konstantin Andreev	6cce0cc386	smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label Since inception [1], SMACK initializes ipv* child socket security for connection-oriented communications (tcp/sctp/dccp) during accept() syscall, in the security_sock_graft() hook: \| void smack_sock_graft(struct sock sk, ...) \| { \| // only ipv4 and ipv6 are eligible here \| // ... \| ssp = sk->sk_security; // socket security \| ssp->smk_in = skp; // process label: smk_of_current() \| ssp->smk_out = skp; // process label: smk_of_current() \| } This approach is incorrect for two reasons: A) initialization occurs too late for child socket security: The child socket is created by the kernel once the handshake completes (e.g., for tcp: after receiving ack for syn+ack). Data can legitimately start arriving to the child socket immediately, long before the application calls accept() on the socket. Those data are (currently — were) processed by SMACK using incorrect child socket security attributes. B) Incoming connection requests are handled using the listening socket's security, hence, the child socket must inherit the listening socket's security attributes. smack_sock_graft() initilizes the child socket's security with a process label, as is done for a new socket() But ... the process label is not necessarily the same as the listening socket label. A privileged application may legitimately set other in/out labels for a listening socket. When this happens, SMACK processes incoming packets using incorrect socket security attributes. In [2] Michael Lontke noticed (A) and fixed it in [3] by adding socket initialization into security_sk_clone_security() hook like \| void smack_sk_clone_security(struct sock oldsk, struct sock newsk) \| { \| (struct socket_smack )newsk->sk_security = \| (struct socket_smack *)oldsk->sk_security; \| } This initializes the child socket security with the parent (listening) socket security at the appropriate time. I was forced to revisit this old story because smack_sock_graft() was left in place by [3] and continues overwriting the child socket's labels with the process label, and there might be a reason for this, so I undertook a study. If the process label differs from the listening socket's labels, the following occurs for ipv4: assigning the smk_out is not accompanied by netlbl_sock_setattr, so the outgoing packet's cipso label does not change. So, the only effect of this assignment for interhost communications is a divergence between the program-visible “out” socket label and the cipso network label. For intrahost communications this label, however, becomes visible via secmark netfilter marking, and is checked for access rights by the client, receiving side. Assigning the smk_in affects both interhost and intrahost communications: the server begins to check access rights against an wrong label. Access check against wrong label (smk_in or smk_out), unsurprisingly fails, breaking the connection. The above affects protocols that calls security_sock_graft() during accept(), namely: {tcp,dccp,sctp}/{ipv4,ipv6} One extra security_sock_graft() caller, crypto/af_alg.c`af_alg_accept is not affected, because smack_sock_graft() does nothing for PF_ALG. To reproduce, assign non-default in/out labels to a listening socket, setup rules between these labels and client label, attempt to connect and send some data. Ipv6 specific: ipv6 packets do not convey SMACK labels. To reproduce the issue in interhost communications set opposite labels in /smack/ipv6host on both hosts. Ipv6 intrahost communications do not require tricking, because SMACK labels are conveyed via secmark netfilter marking. So, currently smack_sock_graft() is not useful, but harmful, therefore, I have removed it. This fixes the issue for {tcp,dccp}/{ipv4,ipv6}, but not sctp/{ipv4,ipv6}. Although this change is necessary for sctp+smack to function correctly, it is not sufficient because: sctp/ipv4 does not call security_sk_clone() and sctp/ipv6 ignores SMACK completely. These are separate issues, belong to other subsystem, and should be addressed separately. [1] 2008-02-04, Fixes: `e114e47377` ("Smack: Simplified Mandatory Access Control Kernel") [2] Michael Lontke, 2022-08-31, SMACK LSM checks wrong object label during ingress network traffic Link: https://lore.kernel.org/linux-security-module/6324997ce4fc092c5020a4add075257f9c5f6442.camel@elektrobit.com/ [3] 2022-08-31, michael.lontke, commit `4ca165fc6c` ("SMACK: Add sk_clone_security LSM hook") Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-13 13:25:14 -08:00
Konstantin Andreev	bfcf4004bc	smack: dont compile ipv6 code unless ipv6 is configured I want to be sure that ipv6-specific code is not compiled in kernel binaries if ipv6 is not configured. [1] was getting rid of "unused variable" warning, but, with that, it also mandated compilation of a handful ipv6- specific functions in ipv4-only kernel configurations: smk_ipv6_localhost, smack_ipv6host_label, smk_ipv6_check. Their compiled bodies are likely to be removed by compiler from the resulting binary, but, to be on the safe side, I remove them from the compiler view. [1] Fixes: `00720f0e7f` ("smack: avoid unused 'sip' variable warning") Signed-off-by: Konstantin Andreev <andreev@swemel.ru> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-12 13:05:15 -08:00
Casey Schaufler	2aad5cd1db	Smack: fix typos and spelling errors Fix typos and spelling errors in security/smack module comments that were identified using the codespell tool. No functional changes - documentation only. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2025-02-11 14:34:02 -08:00
Linus Torvalds	09fbf3d502	Redo of pathname patternization and fix spelling errors. Tetsuo Handa (3): tomoyo: use better patterns for procfs in learning mode tomoyo: fix spelling errors Tanya Agarwal (1): tomoyo: fix spelling error security/tomoyo/common.c \| 145 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------ security/tomoyo/domain.c \| 2 - security/tomoyo/securityfs_if.c \| 6 +-- security/tomoyo/tomoyo.c \| 5 -- 4 files changed, 117 insertions(+), 41 deletions(-) -----BEGIN PGP SIGNATURE----- iQJXBAABCABBFiEEQ8gzaWI9etOpbC/HQl8SjQxk9SoFAmerSHojHHBlbmd1aW4t a2VybmVsQGktbG92ZS5zYWt1cmEubmUuanAACgkQQl8SjQxk9Sreeg/+KHl8qDxi eLbml7ihosJ+qOJhCdHgsboNqoIrPFS7fp89YLwOBoBBsvT1zbqwcKJhAU0bAlFt bDHcKnt0anrokzxAhajVlikmnKP4CeW/fFf4m2UXJ7/uYinoYyizutqsduthF6G9 MwxCpSnReOKsys8tVZyBtCNk53vKk6o0I1GGZ56GG7fNOz6UR81z8mUhJRcqj9Gk 0gqdfC8RLNbXbygOHbZINM3TUmn/MW9E/ALRU+a1Ay+9ZazJE26R87N7/O7ksDB3 LGB1OZ6eEJkCMJm583wa2AdmUdaT5DPv7PWVEYkPjFkYjQzPO4GuBONPJHsElM1O OybzMuQKlWqMWvFwky7wbUNaKkxqXNm9BVCg1tdbuQubB6moZSUjcZTM4SXnSD5K Vn5AGfgLHMJ5nC576ZbyAARYdo3I0ffkv1Ql8sUio3giybLw+QFika1dpQrVupka fWZxj3znJXP0sNpunb9ffOMlf92jj/XD9382bBNUbR/aEP50YwvYnAfWek1+H4YI kVDFiCb3PVx1pwfJjKezcsN1Rp0JXRFXMDcd0Pm7n0vCs3MdWwJ44/xUICBs/UR2 B/+ocCQykQDgMYW9LC9fZnZ+A0KfhLJpUNjqX+B2pQR3Cyr51l+BY8QyUlBL3Eci XQyX+dqDdh0L7dXyBjowYGvo9cEJ6aNkwQ8= =cPFI -----END PGP SIGNATURE----- Merge tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo Pull tomoyo fixes from Tetsuo Handa: "Redo of pathname patternization and fix spelling errors" * tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: use better patterns for procfs in learning mode tomoyo: fix spelling errors tomoyo: fix spelling error	2025-02-11 10:19:36 -08:00
Nathan Chancellor	3e45553acb	apparmor: Remove unused variable 'sock' in __file_sock_perm() When CONFIG_SECURITY_APPARMOR_DEBUG_ASSERTS is disabled, there is a warning that sock is unused: security/apparmor/file.c: In function '__file_sock_perm': security/apparmor/file.c:544:24: warning: unused variable 'sock' [-Wunused-variable] 544 \| struct socket sock = (struct socket ) file->private_data; \| ^~~~ sock was moved into aa_sock_file_perm(), where the same check is present, so remove sock and the assertion from __file_sock_perm() to fix the warning. Fixes: `c05e705812` ("apparmor: add fine grained af_unix mediation") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501190757.myuLxLyL-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:18:45 -08:00
Mateusz Guzik	67e370aa7f	apparmor: use the condition in AA_BUG_FMT even with debug disabled This follows the established practice and fixes a build failure for me: security/apparmor/file.c: In function ‘__file_sock_perm’: security/apparmor/file.c:544:24: error: unused variable ‘sock’ [-Werror=unused-variable] 544 \| struct socket sock = (struct socket ) file->private_data; \| ^~~~ Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:17:59 -08:00
Tanya Agarwal	aabbe6f908	apparmor: fix typos and spelling errors Fix typos and spelling errors in apparmor module comments that were identified using the codespell tool. No functional changes - documentation only. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Ryan Lee <ryan.lee@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:17:49 -08:00
Jiapeng Chong	04fe43104e	apparmor: Modify mismatched function name No functional modification involved. security/apparmor/lib.c:93: warning: expecting prototype for aa_mask_to_str(). Prototype was for val_mask_to_str() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=13606 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:17:33 -08:00
Jiapeng Chong	aa904fa118	apparmor: Modify mismatched function name No functional modification involved. security/apparmor/file.c:184: warning: expecting prototype for aa_lookup_fperms(). Prototype was for aa_lookup_condperms() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=13605 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:16:45 -08:00
Nathan Chancellor	509f8cb2ff	apparmor: Fix checking address of an array in accum_label_info() clang warns: security/apparmor/label.c:206:15: error: address of array 'new->vec' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] 206 \| AA_BUG(!new->vec); \| ~~~~~~^~~ The address of this array can never be NULL because it is not at the beginning of a structure. Convert the assertion to check that the new pointer is not NULL. Fixes: `de4754c801` ("apparmor: carry mediation check on label") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501191802.bDp2voTJ-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-02-10 11:16:30 -08:00
Hamza Mahfooz	c6ad9fdbd4	io_uring,lsm,selinux: add LSM hooks for io_uring_setup() It is desirable to allow LSM to configure accessibility to io_uring because it is a coarse yet very simple way to restrict access to it. So, add an LSM for io_uring_allowed() to guard access to io_uring. Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Acked-by: Jens Axboe <axboe@kernel.dk> [PM: merge fuzz due to changes in preceding patches, subj tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-02-07 17:17:49 -05:00
Paul Moore	5fc80fb5b7	selinux: always check the file label in selinux_kernel_read_file() Commit `2039bda1fa` ("LSM: Add "contents" flag to kernel_read_file hook") added a new flag to the security_kernel_read_file() LSM hook, "contents", which was set if a file was being read in its entirety or if it was the first chunk read in a multi-step process. The SELinux LSM callback was updated to only check against the file label if this "contents" flag was set, meaning that in multi-step reads the file label was not considered in the access control decision after the initial chunk. Thankfully the only in-tree user that performs a multi-step read is the "bcm-vk" driver and it is loading firmware, not a kernel module, so there are no security regressions to worry about. However, we still want to ensure that the SELinux code does the right thing, and always checks the file label, especially as there is a chance the file could change between chunk reads. Fixes: `2039bda1fa` ("LSM: Add "contents" flag to kernel_read_file hook") Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-02-07 15:30:54 -05:00
Kaixiong Yu	b121dd4d55	security: min_addr: move sysctl to security/min_addr.c The dac_mmap_min_addr belongs to min_addr.c, move it to min_addr.c from /kernel/sysctl.c. In the previous Linux kernel boot process, sysctl_init_bases needs to be executed before init_mmap_min_addr, So, register_sysctl_init should be executed before update_mmap_min_addr in init_mmap_min_addr. And according to the compilation condition in security/Makefile: obj-$(CONFIG_MMU) += min_addr.o if CONFIG_MMU is not defined, min_addr.c would not be included in the compilation process. So, drop the CONFIG_MMU check. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-02-07 16:53:04 +01:00
Roberto Sassu	57a0ef02fe	ima: Reset IMA_NONACTION_RULE_FLAGS after post_setattr Commit `0d73a55208` ("ima: re-introduce own integrity cache lock") mistakenly reverted the performance improvement introduced in commit `42a4c60319` ("ima: fix ima_inode_post_setattr"). The unused bit mask was subsequently removed by commit `11c60f23ed` ("integrity: Remove unused macro IMA_ACTION_RULE_FLAGS"). Restore the performance improvement by introducing the new mask IMA_NONACTION_RULE_FLAGS, equal to IMA_NONACTION_FLAGS without IMA_NEW_FILE, which is not a rule-specific flag. Finally, reset IMA_NONACTION_RULE_FLAGS instead of IMA_NONACTION_FLAGS in process_measurement(), if the IMA_CHANGE_ATTR atomic flag is set (after file metadata modification). With this patch, new files for which metadata were modified while they are still open, can be reopened before the last file close (when security.ima is written), since the IMA_NEW_FILE flag is not cleared anymore. Otherwise, appraisal fails because security.ima is missing (files with IMA_NEW_FILE set are an exception). Cc: stable@vger.kernel.org # v4.16.x Fixes: `0d73a55208` ("ima: re-introduce own integrity cache lock") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-02-04 21:36:43 -05:00
Tanya Agarwal	ceb5faef84	integrity: fix typos and spelling errors Fix typos and spelling errors in integrity module comments that were identified using the codespell tool. No functional changes - documentation only. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2025-02-04 21:36:43 -05:00
Tanya Agarwal	75eb39f2f5	selinux: fix spelling error Fix spelling error in selinux module comments that were identified using the codespell tool. No functional changes - documentation only. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-02-03 16:47:20 -05:00
Tetsuo Handa	bdc35f164b	tomoyo: use better patterns for procfs in learning mode Commit `08ae2487b2` ("tomoyo: automatically use patterns for several situations in learning mode") replaced only $PID part of procfs pathname with \$ pattern. But it turned out that we need to also replace $TID part and $FD part to make this functionality useful for e.g. /bin/lsof . Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	2025-01-31 00:27:44 +09:00
Joel Granados	1751f872cc	treewide: const qualify ctl_tables where applicable Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function. Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit `78eb4ea25c` ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers. Created this by running an spatch followed by a sed command: Spatch: virtual patch @ depends on !(file in "net") disable optional_qualifier @ identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@ + const struct ctl_table table_name [] = { ... }; sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c Reviewed-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/ Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-01-28 13:48:37 +01:00
Linus Torvalds	c159dfbdd4	Mainly individually changelogged singleton patches. The patch series in this pull are: - "lib min_heap: Improve min_heap safety, testing, and documentation" from Kuan-Wei Chiu provides various tightenings to the min_heap library code. - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some cleanup and Rust preparation in the xarray library code. - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes pathnames in some code comments. - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the new secs_to_jiffies() in various places where that is appropriate. - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen switches two filesystems to the new mount API. - "Convert ocfs2 to use folios" from Matthew Wilcox does that. - "Remove get_task_comm() and print task comm directly" from Yafang Shao removes now-unneeded calls to get_task_comm() in various places. - "squashfs: reduce memory usage and update docs" from Phillip Lougher implements some memory savings in squashfs and performs some maintainability work. - "lib: clarify comparison function requirements" from Kuan-Wei Chiu tightens the sort code's behaviour and adds some maintenance work. - "nilfs2: protect busy buffer heads from being force-cleared" from Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a corrupted image. - "nilfs2: fix kernel-doc comments for function return values" from Ryusuke Konishi fixes some nilfs kerneldoc. - "nilfs2: fix issues with rename operations" from Ryusuke Konishi addresses some nilfs BUG_ONs which syzbot was able to trigger. - "minmax.h: Cleanups and minor optimisations" from David Laight does some maintenance work on the min/max library code. - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work on the xarray library code. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5SP5QAKCRDdBJ7gKXxA jqN7AQChvwXGG43n4d5SDiA/rH7ddvowQcDqhC9cAMJ1ReR7qwEA8/LIWDE4PdMX mJnaZ1/ibpEpearrChCViApQtcyEGQI= =ti4E -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Mainly individually changelogged singleton patches. The patch series in this pull are: - "lib min_heap: Improve min_heap safety, testing, and documentation" from Kuan-Wei Chiu provides various tightenings to the min_heap library code - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some cleanup and Rust preparation in the xarray library code - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes pathnames in some code comments - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the new secs_to_jiffies() in various places where that is appropriate - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen switches two filesystems to the new mount API - "Convert ocfs2 to use folios" from Matthew Wilcox does that - "Remove get_task_comm() and print task comm directly" from Yafang Shao removes now-unneeded calls to get_task_comm() in various places - "squashfs: reduce memory usage and update docs" from Phillip Lougher implements some memory savings in squashfs and performs some maintainability work - "lib: clarify comparison function requirements" from Kuan-Wei Chiu tightens the sort code's behaviour and adds some maintenance work - "nilfs2: protect busy buffer heads from being force-cleared" from Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a corrupted image - "nilfs2: fix kernel-doc comments for function return values" from Ryusuke Konishi fixes some nilfs kerneldoc - "nilfs2: fix issues with rename operations" from Ryusuke Konishi addresses some nilfs BUG_ONs which syzbot was able to trigger - "minmax.h: Cleanups and minor optimisations" from David Laight does some maintenance work on the min/max library code - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work on the xarray library code" * tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits) ocfs2: use str_yes_no() and str_no_yes() helper functions include/linux/lz4.h: add some missing macros Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent Xarray: remove repeat check in xas_squash_marks() Xarray: distinguish large entries correctly in xas_split_alloc() Xarray: move forward index correctly in xas_pause() Xarray: do not return sibling entries from xas_find_marked() ipc/util.c: complete the kernel-doc function descriptions gcov: clang: use correct function param names latencytop: use correct kernel-doc format for func params minmax.h: remove some #defines that are only expanded once minmax.h: simplify the variants of clamp() minmax.h: move all the clamp() definitions after the min/max() ones minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp() minmax.h: reduce the #define expansion of min(), max() and clamp() minmax.h: update some comments minmax.h: add whitespace around operators and after commas nilfs2: do not update mtime of renamed directory that is not moved nilfs2: handle errors that nilfs_prepare_chunk() may return CREDITS: fix spelling mistake ...	2025-01-26 17:50:53 -08:00
Tetsuo Handa	691a1f3f18	tomoyo: fix spelling errors No functional changes. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	2025-01-26 19:13:58 +09:00
Tanya Agarwal	41f198d58b	tomoyo: fix spelling error Fix spelling error in security/tomoyo module comments that were identified using the codespell tool. No functional changes - documentation only. Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	2025-01-26 19:11:56 +09:00
Linus Torvalds	8883957b3c	\n -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmePs7oACgkQnJ2qBz9k QNmHuAf9GkLnY5u1/81xP5V9ukZ4N2yeMW0dydLS5cjWj/St5ELeMAza3jeqtJtD j36vbnmy2c5pPaGLAK8BJpMXT/R2TkmmKD004zcfqF2S3SgbGzdgO1zMZzq9KJpM woRKZtLuglDajedsDEBBcKotBhlN2+C/sQlFuL1mX4zitk9ajr0qYUB1+JqOeg5f qwPsDLT077ADpxd7lVIMcm+OqbduP5KWkBKYHpn7lJcLe1eqVMMzceJroW42zhVG Dq8Iln26bbU9Wx6FSPFCUcHEzHRHUfXmu07HN9U0X++0QgWjrmBQQLooGFB/bR4a edBrPpVas6xE4/brjgFX3gOKtv8xYg== =ewDV -----END PGP SIGNATURE----- Merge tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify pre-content notification support from Jan Kara: "This introduces a new fsnotify event (FS_PRE_ACCESS) that gets generated before a file contents is accessed. The event is synchronous so if there is listener for this event, the kernel waits for reply. On success the execution continues as usual, on failure we propagate the error to userspace. This allows userspace to fill in file content on demand from slow storage. The context in which the events are generated has been picked so that we don't hold any locks and thus there's no risk of a deadlock for the userspace handler. The new pre-content event is available only for users with global CAP_SYS_ADMIN capability (similarly to other parts of fanotify functionality) and it is an administrator responsibility to make sure the userspace event handler doesn't do stupid stuff that can DoS the system. Based on your feedback from the last submission, fsnotify code has been improved and now file->f_mode encodes whether pre-content event needs to be generated for the file so the fast path when nobody wants pre-content event for the file just grows the additional file->f_mode check. As a bonus this also removes the checks whether the old FS_ACCESS event needs to be generated from the fast path. Also the place where the event is generated during page fault has been moved so now filemap_fault() generates the event if and only if there is no uptodate folio in the page cache. Also we have dropped FS_PRE_MODIFY event as current real-world users of the pre-content functionality don't really use it so let's start with the minimal useful feature set" * tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits) fanotify: Fix crash in fanotify_init(2) fs: don't block write during exec on pre-content watched files fs: enable pre-content events on supported file systems ext4: add pre-content fsnotify hook for DAX faults btrfs: disable defrag on pre-content watched files xfs: add pre-content fsnotify hook for DAX faults fsnotify: generate pre-content permission event on page fault mm: don't allow huge faults for files with pre content watches fanotify: disable readahead if we have pre-content watches fanotify: allow to set errno in FAN_DENY permission response fanotify: report file range info with pre-content events fanotify: introduce FAN_PRE_ACCESS permission event fsnotify: generate pre-content permission event on truncate fsnotify: pass optional file access range in pre-content event fsnotify: introduce pre-content permission events fanotify: reserve event bit of deprecated FAN_DIR_MODIFY fanotify: rename a misnamed constant fanotify: don't skip extra event info if no info_mode is set fsnotify: check if file is actually being watched for pre-content events on open fsnotify: opt-in for permission events at file open time ...	2025-01-23 13:36:06 -08:00
Linus Torvalds	d0d106a2bd	bpf-next-6.14 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmeOu1YACgkQ6rmadz2v bTrrHxAAn6eqEsluWnDlzhI0OGsPjvgS00sf+MOeqiXYeS2eJ8yJuKifp38+nIQZ lIplsWU2ReUY20eizPqLPnQ7TXZGvLgp08E8yHUoZ0siWanqr9iDRfbZCCNrDMNm lMqeR1SLapMws2R/UX9JbvPn2ajIJ6Lb4wxenTfdlW6q+0hAGM6Dt0k/jBod+quq /oo+xwG3L0q4APBovJfiAFN2z6IYN03b+zLiOrpIJtMACGewEXnl3m4mkL8ZM/FV nZGPIxIUPXCpKTGEkNqxfkrnHN2wZQ4ZSKEJ6lhEEp4jrgCVITaGZ/E7jlx6fZoj bbd4YMonIPo9Nhim8p1dt8yYBhKKiE5IXIq0GqlMv5+MvAN8ylrlydpsouW1fu66 hZ1W1BxbxmrgyF0Bwo9JPOMhBHwMrmD6iH9LgiMpZf0ASeF+q9cJpoSOU5j5E9XB LpLIRf5jYTd4wZjhDmrQREReLo+Bng9DlCBu+jjh2+YTz6l6Qed+ETpENcd7lL5i IHZVbgD2RVPNJoUfdrd763HfYfDTk+50MF5FIMEyfKHz11if0E/LhBMzto22hm6b 2f8ruj/8yvg8s2dxEP3ySQgcnynlwEnGxLenUVv7uEOYKeWri1rq+fvTK5ne1OLK oHnTlkViwQb74c0r8cFW+nkyfUYTfhhBAql14rl/fMjGDO2KZ10= =f2CA -----END PGP SIGNATURE----- Merge tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "A smaller than usual release cycle. The main changes are: - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai) In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile only mode. Half of the tests are failing, since support for btf_decl_tag is still WIP, but this is a great milestone. - Convert various samples/bpf to selftests/bpf/test_progs format (Alexis Lothoré and Bastien Curutchet) - Teach verifier to recognize that array lookup with constant in-range index will always succeed (Daniel Xu) - Cleanup migrate disable scope in BPF maps (Hou Tao) - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao) - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin KaFai Lau) - Refactor verifier lock support (Kumar Kartikeya Dwivedi) This is a prerequisite for upcoming resilient spin lock. - Remove excessive 'may_goto +0' instructions in the verifier that LLVM leaves when unrolls the loops (Yonghong Song) - Remove unhelpful bpf_probe_write_user() warning message (Marco Elver) - Add fd_array_cnt attribute for prog_load command (Anton Protopopov) This is a prerequisite for upcoming support for static_branch" * tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits) selftests/bpf: Add some tests related to 'may_goto 0' insns bpf: Remove 'may_goto 0' instruction in opt_remove_nops() bpf: Allow 'may_goto 0' instruction in verifier selftests/bpf: Add test case for the freeing of bpf_timer bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT bpf: Free element after unlock in __htab_map_lookup_and_delete_elem() bpf: Bail out early in __htab_map_lookup_and_delete_elem() bpf: Free special fields after unlock in htab_lru_map_delete_node() tools: Sync if_xdp.h uapi tooling header libbpf: Work around kernel inconsistently stripping '.llvm.' suffix bpf: selftests: verifier: Add nullness elision tests bpf: verifier: Support eliding map lookup nullness bpf: verifier: Refactor helper access type tracking bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write bpf: verifier: Add missing newline on verbose() call selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED libbpf: Fix return zero when elf_begin failed selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test veristat: Load struct_ops programs only once ...	2025-01-23 08:04:07 -08:00
Linus Torvalds	754916d4a2	capabilities patches for 6.14-rc1 This branch contains basically the same two patches as last time: 1. A patch by Paul Moore to remove the cap_mmap_file() hook, as it simply returned the default return value and so doesn't need to exist. 2. A patch by Jordan Rome to add a trace event for cap_capable(), updated to address your feedback during the last cycle. Both patches have been sitting in linux-next since 6.13-rc1 with no issues. Signed-off-by: Serge E. Hallyn <serge@hallyn.com> -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmeOxO0ACgkQNXDaFycK ziSbqwf9FmQbCG9zpgHhAaODz8GXPn1EYm0TfabbfuG+hRvTQLt/7eVuLB6Tt69l lx7zM8HUjZLQW8qsDc1nmdnrvvLK6z8e97yGBBMG4uzFyzsCgNQowyDRz69IOG+l eTCUMXOQXYtO4OYm7pECBeUos8yCOpW7vdZzyyKInw0A8JXy98K880HlYoiYc7wI 9xXtKWTmqry156llwIYU/opo/Pag480Y2hzP9x5EqvTNqJ/iMEUb2Dswhf+53dOY HePwerTu1BYYupSC2gl3ujl/m6R2BroLBmOMApLiAhNtRZCm+J6rkhmMW9cFqyxZ Nyw8nAuc08cAKoobAdggD+cgFy9e6g== =WKYe -----END PGP SIGNATURE----- Merge tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux Pull capabilities updates from Serge Hallyn: - remove the cap_mmap_file() hook, as it simply returned the default return value and so doesn't need to exist (Paul Moore) - add a trace event for cap_capable() (Jordan Rome) * tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux: security: add trace event for cap_capable capabilities: remove cap_mmap_file()	2025-01-23 08:00:16 -08:00
Linus Torvalds	21266b8df5	AT_EXECVE_CHECK introduction for v6.14-rc1 - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün) - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits (Mickaël Salaün) - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hO7wAKCRA2KwveOeQk u4l+AP9UHO1KwMn3aOt6uFPj7omaoY0vpcB1rx/x5s4efNFHOAD/QjY0f+ND+HzF mKLYOIeacGEQi7TNhpnOkGjz6jzSiwg= =sMhZ -----END PGP SIGNATURE----- Merge tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull AT_EXECVE_CHECK from Kees Cook: - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün) - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits (Mickaël Salaün) - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün) * tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ima: instantiate the bprm_creds_for_exec() hook samples/check-exec: Add an enlighten "inc" interpreter and 28 tests selftests: ktap_helpers: Fix uninitialized variable samples/check-exec: Add set-exec selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits exec: Add a new AT_EXECVE_CHECK flag to execveat(2)	2025-01-22 20:34:42 -08:00
Linus Torvalds	5ab889facc	hardening updates for v6.14-rc1 - stackleak: Use str_enabled_disabled() helper (Thorsten Blum) - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven) - Add task_prctl_unknown tracepoint (Marco Elver) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hR6QAKCRA2KwveOeQk uyYrAP90cNcedNxKCIC/XfIEyS5bWqgAcEcOdLwsPQ8X130M7wEAwadkKaO7PwrF 8T3ynXxUd4z5OyuXjKQvfvPAgaxhbg4= =OoiS -----END PGP SIGNATURE----- Merge tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - stackleak: Use str_enabled_disabled() helper (Thorsten Blum) - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven) - Add task_prctl_unknown tracepoint (Marco Elver) * tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC stackleak: Use str_enabled_disabled() helper in stack_erasing_sysctl() tracing: Remove pid in task_rename tracing output tracing: Add task_prctl_unknown tracepoint	2025-01-22 20:29:53 -08:00
Linus Torvalds	ad2aec7c96	Small changes for improving usability. Tetsuo Handa (3): tomoyo: automatically use patterns for several situations in learning mode tomoyo: use realpath if symlink's pathname refers to procfs tomoyo: don't emit warning in tomoyo_write_control() security/tomoyo/common.c \| 32 +++++++++++++++++++++++++++++++- security/tomoyo/domain.c \| 11 +++++++++-- 2 files changed, 40 insertions(+), 3 deletions(-) -----BEGIN PGP SIGNATURE----- iQJXBAABCABBFiEEQ8gzaWI9etOpbC/HQl8SjQxk9SoFAmeRdEUjHHBlbmd1aW4t a2VybmVsQGktbG92ZS5zYWt1cmEubmUuanAACgkQQl8SjQxk9SqbCRAAg0aZ5pu/ p9uJhsZFzBZzTE8BcFRdMWKS+t5+RGE1KqHoNc/YsP73eb9ui/N1jckJjsYCFh4m NsqWvadVMtma4qTcAw5Um48kZ898QVoq9kFBpxCklRnH0eGJsn2kNjNmgywYKK9q TOSKTzJM2DfOEjrSdepvthp/FK76nbHdrKREBYsslPcSLFx5c5qBqx60l6p8hS0A 1+zKQM8m9H2nZ7KnuCr0e4Lhiq1W6C/BaIN/+NzHxbPVXl55nE1i4OXM3WSTC92H i6Fe+eKv5B9B5zaQwWbAGhfONcL8pkmgwTvPFtZLj7y5t5HW8TxmPMFZ/jE5E1Kt AFFLWvXIlDZ5aYU4pbNUAA4RCNm2hWw1vniPgLPiJte2ft2AdSxBHO9J6w/W4xs9 W44TvsFXDfJgKsoY4Hy5HR3VxleIFA5tvIyWp1ONp7rZ+4WKolTQYW0hEz26V30K +urTZFusv3+229z3ZM8qsPUbL/WLML1jRw8E3AtnBMRJ92c8tpFaBA8b+o829zz8 QAFWeca3psgPqxHPZsnNeM8lOLOrtBik3+CNpmXJLq6KL2BrI9eRDUntBwHbBoRu /hAJX0feI/4yoixzr5sOk5UMNlp+G5CDBQb9NkTynUnr3Q1KRcgKWC4f8cAQlTb2 6OfT8xw2AID0/XXxJVLOdDRTPnYEVU3VH9c= =wawL -----END PGP SIGNATURE----- Merge tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo Pull tomoyo updates from Tetsuo Handa: "Small changes to improve usability" * tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: automatically use patterns for several situations in learning mode tomoyo: use realpath if symlink's pathname refers to procfs tomoyo: don't emit warning in tomoyo_write_control()	2025-01-22 20:25:00 -08:00
Linus Torvalds	de5817bbfb	Landlock updates for v6.14-rc1 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ5EMhBAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSMv0BAMOG2TFwq+UhbtxtL6pM7qzxfdWg6GR/t4t8 MFasAcCaAQDtTnW0HymHge8k7JFgWHHp0JBu7V7dhFrdJoS+718aDA== =1Hfr -----END PGP SIGNATURE----- Merge tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mostly factors out some Landlock code and prepares for upcoming audit support. Because files with invalid modes might be visible after filesystem corruption, Landlock now handles those weird files too. A few sample and test issues are also fixed" * tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add layout1.umount_sandboxer tests selftests/landlock: Add wrappers.h selftests/landlock: Fix error message landlock: Optimize file path walks and prepare for audit support selftests/landlock: Add test to check partial access in a mount tree landlock: Align partial refer access checks with final ones landlock: Simplify initially denied access rights landlock: Move access types landlock: Factor out check_access_path() selftests/landlock: Fix build with non-default pthread linking landlock: Use scoped guards for ruleset in landlock_add_rule() landlock: Use scoped guards for ruleset landlock: Constify get_mode_access() landlock: Handle weird files samples/landlock: Fix possible NULL dereference in parse_path() selftests/landlock: Remove unused macros in ptrace_test.c	2025-01-22 20:20:55 -08:00
Linus Torvalds	9cb2bf599b	Hi, Here's the keys changes for 6.14-rc1. BR, Jarkko -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZ49r8AAKCRAaerohdGur 0qiFAP9Av1djr8/HA+VidLPOe5neBkRYSuX54yNz3TGBOpjmvwD+L2YJfJYcBAd+ sku1MLkpLbMmBCLSTNhqjSJWxx4vngg= =iF42 -----END PGP SIGNATURE----- Merge tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull keys updates from Jarkko Sakkinen. Avoid using stack addresses for sg lists. And a cleanup. * tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y keys: drop shadowing dead prototype	2025-01-22 19:47:17 -08:00
Linus Torvalds	690ffcd817	selinux/stable-6.14 PR 20250121 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQE9YUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXO66BAAmmhqw3Cm94u6QOjYTQCNrHpmZVpv atPfW0EIlvYWuLGvyQZVh/2SKrYm0o7iNlTlOyMcBV9BfFUZd6vepX3L+ylhMacG L2lg5jgl11zUZdo8m++38kCABbdMexhTzgtfdAm+w2RmLRoOzXjBOCDx18sWVtCy aV3DQAvl6qdU/Y5U6PccmOCwgFVQEmWzQ4A1CMq696Fybr4EzTjI8mCCnotHWarz cgfMHHf5RYR4M4VdmxWo3MR6y6Qiq19/Vsy43YP/G/A0Ad+mfLqhHmc27+Mx2bDk IfdrvOOjaVxiEeIJe6mOePcW9p1D9q4OrPmBZHxN1+R3ck7k0MgVLIDvSzLDMbbj 3PSZx2UFk1xz+B0x3hvzhAXJ5YfbAjPj1Z65HlLIQBFBo5jvLWMrLxdpcH4eRdhT ovTqFuB4wQwYOeeKlXlnCWFitsAynjo9qGxcqjxG63geJfnBlsnLoYIa0g+cN6Uf 3Ty0+zeHDfCajj40buvtOWv98CyAMF5vBopnr18Kfo4upp6pgVERVBoGy2Yw020I yItiRhi1fpV31J8Gxrd7WA2/OmZZLISnAJtKMSsyd+hBihfOjVZ9LhKKYJk4vH+X mWVOdplHpCDe6y2EJE4EmaNwQCOVJfg4/Xvh9fghELdBdc91wFGPDO36AitDBNr8 /o13aUvFarsEmtA= =UJsr -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Extended permissions supported in conditional policy The SELinux extended permissions, aka "xperms", allow security admins to target individuals ioctls, and recently netlink messages, with their SELinux policy. Adding support for conditional policies allows admins to toggle the granular xperms using SELinux booleans, helping pave the way for greater use of xperms in general purpose SELinux policies. This change bumps the maximum SELinux policy version to 34. - Fix a SCTP/SELinux error return code inconsistency Depending on the loaded SELinux policy, specifically it's EXTSOCKCLASS support, the bind(2) LSM/SELinux hook could return different error codes due to the SELinux code checking the socket's SELinux object class (which can vary depending on EXTSOCKCLASS) and not the socket's sk_protocol field. We fix this by doing the obvious, and looking at the sock->sk_protocol field instead of the object class. - Makefile fixes to properly cleanup av_permissions.h Add av_permissions.h to "targets" so that it is properly cleaned up using the kbuild infrastructure. - A number of smaller improvements by Christian Göttsche A variety of straightforward changes to reduce code duplication, reduce pointer lookups, migrate void pointers to defined types, simplify code, constify function parameters, and correct iterator types. * tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: make more use of str_read() when loading the policy selinux: avoid unnecessary indirection in struct level_datum selinux: use known type instead of void pointer selinux: rename comparison functions for clarity selinux: rework match_ipv6_addrmask() selinux: constify and reconcile function parameter names selinux: avoid using types indicating user space interaction selinux: supply missing field initializers selinux: add netlink nlmsg_type audit message selinux: add support for xperms in conditional policies selinux: Fix SCTP error inconsistency in selinux_socket_bind() selinux: use native iterator types selinux: add generated av_permissions.h to targets	2025-01-21 20:09:14 -08:00
Linus Torvalds	f96a974170	lsm/stable-6.14 PR 20250121 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQFBoUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvcA//XCdwMz0bGtWKv58nuyP8vkQx08n6 //olz/O8te3uWK5O3kRiarzFLwH8qsHQ6A7GYalwwix34hatR4ndJE0Y/guVRWa1 +aBmJxJ7Jm/q3fvpAEfqiSgreuE6kBoztlDOWEq+hUQGu4qfnQGm2EnvbvfFrAmN VheOfIQSU2KCL/Scc3FGnF6uru4WrqN0JJ9RbvrEpfdQgmcyTGLnQsZLljutWSIq kDWkteIr7cj3O9J45zpxZsTftvYSgVn/y1iKeXbHI4DBA1eheK12vsHB9AADKI1J GwHxOrnLpZtv+ICUKqcfFTmWTl+NmfJJurAT5KXKdBjL3xM5MoJlBvK1A5qE9CMo LaHVG/TZR2MmBaoM3EN+gvWhDgWlvT02Q/0cYaafTlVLMez3HtfctxN6OnCvTXTB Y8dqYClhhlBm/mHQwYfMoeKw4MftUpzEqBd1Nj7Qe8dbP0f/62Ca3K2B3D6Rf8QV pj3ryMlSWYV9mdTerruLNQexTGoN7l66jPwzdWpTbFeL3WmNtfCako8OZGbXgPIu Iahm3P+jnSVx8ZQro2c9zwdKXI5xiI335pCBbDZ8aX+JAsfj0OofHsFx5Q5diber M7tAEhxDqRisbpz7Ei+/LOAEGg2Z619XKg8ks4z6Y4P5PF7zEgeWTkZJk2iLbxXe 6LLOjmF7LLw+G4M= =fgyr -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improved handling of LSM "secctx" strings through lsm_context struct The LSM secctx string interface is from an older time when only one LSM was supported, migrate over to the lsm_context struct to better support the different LSMs we now have and make it easier to support new LSMs in the future. These changes explain the Rust, VFS, and networking changes in the diffstat. - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are enabled Small tweak to be a bit smarter about when we build the LSM's common audit helpers. - Check for absurdly large policies from userspace in SafeSetID SafeSetID policies rules are fairly small, basically just "UID:UID", it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which helps quiet a number of syzbot related issues. While work is being done to address the syzbot issues through other mechanisms, this is a trivial and relatively safe fix that we can do now. - Various minor improvements and cleanups A collection of improvements to the kernel selftests, constification of some function parameters, removing redundant assignments, and local variable renames to improve readability. * tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lockdown: initialize local array before use to quiet static analysis safesetid: check size of policy writes net: corrections for security_secid_to_secctx returns lsm: rename variable to avoid shadowing lsm: constify function parameters security: remove redundant assignment to return variable lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test binder: initialize lsm_context structure rust: replace lsm context+len with lsm_context lsm: secctx provider check on release lsm: lsm_context in security_dentry_init_security lsm: use lsm_context in security_inode_getsecctx lsm: replace context+len with lsm_context lsm: ensure the correct LSM context releaser	2025-01-21 20:03:04 -08:00
Linus Torvalds	678ca9f78e	One minor code improvement for v6.14 -----BEGIN PGP SIGNATURE----- iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmeQGOEXHGNhc2V5QHNj aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGbYA//RwccdJJXj5r3gUjfEb1p58hV z2D2cbJ6P23+e+mTa+AF/EqDQyRY1SizVnV/yV+ztfkXQ5DM0MLIMziWnenmwCeQ FTFp9D+/3q9J26+SdYV4JN9I/dVYeMVjIzEIQr82lEoiAWhPgNeJ8CKlme14wpUb jEqtFdAkNVwTxVi3OjMWs6liLBmkaskx2m9AXorsA7MXS6msuWhFmzuElJCt8KID DPUgEx9ny90SlCY3+FDz922v6cw4b4AU/z6Bf8ZBbf3cr9VcDOnq/W9D1NeOCfqb NXBHDIbjCNmQgR2vcfpDPW1tDk36xJdSOfIzWSlg8ZpMGWBe8Xnla5rpTTH4zaVq shMhLxzV1VTQoyrwsyvUbC7g7CssEBh7v+rIhdA0h59cBniMKanGTffAXEBzgXnl 39kMicu7jX8vLn6HRgmOUrSdr4KgXKzhuW4GfCe/5dBh+2RnnUTn+iUVdj4xts/X SwODT0XT5FOofVh2hZYZMZK3MnFa/P401vOENMH3OiY3hYHBJdrO3JtPz9n+vxUF bA4rhwvSuBTsLjyJC/8qTLU3sgfCIHl9pBGVlMeRb2yt+7+EAJ6WDEr3LBdIrh9P 6obhtG3zbQsplpVLwh7MbxuPcZT3u2pBX7ME9zfNWMKltwKKSPho/Bt72G8w2YNi VApxTfc6HKbWzJoCk04= =tXff -----END PGP SIGNATURE----- Merge tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next Pull smack update from Casey Schaufler: "One minor code improvement for v6.14" * tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next: smack: deduplicate access to string conversion	2025-01-21 19:59:46 -08:00
Linus Torvalds	0ca0cf9f8c	integrity-v6.14 -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ4rmvxQcem9oYXJAbGlu dXguaWJtLmNvbQAKCRDLwZzRsCrn5avvAP4tzjNdVp3tFeq9bA8gQZEJ74E6q/6a Qb18xTn54hxGXAEAuotzwJiandLBk/3hkHIE0BbyzmULiVMpos4qVuUmTwI= =ZvaD -----END PGP SIGNATURE----- Merge tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "There's just a couple of changes: two kernel messages addressed, a measurement policy collision addressed, and one policy cleanup. Please note that the contents of the IMA measurement list is potentially affected. The builtin tmpfs IMA policy rule change might introduce additional measurements, while detecting a reboot might eliminate some measurements" * tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ignore suffixed policy rule comments ima: limit the builtin 'tcb' dont_measure tmpfs policy rule ima: kexec: silence RCU list traversal warning ima: Suspend PCR extends and log appends when rebooting	2025-01-21 19:54:32 -08:00
David Gstir	e8d9fab39d	KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y With vmalloc stack addresses enabled (CONFIG_VMAP_STACK=y) DCP trusted keys can crash during en- and decryption of the blob encryption key via the DCP crypto driver. This is caused by improperly using sg_init_one() with vmalloc'd stack buffers (plain_key_blob). Fix this by always using kmalloc() for buffers we give to the DCP crypto driver. Cc: stable@vger.kernel.org # v6.10+ Fixes: `0e28bf61a5` ("KEYS: trusted: dcp: fix leak of blob encryption key") Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2025-01-21 11:25:23 +02:00
Linus Torvalds	4b84a4c8d4	vfs-6.14-rc1.misc -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRjQAKCRCRxhvAZXjc omUyAP9k31Qr7RY1zNtmpPfejqc+3Xx+xXD7NwHr+tONWtUQiQEA/F94qU2U3ivS AzyDABWrEQ5ZNsm+Rq2Y3zyoH7of3ww= =s3Bu -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Support caching symlink lengths in inodes The size is stored in a new union utilizing the same space as i_devices, thus avoiding growing the struct or taking up any more space When utilized it dodges strlen() in vfs_readlink(), giving about 1.5% speed up when issuing readlink on /initrd.img on ext4 - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag If a file system supports uncached buffered IO, it may set FOP_DONTCACHE and enable support for RWF_DONTCACHE. If RWF_DONTCACHE is attempted without the file system supporting it, it'll get errored with -EOPNOTSUPP - Enable VBOXGUEST and VBOXSF_FS on ARM64 Now that VirtualBox is able to run as a host on arm64 (e.g. the Apple M3 processors) we can enable VBOXSF_FS (and in turn VBOXGUEST) for this architecture. Tested with various runs of bonnie++ and dbench on an Apple MacBook Pro with the latest Virtualbox 7.1.4 r165100 installed Cleanups: - Delay sysctl_nr_open check in expand_files() - Use kernel-doc includes in fiemap docbook - Use page->private instead of page->index in watch_queue - Use a consume fence in mnt_idmap() as it's heavily used in link_path_walk() - Replace magic number 7 with ARRAY_SIZE() in fc_log - Sort out a stale comment about races between fd alloc and dup2() - Fix return type of do_mount() from long to int - Various cosmetic cleanups for the lockref code Fixes: - Annotate spinning as unlikely() in __read_seqcount_begin The annotation already used to be there, but got lost in commit `52ac39e5db` ("seqlock: seqcount_t: Implement all read APIs as statement expressions") - Fix proc_handler for sysctl_nr_open - Flush delayed work in delayed fput() - Fix grammar and spelling in propagate_umount() - Fix ESP not readable during coredump In /proc/PID/stat, there is the kstkesp field which is the stack pointer of a thread. While the thread is active, this field reads zero. But during a coredump, it should have a valid value However, at the moment, kstkesp is zero even during coredump - Don't wake up the writer if the pipe is still full - Fix unbalanced user_access_end() in select code" * tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) gfs2: use lockref_init for qd_lockref erofs: use lockref_init for pcl->lockref dcache: use lockref_init for d_lockref lockref: add a lockref_init helper lockref: drop superfluous externs lockref: use bool for false/true returns lockref: improve the lockref_get_not_zero description lockref: remove lockref_put_not_zero fs: Fix return type of do_mount() from long to int select: Fix unbalanced user_access_end() vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64 pipe_read: don't wake up the writer if the pipe is still full selftests: coredump: Add stackdump test fs/proc: do_task_stat: Fix ESP not readable during coredump fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag fs: sort out a stale comment about races between fd alloc and dup2 fs: Fix grammar and spelling in propagate_umount() fs: fc_log replace magic number 7 with ARRAY_SIZE() fs: use a consume fence in mnt_idmap() file: flush delayed work in delayed fput() ...	2025-01-20 09:40:49 -08:00
John Johansen	e6b0876769	apparmor: fix dbus permission queries to v9 ABI dbus permission queries need to be synced with fine grained unix mediation to avoid potential policy regressions. To ensure that dbus queries don't result in a case where fine grained unix mediation is not being applied but dbus mediation is check the loaded policy support ABI and abort the query if policy doesn't support the v9 ABI. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:13 -08:00
John Johansen	dcd7a55941	apparmor: gate make fine grained unix mediation behind v9 abi Fine grained unix mediation in Ubuntu used ABI v7, and policy using this has propogated onto systems where fine grained unix mediation was not supported. The userspace policy compiler supports downgrading policy so the policy could be shared without changes. Unfortunately this had the side effect that policy was not updated for the none Ubuntu systems and enabling fine grained unix mediation on those systems means that a new kernel can break a system with existing policy that worked with the previous kernel. With fine grained af_unix mediation this regression can easily break the system causing boot to fail, as it affect unix socket files, non-file based unix sockets, and dbus communication. To aoid this regression move fine grained af_unix mediation behind a new abi. This means that the system's userspace and policy must be updated to support the new policy before it takes affect and dropping a new kernel on existing system will not result in a regression. The abi bump is done in such a way as existing policy can be activated on the system by changing the policy abi declaration and existing unix policy rules will apply. Policy then only needs to be incrementally updated, can even be backported to existing Ubuntu policy. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:13 -08:00
John Johansen	c05e705812	apparmor: add fine grained af_unix mediation Extend af_unix mediation to support fine grained controls based on the type (abstract, anonymous, fs), the address, and the labeling on the socket. This allows for using socket addresses to label and the socket and control which subjects can communicate. The unix rule format follows standard apparmor rules except that fs based unix sockets can be mediated by existing file rules. None fs unix sockets can be mediated by a unix socket rule. Where The address of an abstract unix domain socket begins with the @ character, similar to how they are reported (as paths) by netstat -x. The address then follows and may contain pattern matching and any characters including the null character. In apparmor null characters must be specified by using an escape sequence \000 or \x00. The pattern matching is the same as is used by file path matching so * will not match / even though it has no special meaning with in an abstract socket name. Eg. allow unix addr=@*, Autobound unix domain sockets have a unix sun_path assigned to them by the kernel, as such specifying a policy based address is not possible. The autobinding of sockets can be controlled by specifying the special auto keyword. Eg. allow unix addr=auto, To indicate that the rule only applies to auto binding of unix domain sockets. It is important to note this only applies to the bind permission as once the socket is bound to an address it is indistinguishable from a socket that have an addr bound with a specified name. When the auto keyword is used with other permissions or as part of a peer addr it will be replaced with a pattern that can match an autobound socket. Eg. For some kernels allow unix rw addr=auto, It is important to note, this pattern may match abstract sockets that were not autobound but have an addr that fits what is generated by the kernel when autobinding a socket. Anonymous unix domain sockets have no sun_path associated with the socket address, however it can be specified with the special none keyword to indicate the rule only applies to anonymous unix domain sockets. Eg. allow unix addr=none, If the address component of a rule is not specified then the rule applies to autobind, abstract and anonymous sockets. The label on the socket can be compared using the standard label= rule conditional. Eg. allow unix addr=@foo peer=(label=bar), see man apparmor.d for full syntax description. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	b4940d913c	apparmor: in preparation for finer networking rules rework match_prot Rework match_prot into a common fn that can be shared by all the networking rules. This will provide compatibility with current socket mediation, via the early bailout permission encoding. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	6cc6a0523d	apparmor: lift kernel socket check out of critical section There is no need for the kern check to be in the critical section, it only complicates the code and slows down the case where the socket is being created by the kernel. Lifting it out will also allow socket_create to share common template code, with other socket_permission checks. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	9045aa25d1	apparmor: remove af_select macro The af_select macro just adds a layer of unnecessary abstraction that makes following what the code is doing harder. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	ce9e3b3fa2	apparmor: add ability to mediate caps with policy state machine Currently the caps encoding is very limited and can't be used with conditionals. Allow capabilities to be mediated by the state machine. This will allow us to add conditionals to capabilities that aren't possible with the current encoding. This patch only adds support for using the state machine and retains the old encoding lookup as part of the runtime mediation code to support older policy abis. A follow on patch will move backwards compatibility to a mapping function done at policy load time. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	a9eb185be8	apparmor: fix x_table_lookup when stacking is not the first entry x_table_lookup currently does stacking during label_parse() if the target specifies a stack but its only caller ensures that it will never be used with stacking. Refactor to slightly simplify the code in x_to_label(), this also fixes a long standing problem where x_to_labels check on stacking is only on the first element to the table option list, instead of the element that is found and used. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	84c455decf	apparmor: add support for profiles to define the kill signal Previously apparmor has only sent SIGKILL but there are cases where it can be useful to send a different signal. Allow the profile to optionally specify a different value. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	2e12c5f060	apparmor: add additional flags to extended permission. This is a step towards merging the file and policy state machines. With the switch to extended permissions the state machine's ACCEPT2 table became unused freeing it up to store state specific flags. The first flags to be stored are FLAG_OWNER and FLAG other which paves the way towards merging the file and policydb perms into a single permission table. Currently Lookups based on the objects ownership conditional will still need separate fns, this will be address in a following patch. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	de4754c801	apparmor: carry mediation check on label In order to speed up the mediated check, precompute and store the result as a bit per class type. This will not only allow us to speed up the mediation check but is also a step to removing the unconfined special cases as the unconfined check can be replaced with the generic label_mediates() check. Note: label check does not currently work for capabilities and resources which need to have their mediation updated first. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	34d31f2338	apparmor: cleanup: refactor file_perm() to doc semantics of some checks Provide semantics, via fn names, for some checks being done in file_perm(). This is a preparatory patch for improvements to both permission caching and delegation, where the check will become more involved. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	35fad5b462	apparmor: remove explicit restriction that unconfined cannot use change_hat There does not need to be an explicit restriction that unconfined can't use change_hat. Traditionally unconfined doesn't have hats so change_hat could not be used. But newer unconfined profiles have the potential of having hats, and even system unconfined will be able to be replaced with a profile that allows for hats. To remain backwards compitible with expected return codes, continue to return -EPERM if the unconfined profile does not have any hats. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	cd769b05cc	apparmor: ensure labels with more than one entry have correct flags labels containing more than one entry need to accumulate flag info from profiles that the label is constructed from. This is done correctly for labels created by a merge but is not being done for labels created by an update or directly created via a parse. This technically is a bug fix, however the effect in current code is to cause early unconfined bail out to not happen (ie. without the fix it is slower) on labels that were created via update or a parse. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	0bc8c6862f	apparmor: switch signal mediation to use RULE_MEDIATES Currently signal mediation is using a hard coded form of the RULE_MEDIATES check. This hides the intended semantics, and means this specific check won't pickup any changes or improvements made in the RULE_MEDIATES check. Switch to using RULE_MEDIATES(). Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	46b9b994dd	apparmor: remove redundant unconfined check. profile_af_perm and profile_af_sk_perm are only ever called after checking that the profile is not unconfined. So we can drop these redundant checks. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:12 -08:00
John Johansen	280799f724	apparmor: cleanup: attachment perm lookup to use lookup_perms() Remove another case of code duplications. Switch to using the generic routine instead of the current custom checks. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:11 -08:00
John Johansen	71e6cff3e0	apparmor: Improve debug print infrastructure Make it so apparmor debug output can be controlled by class flags as well as the debug flag on labels. This provides much finer control at what is being output so apparmor doesn't flood the logs with information that is not needed, making it hard to find what is important. Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:11 -08:00
Thorsten Blum	c602537de3	apparmor: Use str_yes_no() helper function Remove hard-coded strings by using the str_yes_no() helper function. Fix a typo in a comment: s/unpritable/unprintable/ Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: John Johansen <john.johansen@canonical.com>	2025-01-18 06:47:11 -08:00

1 2 3 4 5 ...

6543 Commits