Commit Graph

396 Commits

Eric Biggers
00d549bb89 lib/crypto: arm64/sha1: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
arm64-optimized, and it fixes the longstanding issue where the
arm64-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization.  (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)

To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t.  Update the
assembly code accordingly.
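
For illustration, a rough sketch of how an arch-provided sha1_blocks()
could dispatch between the CE assembly and the generic fallback; the
helper names, static key, and state struct below are assumptions, not
the exact code added here:

  #include <linux/jump_label.h>
  #include <linux/linkage.h>
  #include <crypto/sha1.h>
  #include <asm/neon.h>
  #include <asm/simd.h>

  struct sha1_block_state;

  static DEFINE_STATIC_KEY_FALSE(have_ce);

  /* Assumed prototypes for the assembly and generic block functions. */
  asmlinkage size_t __sha1_ce_transform(struct sha1_block_state *state,
                                        const u8 *data, size_t nblocks);
  void sha1_blocks_generic(struct sha1_block_state *state,
                           const u8 *data, size_t nblocks);

  static void sha1_blocks(struct sha1_block_state *state,
                          const u8 *data, size_t nblocks)
  {
          if (static_branch_likely(&have_ce) && likely(may_use_simd())) {
                  do {
                          size_t rem;

                          kernel_neon_begin();
                          /* returns the number of blocks still to do */
                          rem = __sha1_ce_transform(state, data, nblocks);
                          kernel_neon_end();
                          data += (nblocks - rem) * SHA1_BLOCK_SIZE;
                          nblocks = rem;
                  } while (nblocks);
          } else {
                  sha1_blocks_generic(state, data, nblocks);
          }
  }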

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:48 -07:00
Eric Biggers
60e3f1e9b7 lib/crypto: arm64/sha512: Migrate optimized SHA-512 code to library
Instead of exposing the arm64-optimized SHA-512 code via arm64-specific
crypto_shash algorithms, just implement the sha512_blocks() library
function.  This is much simpler, it makes the SHA-512 (and SHA-384)
library functions arm64-optimized, and it fixes the longstanding issue
where the arm64-optimized SHA-512 code was disabled by default.
SHA-512 remains available through crypto_shash, but individual
architectures no longer need to handle it.

To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int or 'unsigned int' to size_t.  Update the
ARMv8 CE assembly function accordingly.  The scalar assembly function
actually already treated it as size_t.
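
As a hedged illustration of what this enables, a caller of the SHA-512
library one-shot helper (sha512(), declared in <crypto/sha2.h> by this
series) now gets the arm64-optimized block function transparently; the
exact signature is assumed here:

  #include <crypto/sha2.h>

  static void digest_buffer(const u8 *data, size_t len,
                            u8 out[SHA512_DIGEST_SIZE])
  {
          /* One-shot library call; no crypto_shash setup needed. */
          sha512(data, len, out);
  }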

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:19 -07:00
Eric Biggers
e0fca17755 crypto: sha512 - Rename conflicting symbols
Rename existing functions and structs in architecture-optimized SHA-512
code that had names conflicting with the upcoming library interface
which will be added to <crypto/sha2.h>: sha384_init, sha512_init,
sha512_update, sha384, and sha512.

Note: all affected code will be superseded by later commits that migrate
the arch-optimized SHA-512 code into the library.  This commit simply
keeps the kernel building for the initial introduction of the library.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:19 -07:00
Herbert Xu
adcb9e32e5 crypto: arm64/sha256 - Add simd block function
Add CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD and a SIMD block function
so that the caller can decide whether to use SIMD.
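
A minimal sketch of the idea, with assumed function names: the caller
picks the SIMD block function only when SIMD is usable in the current
context, and the scalar-safe arch function otherwise.

  #include <crypto/internal/simd.h>
  #include <linux/types.h>

  /* Assumed prototypes for the two arch-provided block functions. */
  void sha256_blocks_arch(u32 state[8], const u8 *data, size_t nblocks);
  void sha256_blocks_simd(u32 state[8], const u8 *data, size_t nblocks);

  static void sha256_choose_blocks(u32 state[8], const u8 *data,
                                   size_t nblocks)
  {
          if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD) &&
              crypto_simd_usable())
                  sha256_blocks_simd(state, data, nblocks);
          else
                  sha256_blocks_arch(state, data, nblocks);
  }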

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:45 +08:00
Eric Biggers
6e36be511d crypto: arm64/sha256 - implement library instead of shash
Instead of providing crypto_shash algorithms for the arch-optimized
SHA-256 code, just implement the SHA-256 library.  This is much
simpler, it makes the SHA-256 library functions arch-optimized, and it
fixes the longstanding issue where the arch-optimized SHA-256 code was
disabled by default.  SHA-256 remains available through crypto_shash,
but individual architectures no longer need to handle it.

Remove support for SHA-256 finalization from the ARMv8 CE assembly code,
since the library does not yet support architecture-specific overrides
of the finalization.  (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)

To match sha256_blocks_arch(), change the type of the nblocks parameter
of the assembly functions from int or 'unsigned int' to size_t.  Update
the ARMv8 CE assembly function accordingly.  The scalar and NEON
assembly functions actually already treated it as size_t.

While renaming the assembly files, also fix the naming quirks where
"sha2" meant sha256, and "sha512" meant both sha256 and sha512.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:43 +08:00
Eric Biggers
642cfc0680 crypto: arm64/sha256 - remove obsolete chunking logic
Since kernel-mode NEON sections are now preemptible on arm64, there is
no longer any need to limit their length.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:43 +08:00
Herbert Xu
d5a582a782 crypto: arm64/polyval - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28 19:40:54 +08:00
Eric Biggers
cc16e228a2 crypto: arm64 - move library functions to arch/arm64/lib/crypto/
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the arm64 ChaCha and Poly1305 library
functions into a new directory arch/arm64/lib/crypto/ that does not
depend on CRYPTO.  This mirrors the distinction between crypto/ and
lib/crypto/.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28 19:40:53 +08:00
Eric Biggers
e2df5fb770 crypto: arm64 - drop redundant dependencies on ARM64
arch/arm64/crypto/Kconfig is sourced only when CONFIG_ARM64=y, so there
is no need for the symbols defined inside it to depend on ARM64.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28 19:40:53 +08:00
Herbert Xu
432f98cf56 crypto: arm64/sha1 - Set finalize for short finup
Always set sctx->finalize before calling finup, as it may not have
been set by a previous short final.

Reported-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Fixes: b97d31100e ("crypto: arm64/sha1 - Use API partial block handling")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-26 19:12:24 +08:00
Herbert Xu
ef17008481 crypto: arm64/sm4 - Use API partial block handling
Use the Crypto API partial block handling.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
4dc3c40c4d crypto: arm64/aes - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
045f17b444 crypto: arm64/sm3-neon - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
6ba8c5f5a4 crypto: arm64/sm3-ce - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
2d0d18d801 crypto: arm/sha512 - Use API partial block handling
Use the Crypto API partial block handling.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
f294a6d9e9 crypto: arm64/sha512-ce - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:46 +08:00
Herbert Xu
b333c273ab crypto: arm64/sha3-ce - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:46 +08:00
Herbert Xu
a417f16f88 crypto: arm64/sha256 - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:46 +08:00
Herbert Xu
be32039547 crypto: arm64/sha256-ce - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:46 +08:00
Herbert Xu
b97d31100e crypto: arm64/sha1 - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 11:33:47 +08:00
Herbert Xu
9a7c987fb9 crypto: arm64/ghash - Use API partial block handling
Use the Crypto API partial block handling.

Also remove the unnecessary SIMD fallback path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 11:33:47 +08:00
Eric Biggers
bb9c648b33 crypto: lib/poly1305 - restore ability to remove modules
Though the module_exit functions are now no-ops, they should still be
defined, since otherwise the modules become unremovable.
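
A minimal sketch of the pattern (function name illustrative): even an
empty module_exit handler is enough to keep the module removable.

  #include <linux/module.h>

  static void __exit poly1305_mod_exit(void)
  {
          /* Nothing left to unregister, but rmmod needs this to exist. */
  }
  module_exit(poly1305_mod_exit);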

Fixes: 1f81c58279 ("crypto: arm/poly1305 - remove redundant shash algorithm")
Fixes: f4b1a73aec ("crypto: arm64/poly1305 - remove redundant shash algorithm")
Fixes: 378a337ab4 ("crypto: powerpc/poly1305 - implement library instead of shash")
Fixes: 21969da642 ("crypto: x86/poly1305 - remove redundant shash algorithm")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-19 11:18:28 +08:00
Eric Biggers
8821d26926 crypto: lib/chacha - restore ability to remove modules
Though the module_exit functions are now no-ops, they should still be
defined, since otherwise the modules become unremovable.

Fixes: 08820553f3 ("crypto: arm/chacha - remove the redundant skcipher algorithms")
Fixes: 8c28abede1 ("crypto: arm64/chacha - remove the skcipher algorithms")
Fixes: f7915484c0 ("crypto: powerpc/chacha - remove the skcipher algorithms")
Fixes: ceba0eda83 ("crypto: riscv/chacha - implement library instead of skcipher")
Fixes: 632ab0978f ("crypto: x86/chacha - remove the skcipher algorithms")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-19 11:18:28 +08:00
Eric Biggers
f4b1a73aec crypto: arm64/poly1305 - remove redundant shash algorithm
Since crypto/poly1305.c now registers a poly1305-$(ARCH) shash algorithm
that uses the architecture's Poly1305 library functions, individual
architectures no longer need to do the same.  Therefore, remove the
redundant shash algorithm from the arch-specific code and leave just the
library functions there.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:25 +08:00
Eric Biggers
ecaa4be128 crypto: poly1305 - centralize the shash wrappers for arch code
Following the example of the crc32, crc32c, and chacha code, make the
crypto subsystem register both generic and architecture-optimized
poly1305 shash algorithms, both implemented on top of the appropriate
library functions.  This eliminates the need for every architecture to
implement the same shash glue code.

Note that the poly1305 shash requires that the key be prepended to the
data, which differs from the library functions where the key is simply a
parameter to poly1305_init().  Previously this was handled at a fairly
low level, polluting the library code with shash-specific code.
Reorganize things so that the shash code handles this quirk itself.

Also, to register the architecture-optimized shashes only when
architecture-optimized code is actually being used, add a function
poly1305_is_arch_optimized() and make each arch implement it.  Change
each architecture's Poly1305 module_init function to arch_initcall so
that the CPU feature detection is guaranteed to run before
poly1305_is_arch_optimized() gets called by crypto/poly1305.c.  (In
cases where poly1305_is_arch_optimized() just returns true
unconditionally, using arch_initcall is not strictly needed, but it's
still good to be consistent across architectures.)
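
A rough, arm64-flavored sketch of this pattern; treat the static key,
feature check, and function names as assumptions rather than the actual
arch code:

  #include <linux/init.h>
  #include <linux/jump_label.h>
  #include <linux/types.h>
  #include <asm/cpufeature.h>

  static DEFINE_STATIC_KEY_FALSE(have_neon);

  bool poly1305_is_arch_optimized(void)
  {
          /* The arm64 scalar Poly1305 code is always usable. */
          return true;
  }

  static int __init poly1305_arch_mod_init(void)
  {
          if (cpu_have_named_feature(ASIMD))
                  static_branch_enable(&have_neon);
          return 0;
  }
  /* arch_initcall so feature detection runs before crypto/poly1305.c
   * calls poly1305_is_arch_optimized(). */
  arch_initcall(poly1305_arch_mod_init);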

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:24 +08:00
Herbert Xu
0a1376744c crypto: cbcmac - Set block size properly
The block size of a hash algorithm is meant to be the number of
bytes its block function can handle.  For cbcmac that should be
the block size of the underlying block cipher instead of one.

Set the block size of all cbcmac implementations accordingly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:24 +08:00
Herbert Xu
f4065b2f63 crypto: lib/sm3 - Move sm3 library into lib/crypto
Move the sm3 library code into lib/crypto.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:24 +08:00
Herbert Xu
16aeed07c0 crypto: arm64/sha512 - Fix header inclusions
Instead of relying on linux/module.h being included through the
header file sha512_base.h, include it directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:24 +08:00
Eric Biggers
8c28abede1 crypto: arm64/chacha - remove the skcipher algorithms
Since crypto/chacha.c now registers chacha20-$(ARCH), xchacha20-$(ARCH),
and xchacha12-$(ARCH) skcipher algorithms that use the architecture's
ChaCha and HChaCha library functions, individual architectures no longer
need to do the same.  Therefore, remove the redundant skcipher
algorithms and leave just the library functions.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:28 +08:00
Eric Biggers
4aa6dc909e crypto: chacha - centralize the skcipher wrappers for arch code
Following the example of the crc32 and crc32c code, make the crypto
subsystem register both generic and architecture-optimized chacha20,
xchacha20, and xchacha12 skcipher algorithms, all implemented on top of
the appropriate library functions.  This eliminates the need for every
architecture to implement the same skcipher glue code.

To register the architecture-optimized skciphers only when
architecture-optimized code is actually being used, add a function
chacha_is_arch_optimized() and make each arch implement it.  Change each
architecture's ChaCha module_init function to arch_initcall so that the
CPU feature detection is guaranteed to run before
chacha_is_arch_optimized() gets called by crypto/chacha.c.  In the case
of s390, remove the CPU feature based module autoloading, which is no
longer needed since the module just gets pulled in via function linkage.
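
A short sketch of the corresponding query hook, with an assumed static
key name:

  #include <linux/jump_label.h>
  #include <linux/types.h>

  static DEFINE_STATIC_KEY_FALSE(have_neon);

  /* crypto/chacha.c registers the arch skciphers only when this is true. */
  bool chacha_is_arch_optimized(void)
  {
          return static_key_enabled(&have_neon);
  }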

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:28 +08:00
Eric Biggers
ca17aa6640 crypto: lib/chacha - remove unused arch-specific init support
All implementations of chacha_init_arch() just call
chacha_init_generic(), so it is pointless.  Just delete it, and replace
chacha_init() with what was previously chacha_init_generic().
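
For reference, a sketch of what that initialization does, using the
kernel's usual ChaCha state layout (four constants, eight key words,
then the 16-byte IV whose first word is the block counter); the names
and details here are illustrative, not the exact library code:

  #include <linux/string.h>
  #include <linux/types.h>
  #include <linux/unaligned.h>

  static void chacha_init_sketch(u32 state[16], const u32 key[8],
                                 const u8 iv[16])
  {
          state[0] = 0x61707865;  /* "expa" */
          state[1] = 0x3320646e;  /* "nd 3" */
          state[2] = 0x79622d32;  /* "2-by" */
          state[3] = 0x6b206574;  /* "te k" */
          memcpy(&state[4], key, 8 * sizeof(u32));
          state[12] = get_unaligned_le32(iv + 0);   /* block counter */
          state[13] = get_unaligned_le32(iv + 4);
          state[14] = get_unaligned_le32(iv + 8);
          state[15] = get_unaligned_le32(iv + 12);
  }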

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-21 17:39:06 +08:00
Herbert Xu
37d451809f crypto: skcipher - Make skcipher_walk src.virt.addr const
Mark the src.virt.addr field in struct skcipher_walk as a pointer to
const data.  This guarantees that the user won't modify the source
data; any modification must go through dst.virt.addr so that flushing
is done when necessary.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-15 16:21:22 +08:00
Herbert Xu
65775cf313 crypto: scatterwalk - Change scatterwalk_next calling convention
Rather than returning the address and storing the length into an
argument pointer, add an address field to the walk struct and use
that to store the address.  The length is returned directly.

Change the done functions to use this stored address instead of
getting it from the caller.

Split the address into two using a union.  The user should only
access the const version so that it is never changed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-15 16:21:22 +08:00
Herbert Xu
17ec3e71ba crypto: lib/Kconfig - Hide arch options from user
The ARCH_MAY_HAVE patch missed arm64, mips and s390.  It may also
leave arch options enabled but ineffective because of modular/built-in
conflicts.

As wireguard, the primary user of all these options, selects the arch
options anyway, make the same selections at the lib/crypto option level
and hide the arch options from the user.

Instead of selecting them centrally from lib/crypto, simply set the
default of each arch option, as suggested by Eric Biggers.

Change the Crypto API generic algorithms to select the top-level
lib/crypto options instead of the generic ones, as otherwise there is
no way to enable the arch options (Eric Biggers).  Introduce a set of
INTERNAL options to work around dependency cycles on the CONFIG_CRYPTO
symbol.

Fixes: 1047e21aec ("crypto: lib/Kconfig - Fix lib built-in failure when arch is modular")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Closes: https://lore.kernel.org/oe-kbuild-all/202502232152.JC84YDLp-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:21:47 +08:00
Eric Biggers
8fd0eecd55 crypto: arm64 - use the new scatterwalk functions
Use scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map(), and use scatterwalk_done_src() which consolidates
scatterwalk_unmap(), scatterwalk_advance(), and scatterwalk_done().
Remove unnecessary code that seemed to be intended to advance to the
next sg entry, which is already handled by the scatterwalk functions.
Adjust variable naming slightly to keep things consistent.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Linus Torvalds
37b33c68b0 CRC updates for 6.14
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API.  This is much simpler and more efficient.
 
 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API.  More conversions like this will come later.
 
 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.
 
 - Add an entry to MAINTAINERS for the kernel's CRC library code.  I'm
   volunteering to maintain it.  I have additional cleanups and
   optimizations planned for future cycles.
 
 These patches have been in linux-next since -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ418ZRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyJYAP9kBlpm8W9/XY6N8SpjKaXE/vKQYHQl
 Nobhak06Us8uJwEAkcUTymWP4IwQj5A9jgBAPRw53FQcNVKIc+01C7gRHw0=
 =mqSH
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC updates from Eric Biggers:

 - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API. This is much simpler and more efficient.

 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API. More conversions like this will come later.

 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.

 - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
   volunteering to maintain it. I have additional cleanups and
   optimizations planned for future cycles.

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
  MAINTAINERS: add entry for CRC library
  powerpc/crc: delete obsolete crc-vpmsum_test.c
  lib/crc32test: delete obsolete crc32test.c
  lib/crc16_kunit: delete obsolete crc16_kunit.c
  lib/crc_kunit.c: add KUnit test suite for CRC library functions
  powerpc/crc-t10dif: expose CRC-T10DIF function through lib
  arm64/crc-t10dif: expose CRC-T10DIF function through lib
  arm/crc-t10dif: expose CRC-T10DIF function through lib
  x86/crc-t10dif: expose CRC-T10DIF function through lib
  crypto: crct10dif - expose arch-optimized lib function
  lib/crc-t10dif: add support for arch overrides
  lib/crc-t10dif: stop wrapping the crypto API
  scsi: target: iscsi: switch to using the crc32c library
  f2fs: switch to using the crc32 library
  jbd2: switch to using the crc32c library
  ext4: switch to using the crc32c library
  lib/crc32: make crc32c() go directly to lib
  bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
  x86/crc32: expose CRC32 functions through lib
  x86/crc32: update prototype for crc32_pclmul_le_16()
  ...
2025-01-22 19:55:08 -08:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines as
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason: it is not desirable for
the namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done
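
The net effect at a use site looks like this (namespace name chosen for
illustration):

  #include <linux/module.h>

  /* before: the namespace was a bare token, stringified by the macro */
  /* MODULE_IMPORT_NS(CRYPTO_INTERNAL); */

  /* after: the namespace is an ordinary string literal */
  MODULE_IMPORT_NS("CRYPTO_INTERNAL");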

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Eric Biggers
2051da8585 arm64/crc-t10dif: expose CRC-T10DIF function through lib
Move the arm64 CRC-T10DIF assembly code into the lib directory and wire
it up to the library interface.  This allows it to be used without going
through the crypto API.  It remains usable via the crypto API as well,
through the shash algorithms that use the library interface.  Thus all the
arch-specific "shash" code becomes unnecessary and is removed.

Note: to see the diff from arch/arm64/crypto/crct10dif-ce-glue.c to
arch/arm64/lib/crc-t10dif-glue.c, view this commit with 'git show -M10'.
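
For illustration, the kind of call site this enables, assuming the
usual <linux/crc-t10dif.h> helpers:

  #include <linux/crc-t10dif.h>
  #include <linux/types.h>

  static u16 checksum_two_buffers(const u8 *a, size_t alen,
                                  const u8 *b, size_t blen)
  {
          u16 crc = crc_t10dif(a, alen);          /* one-shot */

          return crc_t10dif_update(crc, b, blen); /* incremental */
  }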

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20241202012056.209768-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:13 -08:00
Linus Torvalds
02b2f1a7b8 This update includes the following changes:
API:
 
 - Add sig driver API.
 - Remove signing/verification from akcipher API.
 - Move crypto_simd_disabled_for_test to lib/crypto.
 - Add WARN_ON for return values from driver that indicates memory corruption.
 
 Algorithms:
 
 - Provide crc32-arch and crc32c-arch through Crypto API.
 - Optimise crc32c code size on x86.
 - Optimise crct10dif on arm/arm64.
 - Optimise p10-aes-gcm on powerpc.
 - Optimise aegis128 on x86.
 - Output full sample from test interface in jitter RNG.
 - Retry without padata when it fails in pcrypt.
 
 Drivers:
 
 - Add support for Airoha EN7581 TRNG.
 - Add support for STM32MP25x platforms in stm32.
 - Enable iproc-r200 RNG driver on BCMBCA.
 - Add Broadcom BCM74110 RNG driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmc6sQsACgkQxycdCkmx
 i6dfHxAAnkI65TE6agZq9DlkEU4ZqOsxxdk0MsGIhbCUTxW3KENzu9vtKjnvg9T/
 Ou0d2J49ny87Y4zaA59Wf/Q1+gg5YSQR5kelonpfrPLkCkJjr72HZpyCHv8TTzEC
 uHHoVj9cnPIF5/yfiqQsrWT1ACip9vn+slyVPaMJV1qR6gnvnSALtsg4e/vKHkn7
 ZMaf2pZ2ROYXdB02nMK5KQcCrxD64MQle/yQepY44eYjnT+XclkqPdi6o1nUSpj/
 RFAeY0jFSTu0pj3DqT48TnU/LiiNLlFOZrGjCdEySoac63vmTtKqfYDmrRaFz4hB
 sucxbgJ3xnnYseRijtfXnxaD/IkDJln+ipGNQKAZLfOVMDCTxPdYGmOpobMTXMS+
 0sY0eAHgqr23P9pOp+sOzcAEFIqg6llAYQVWx3Zl4vpXBUuxzg6AqmHnPicnck7y
 Lw1cJhQxij2De3dG2ZL/0dgQxMjGN/YfCM8SSg6l+Xn3j4j47rqJNH2ZsmXtbJ2n
 kTkmemmWdgRR1IvgQQGsvyKs9ThkcEDW+IzW26SUv3Clvru2NSkX4ZPHbezZQf+D
 R0wMZsW3Fw7Zymerz1GIBSqdLnsyFWtIAjukDpOR6ordPgOBeDt76v6tw5vL2/II
 KYoeN1pdEEecwuhAsEvCryT5ZG4noBeNirf/ElWAfEybgcXiTks=
 =T8pa
 -----END PGP SIGNATURE-----

Merge tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add sig driver API
   - Remove signing/verification from akcipher API
   - Move crypto_simd_disabled_for_test to lib/crypto
   - Add WARN_ON for return values from driver that indicates memory
     corruption

  Algorithms:
   - Provide crc32-arch and crc32c-arch through Crypto API
   - Optimise crc32c code size on x86
   - Optimise crct10dif on arm/arm64
   - Optimise p10-aes-gcm on powerpc
   - Optimise aegis128 on x86
   - Output full sample from test interface in jitter RNG
   - Retry without padata when it fails in pcrypt

  Drivers:
   - Add support for Airoha EN7581 TRNG
   - Add support for STM32MP25x platforms in stm32
   - Enable iproc-r200 RNG driver on BCMBCA
   - Add Broadcom BCM74110 RNG driver"

* tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
  crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctx
  crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
  crypto: aesni - Move back to module_init
  crypto: lib/mpi - Export mpi_set_bit
  crypto: aes-gcm-p10 - Use the correct bit to test for P10
  hwrng: amd - remove reference to removed PPC_MAPLE config
  crypto: arm/crct10dif - Implement plain NEON variant
  crypto: arm/crct10dif - Macroify PMULL asm code
  crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
  crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
  crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
  crypto: arm64/crct10dif - Remove obsolete chunking logic
  crypto: bcm - add error check in the ahash_hmac_init function
  crypto: caam - add error check to caam_rsa_set_priv_key_form
  hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
  dt-bindings: rng: add binding for BCM74110 RNG
  padata: Clean up in padata_do_multithreaded()
  crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
  crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
  crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
  ...
2024-11-19 10:28:41 -08:00
Ard Biesheuvel
779cee8209 crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
The only remaining user of the fallback implementation of 64x64
polynomial multiplication using 8x8 PMULL instructions is the final
reduction from a 16 byte vector to a 16-bit CRC.

The fallback code is complicated and messy, and this reduction has
little impact on the overall performance, so instead, let's calculate
the final CRC by passing the 16 byte vector to the generic CRC-T10DIF
implementation when running the fallback version.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Ard Biesheuvel
67dfb1b73f crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
The CRC-T10DIF implementation for arm64 has a version that uses 8x8
polynomial multiplication, for cores that lack the crypto extensions,
which cover the 64x64 polynomial multiplication instruction that the
algorithm was built around.

This fallback version rather naively adopted the 64x64 polynomial
multiplication algorithm that I ported from ARM for the GHASH driver,
which needs 8 PMULL8 instructions to implement one PMULL64. This is
reasonable, given that each 8-bit vector element needs to be multiplied
with each element in the other vector, producing 8 vectors with partial
results that need to be combined to yield the correct result.

However, most PMULL64 invocations in the CRC-T10DIF code involve
multiplication by a pair of 16-bit folding coefficients, and so all the
partial results from higher order bytes will be zero, and there is no
need to calculate them to begin with.

Then, the CRC-T10DIF algorithm always XORs the output values of the
PMULL64 instructions being issued in pairs, and so there is no need to
faithfully implement each individual PMULL64 instruction, as long as
XORing the results pairwise produces the expected result.

Implementing these improvements results in a speedup of 3.3x on low-end
platforms such as the Raspberry Pi 4 (Cortex-A72).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Ard Biesheuvel
7048c21e6b crypto: arm64/crct10dif - Remove obsolete chunking logic
This is a partial revert of commit fc754c024a, which moved into C code
the logic that ensures kernel mode NEON code does not hog the CPU for
too long.

This is no longer needed now that kernel mode NEON no longer disables
preemption, so drop it.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Jia He
9369693a2c crypto: arm64/poly1305 - move data to rodata section
When objtool gains support for ARM in the future, it may encounter issues
disassembling the following data in the .text section:
> .Lzeros:
> .long   0,0,0,0,0,0,0,0
> .asciz  "Poly1305 for ARMv8, CRYPTOGAMS by \@dot-asm"
> .align  2

Move it to .rodata which is a more appropriate section for read-only data.

There is a limit on how far the label can be from the instruction, so
use "adrp" plus the low 12 bits of the label's offset to avoid the
compilation error.

Signed-off-by: Jia He <justin.he@arm.com>
Tested-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17 13:55:49 +08:00
Herbert Xu
b0cd6f4c3f Revert "crypto: arm64/poly1305 - move data to rodata section"
This reverts commit 47d9625209.

It causes build issues as detected by the kernel test robot.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408040817.OWKXtCv6-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-06 13:45:59 +08:00
Jia He
47d9625209 crypto: arm64/poly1305 - move data to rodata section
When objtool gains support for ARM in the future, it may encounter issues
disassembling the following data in the .text section:
> .Lzeros:
> .long   0,0,0,0,0,0,0,0
> .asciz  "Poly1305 for ARMv8, CRYPTOGAMS by \@dot-asm"
> .align  2

Move it to .rodata which is a more appropriate section for read-only data.

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-02 21:11:20 +08:00
Jeff Johnson
b568826eff crypto: arm64 - add missing MODULE_DESCRIPTION() macros
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/crypto/crct10dif-ce.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/crypto/poly1305-neon.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/crypto/aes-neon-bs.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.
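
Each warning is silenced by a one-line addition of the form below (the
description text here is illustrative):

  #include <linux/module.h>

  MODULE_DESCRIPTION("CRC-T10DIF digest using arm64 Crypto Extensions");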

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:16 +10:00
Mark Brown
a720de9fba crypto: arm64/crct10dif - Raise priority of NEON crct10dif implementation
The NEON implementation of crct10dif is registered with a priority of
100, which is identical to that used by the generic C implementation.
Raise the priority to 150, halfway between the PMULL-based
implementation and the generic one, so that it will be preferred over
the generic implementation.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-05-31 17:34:56 +08:00
Ard Biesheuvel
571e557cba crypto: arm64/aes-ce - Simplify round key load sequence
Tweak the round key logic so that the keys can be loaded using a single
branchless sequence with overlapping loads. This is shorter and
simpler, and puts the conditional branches based on the key size further
apart, which might benefit microarchitectures that cannot record taken
branches at every instruction. For these branches, use test-bit-branch
instructions that don't clobber the condition flags.

Note that none of this has any impact on performance, positive or
otherwise (and the branch prediction benefit would only benefit AES-192
which nobody uses). It does make for nicer code, though.

While at it, use \@ to generate the labels inside the macros, which is
more robust than using fixed numbers that could clash inadvertently.
Also, bring aes-neon.S in line with these changes, including the switch
to test-and-branch instructions, to avoid surprises in the future when
we might start relying on the condition flags being preserved in the
chaining mode wrappers in aes-modes.S.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-04-26 17:26:09 +08:00
Linus Torvalds
c8e7699616 This update includes the following changes:
API:
 
 - Avoid unnecessary copying in scomp for trivial SG lists.
 
 Algorithms:
 
 - Optimise NEON CCM implementation on ARM64.
 
 Drivers:
 
 - Add queue stop/query debugfs support in hisilicon/qm.
 -
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmXxDbcACgkQxycdCkmx
 i6dCzw//dI5CuszRMalZu8fwvh7K48d4GoY/5uuc1vPK6yLyc5dl6CzPsrKhmBGO
 Ip2fUCFoO+ViWl5tw8SjOOEnPRztETrIGlOM5PHOIxRuSA6lbcsdo+EQja4WBDo1
 fGUNaNVW9eY9XsaUw46uwxaicd+xOqRGJURKB2XuzSHACln/QrNy74aVdbc6VhaK
 Jnh4o2blsEh7jcVAK2ahNmmDm739w8C462Go4UIl8xcM693HtjUUFx7TALpI0iC+
 BWxctAV9IzA/siUwztvwwlo+V1KZbjZFD5XfC/erkV0gFs7ll8IcuDFnZ1suMvt2
 Z+6OVhghkkvMJJEl1qKNxVJITOUdXH0yKWqTbEHuFlV7VV+9hjUlWlvRVm7ZIrXf
 ZXi/OpBJRAIBB6fJqy1RAvfIZUR8Dl8i8RKLVawdqFLbB+XCOiIK7a8+5hCKaWsU
 0DVFNCfiJFPIiByzxXV+4Jt/hzxx/qseIbTiUShdoXeHOVWMgH5H255MWJv5qjVy
 aSwgQLX6MlY9Pj+w5cPHAUjOiGjiAa3iKwsbnmo5lszsPl4KR407gaJnV7VI4s6R
 fhqAgIjJ4Ik2lHzQRM89QU2OOogVBPs+FHEkk98lak5CATpjUq97iLbTharAg3dc
 l8FLoSB9+NM+RCbnzUXuOIvoU/okSM4+j6a3xAnLQH6OxBOtk3Y=
 =Uw2R
 -----END PGP SIGNATURE-----

Merge tag 'v6.9-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:

   - Avoid unnecessary copying in scomp for trivial SG lists

  Algorithms:

   - Optimise NEON CCM implementation on ARM64

  Drivers:

   - Add queue stop/query debugfs support in hisilicon/qm

   - Intel qat updates and cleanups"

* tag 'v6.9-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (79 commits)
  Revert "crypto: remove CONFIG_CRYPTO_STATS"
  crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem
  crypto: tcrypt - add ffdhe2048(dh) test
  crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags
  crypto: hisilicon/zip - fix the missing CRYPTO_ALG_ASYNC in cra_flags
  hwrng: hisi - use dev_err_probe
  MAINTAINERS: Remove T Ambarus from few mchp entries
  crypto: iaa - Fix comp/decomp delay statistics
  crypto: iaa - Fix async_disable descriptor leak
  dt-bindings: rng: atmel,at91-trng: add sam9x7 TRNG
  dt-bindings: crypto: add sam9x7 in Atmel TDES
  dt-bindings: crypto: add sam9x7 in Atmel SHA
  dt-bindings: crypto: add sam9x7 in Atmel AES
  crypto: remove CONFIG_CRYPTO_STATS
  crypto: dh - Make public key test FIPS-only
  crypto: rockchip - fix to check return value
  crypto: jitter - fix CRYPTO_JITTERENTROPY help text
  crypto: qat - make ring to service map common for QAT GEN4
  crypto: qat - fix ring to service map for dcc in 420xx
  crypto: qat - fix ring to service map for dcc in 4xxx
  ...
2024-03-15 14:46:54 -07:00