x86_64 has the "interesting" property that instruction encodings are
generally a bit shorter for instructions that operate on the 32-bit (or
smaller) parts of registers, or on registers in the original set of 8.
This patch adjusts the AES-XTS code to take advantage of that property
by changing the LEN parameter from size_t to unsigned int (which is all
that's needed and is what the non-AVX implementation uses) and using the
%eax register for KEYLEN.
This decreases the size of aes-xts-avx-x86_64.o by 1.2%.
Note that changing the kmovq to kmovd was going to be needed anyway to
make the AVX10/256 code really work on CPUs that don't support 512-bit
vectors (since the AVX10 spec says that 64-bit opmask instructions will
only be supported on processors that support 512-bit vectors).
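On the C side this corresponds to a prototype change along these lines
(a sketch; the function name here is illustrative, not the exact
symbol):

  /* before: a 64-bit LEN forces REX.W-prefixed 64-bit arithmetic */
  asmlinkage void aes_xts_encrypt_avx(const struct crypto_aes_ctx *key,
                                      u8 *dst, const u8 *src, size_t len,
                                      u8 iv[16]);

  /* after: a 32-bit LEN lets the asm use the shorter 32-bit forms */
  asmlinkage void aes_xts_encrypt_avx(const struct crypto_aes_ctx *key,
                                      u8 *dst, const u8 *src,
                                      unsigned int len, u8 iv[16]);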
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- For conditionally subtracting 16 from LEN when decrypting a message
whose length isn't a multiple of 16, use the cmovnz instruction.
- Fold the addition of 4*VL to LEN into the sub of VL or 16 from LEN.
- Remove an unnecessary test instruction.
This results in slightly shorter code, both source and binary.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Decrease the amount of code specific to the different AES variants by
"right-aligning" the sequence of round keys, and for AES-128 and AES-192
just skipping irrelevant rounds at the beginning.
This shrinks the size of aes-xts-avx-x86_64.o by 13.3%, and it improves
the efficiency of AES-128 and AES-192. The tradeoff is that for AES-256
some additional not-taken conditional jumps are now executed. But these
are predicted well and are cheap on x86.
Note that the ARMv8 CE based AES-XTS implementation uses a similar
strategy to handle the different AES variants.
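In C terms, the idea is roughly the following (all helper names here
are hypothetical; the real code is assembly):

  /* Round keys sit "right-aligned" in slots 0..14, so the final round
   * key is always in slot 14.  AES-128 and AES-192 simply start
   * deeper into the sequence instead of using separate code paths. */
  int i = 14 - num_rounds;          /* 4 for AES-128, 2 for AES-192,
                                       0 for AES-256 */

  xor_round_key(block, &rk[i++]);   /* initial whitening key */
  while (i < 14)
          aes_round(block, &rk[i++]);     /* middle rounds */
  aes_final_round(block, &rk[14]);        /* last round, fixed slot */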
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since aesni_xts_enc() and aesni_xts_dec() are very similar, generate
them from a macro that's passed an argument enc=1 or enc=0. This
reduces the length of aesni-intel_asm.S by 112 lines while still
producing the exact same object file in both 32-bit and 64-bit mode.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When encrypting a message whose length isn't a multiple of 16 bytes,
encrypt the last full block in the main loop. This works because only
decryption uses the last two tweaks in reverse order, not encryption.
This improves the performance of encrypting messages whose length isn't
a multiple of the AES block length, shrinks the size of
aes-xts-avx-x86_64.o by 5.0%, and eliminates two instructions (a test
and a not-taken conditional jump) when encrypting a message whose length
*is* a multiple of the AES block length.
While it's not super useful to optimize for ciphertext stealing given
that it's rarely needed in practice, the other two benefits mentioned
above make this optimization worthwhile.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Instead of loading the message words into both MSG and \m0 and then
adding the round constants to MSG, load the message words into \m0 and
the round constants into MSG and then add \m0 to MSG. This shortens the
source code slightly. It changes the instructions slightly, but it
doesn't affect binary code size and doesn't seem to affect performance.
Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- Load the SHA-256 round constants relative to a pointer that points
into the middle of the constants rather than to the beginning. Since
x86 instructions use signed offsets, this decreases the instruction
length required to access some of the later round constants (see the
sketch below).
- Use punpcklqdq or punpckhqdq instead of longer instructions such as
pshufd, pblendw, and palignr. This doesn't harm performance.
The end result is that sha256_ni_transform shrinks from 839 bytes to 791
bytes, with no loss in performance.
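The first optimization can be illustrated in C (hypothetical variable
names; the real code computes the same addresses with a base register
in assembly):

  static const u32 K256[64];            /* SHA-256 round constants */
  const u32 *k_mid = K256 + 32;         /* base pointer mid-table  */

  /* all 64 constants are now within a signed 8-bit displacement: */
  u32 k_first = k_mid[-32];             /* K256[0],  offset -128   */
  u32 k_last  = k_mid[31];              /* K256[63], offset +124   */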
Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
MSGTMP[0-3] are used to hold the message schedule and are not temporary
registers per se. MSGTMP4 is used as a temporary register for several
different purposes and isn't really related to MSGTMP[0-3]. Rename them
to MSG[0-3] and TMP accordingly.
Also add a comment that clarifies what MSG is.
Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To avoid source code duplication, do the SHA-256 rounds using macros.
This reduces the length of sha256_ni_asm.S by 153 lines while still
producing the exact same object file.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Access the AES round keys using offsets -7*16 through 7*16, instead of
0*16 through 14*16. This allows VEX-encoded instructions to address all
round keys using 1-byte offsets, whereas before some needed 4-byte
offsets. This decreases the code size of aes-xts-avx-x86_64.o by 4.2%.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the non-AVX implementation of AES-XTS (xts-aes-aesni) use the new
glue code that was introduced for the AVX implementations of AES-XTS.
This reduces code size, and it improves the performance of xts-aes-aesni
due to the optimization for messages that don't span page boundaries.
This required moving the new glue functions higher up in the file and
allowing the IV encryption function to be specified by the caller.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since sha512_transform_rorx() uses ymm registers, execute vzeroupper
before returning from it. This is necessary to avoid reducing the
performance of SSE code.
Fixes: e01d69cb01 ("crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX instructions.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since sha256_transform_rorx() uses ymm registers, execute vzeroupper
before returning from it. This is necessary to avoid reducing the
performance of SSE code.
Fixes: d34a460092 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since nh_avx2() uses ymm registers, execute vzeroupper before returning
from it. This is necessary to avoid reducing the performance of SSE
code.
Fixes: 0f961f9f67 ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an AES-XTS implementation "xts-aes-vaes-avx10_512" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/512 or AVX512BW + AVX512VL
extensions. This implementation uses zmm registers to operate on four
AES blocks at a time. The assembly code is instantiated using a macro
so that most of the source code is shared with other implementations.
To avoid downclocking on older Intel CPU models, an exclusion list is
used to prevent this 512-bit implementation from being used by default
on some CPU models. They will use xts-aes-vaes-avx10_256 instead. For
now, this exclusion list is simply coded into aesni-intel_glue.c. It
may make sense to eventually move it into a more central location.
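A rough sketch of the shape of that exclusion list (the exact set of
models lives in the real table; prefer_zmm_by_default() is a
hypothetical helper showing how the glue code could consult it to
lower the 512-bit algorithm's priority):

  static const struct x86_cpu_id zmm_exclusion_list[] = {
          { .vendor = X86_VENDOR_INTEL, .family = 6,
            .model = INTEL_FAM6_SKYLAKE_X },
          /* ... further models with zmm-related downclocking ... */
          {}
  };

  static bool prefer_zmm_by_default(void)
  {
          return !x86_match_cpu(zmm_exclusion_list);
  }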
xts-aes-vaes-avx10_512 is slightly faster than xts-aes-vaes-avx10_256 on
some current CPUs. E.g., on AMD Zen 4, AES-256-XTS decryption
throughput increases by 13% with 4096-byte inputs, or 14% with 512-byte
inputs. On Intel Sapphire Rapids, AES-256-XTS decryption throughput
increases by 2% with 4096-byte inputs, or 3% with 512-byte inputs.
Future CPUs may provide stronger 512-bit support, in which case a larger
benefit should be seen.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an AES-XTS implementation "xts-aes-vaes-avx10_256" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/256 or AVX512BW + AVX512VL
extensions. This implementation avoids using zmm registers, instead
using ymm registers to operate on two AES blocks at a time. The
assembly code is instantiated using a macro so that most of the source
code is shared with other implementations.
This is the optimal implementation on CPUs that support VAES and AVX512
but where the zmm registers should not be used due to downclocking
effects, for example Intel's Ice Lake. It should also be the optimal
implementation on future CPUs that support AVX10/256 but not AVX10/512.
The performance is slightly better than that of xts-aes-vaes-avx2, which
uses the same 256-bit vector length, due to factors such as being able
to use ymm16-ymm31 to cache the AES round keys, and being able to use
the vpternlogd instruction to do XORs more efficiently. For example, on
Ice Lake, the throughput of decrypting 4096-byte messages with
AES-256-XTS is 6.6% higher with xts-aes-vaes-avx10_256 than with
xts-aes-vaes-avx2. While this is a small improvement, it is
straightforward to provide this implementation (xts-aes-vaes-avx10_256)
as long as we are providing xts-aes-vaes-avx2 and xts-aes-vaes-avx10_512
anyway, due to the way the _aes_xts_crypt macro is structured.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an AES-XTS implementation "xts-aes-vaes-avx2" for x86_64 CPUs with
the VAES, VPCLMULQDQ, and AVX2 extensions, but not AVX512 or AVX10.
This implementation uses ymm registers to operate on two AES blocks at a
time. The assembly code is instantiated using a macro so that most of
the source code is shared with other implementations.
This is the optimal implementation on AMD Zen 3. It should also be the
optimal implementation on Intel Alder Lake, which similarly supports
VAES but not AVX512. Comparing to xts-aes-aesni-avx on Zen 3,
xts-aes-vaes-avx2 provides 70% higher AES-256-XTS decryption throughput
with 4096-byte messages, or 23% higher with 512-byte messages.
A large improvement is also seen with CPUs that do support AVX512 (e.g.,
98% higher AES-256-XTS decryption throughput on Ice Lake with 4096-byte
messages), though the following patches add AVX512 optimized
implementations to get a bit more performance on those CPUs.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an AES-XTS implementation "xts-aes-aesni-avx" for x86_64 CPUs that
have the AES-NI and AVX extensions but not VAES. It's similar to the
existing xts-aes-aesni in that it uses xmm registers to operate on one
AES block at a time. It differs from xts-aes-aesni in the following ways:
- It uses the VEX-coded (non-destructive) instructions from AVX.
This improves performance slightly.
- It incorporates some additional optimizations such as interleaving the
tweak computation with AES en/decryption, handling single-page
messages more efficiently, and caching the first round key.
- It supports only 64-bit (x86_64).
- It's generated by an assembly macro that will also be used to generate
VAES-based implementations.
The performance improvement over xts-aes-aesni varies from small to
large, depending on the CPU and other factors such as the size of the
messages en/decrypted. For example, the following increases in
AES-256-XTS decryption throughput are seen on the following CPUs:
                      | 4096-byte messages | 512-byte messages |
----------------------+--------------------+-------------------+
Intel Skylake         |         6%         |        31%        |
Intel Cascade Lake    |         4%         |        26%        |
AMD Zen 1             |        61%         |        73%        |
AMD Zen 2             |        36%         |        59%        |
(The above CPUs don't support VAES, so they can't use VAES instead.)
While this isn't as large an improvement as what VAES provides, this
still seems worthwhile. This implementation is fairly easy to provide
based on the assembly macro that's needed for VAES anyway, and it will
be the best implementation on a large number of CPUs (very roughly, the
CPUs launched by Intel and AMD from 2011 to 2018).
This makes the existing xts-aes-aesni *mostly* obsolete. For now, leave
it in place to support 32-bit kernels and also CPUs like Intel Westmere
that support AES-NI but not AVX. (We could potentially remove it anyway
and just rely on the indirect acceleration via ecb-aes-aesni in those
cases, but that change will need to be considered separately.)
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an assembly file aes-xts-avx-x86_64.S which contains a macro that
expands into AES-XTS implementations for x86_64 CPUs that support at
least AES-NI and AVX, optionally also taking advantage of VAES,
VPCLMULQDQ, and AVX512 or AVX10.
This patch doesn't expand the macro at all. Later patches will do so,
adding each implementation individually so that the motivation and use
case for each individual implementation can be fully presented.
The file also provides a function aes_xts_encrypt_iv() which handles the
encryption of the IV (tweak), using AES-NI and AVX.
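Its C-side declaration is roughly (a sketch):

  asmlinkage void aes_xts_encrypt_iv(const struct crypto_aes_ctx *tweak_key,
                                     u8 iv[AES_BLOCK_SIZE]);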
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The aesni_set_key() implementation has no error case, yet its prototype
declares that it returns an error code.
Modify the function prototype to return void and adjust the related code.
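That is (sketch):

  /* before */
  asmlinkage int aesni_set_key(struct crypto_aes_ctx *ctx,
                               const u8 *in_key, unsigned int key_len);

  /* after: no failure is possible, so don't pretend there is one */
  asmlinkage void aesni_set_key(struct crypto_aes_ctx *ctx,
                                const u8 *in_key, unsigned int key_len);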
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
aes_expandkey() already includes an AES key size check. If AES-NI is
unusable, invoke that function without adding a redundant size check of
our own. Also, use aes_check_keylen() instead of open-coding the check.
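The resulting setkey logic is roughly (a sketch, not the exact diff):

  static int aes_set_key_common(struct crypto_aes_ctx *ctx,
                                const u8 *in_key, unsigned int key_len)
  {
          int err;

          if (!crypto_simd_usable())
                  return aes_expandkey(ctx, in_key, key_len);
                  /* aes_expandkey() checks key_len itself */

          err = aes_check_keylen(key_len); /* replaces the open-coded check */
          if (err)
                  return err;

          kernel_fpu_begin();
          err = aesni_set_key(ctx, in_key, key_len);
          kernel_fpu_end();
          return err;
  }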
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The x86 SHA-256 module contains four implementations: SSSE3, AVX, AVX2,
and SHA-NI. Commit 1c43c0f1f8 ("crypto: x86/sha - load modules based
on CPU features") made the module be autoloaded when SSSE3, AVX, or AVX2
is detected. The omission of SHA-NI appears to be an oversight, perhaps
because of the outdated file-level comment. This patch fixes this,
though in practice this makes no difference because SSSE3 is a subset of
the other three features anyway. Indeed, sha256_ni_transform() executes
SSSE3 instructions such as pshufb.
Reviewed-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The x86 SHA-1 module contains four implementations: SSSE3, AVX, AVX2,
and SHA-NI. Commit 1c43c0f1f8 ("crypto: x86/sha - load modules based
on CPU features") made the module be autoloaded when SSSE3, AVX, or AVX2
is detected. The omission of SHA-NI appears to be an oversight, perhaps
because of the outdated file-level comment. This patch fixes this,
though in practice this makes no difference because SSSE3 is a subset of
the other three features anyway. Indeed, sha1_ni_transform() executes
SSSE3 instructions such as pshufb.
Reviewed-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Implement the ->digest method to improve performance on single-page
messages by reducing the number of indirect calls.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Implement a ->digest function for sha256-ssse3, sha256-avx, sha256-avx2,
and sha256-ni. This improves the performance of crypto_shash_digest()
with these algorithms by reducing the number of indirect calls that are
made.
For now, don't bother with this for sha224, since sha224 is rarely used.
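Each per-variant ->digest is just a tiny init+finup wrapper, for
example (a sketch, using the avx2 variant's existing finup helper):

  static int sha256_avx2_digest(struct shash_desc *desc, const u8 *data,
                                unsigned int len, u8 *out)
  {
          return sha256_base_init(desc) ?:
                 sha256_avx2_finup(desc, data, len, out);
  }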
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the alignment of each field in struct aesni_xts_ctx occurs
right before every access. However, it's possible to perform this
alignment ahead of time.
Introduce a helper function that converts struct crypto_skcipher *tfm
to struct aesni_xts_ctx *ctx and returns an aligned address. Utilize
this helper function at the beginning of each XTS function and then
eliminate redundant alignment code.
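A minimal sketch of such a helper, assuming the driver's usual
AESNI_ALIGN of 16 bytes:

  static inline struct aesni_xts_ctx *aes_xts_ctx(struct crypto_skcipher *tfm)
  {
          /* align the raw context once, up front */
          return PTR_ALIGN(crypto_skcipher_ctx(tfm), AESNI_ALIGN);
  }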
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/all/ZFWQ4sZEVu%2FLHq+Q@gmail.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, every field in struct aesni_xts_ctx is defined as a byte
array of the same size as struct crypto_aes_ctx. This data type
is obscure and the choice lacks justification.
To rectify this, update the field type in struct aesni_xts_ctx to
match its actual structure.
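That is, roughly:

  struct aesni_xts_ctx {
          struct crypto_aes_ctx tweak_ctx AESNI_ALIGN_ATTR;
          struct crypto_aes_ctx crypt_ctx AESNI_ALIGN_ATTR;
  };

instead of two opaque byte arrays sized to fit struct crypto_aes_ctx.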
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/all/ZFWQ4sZEVu%2FLHq+Q@gmail.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
x86 optimized crypto modules are built as modules rather than built-in,
and they are not loaded when the crypto API is initialized, resulting in
the generic builtin module (sha1-generic) being used instead.
This was discovered when creating a sha1/sha256 checksum of a 2 GB file
with kcapi-tools, because it took significantly longer than creating a
sha512 checksum of the same file. trace-cmd showed that for sha1/sha256
the generic module was used, whereas for sha512 the optimized module was
used instead.
Add module aliases for these x86 optimized crypto modules based on CPU
feature bits so udev gets a chance to load them later in the boot
process. This resulted in a ~3x decrease in the real execution time of
kcapi-dsg.
The fix is inspired by commit
aa031b8f70 ("crypto: x86/sha512 - load based on CPU features")
where a similar fix was done for sha512.
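The mechanism is a CPU feature match table plus a device-table alias,
along these lines (a sketch; each module lists the features of its own
implementations):

  #include <asm/cpu_device_id.h>

  static const struct x86_cpu_id module_cpu_ids[] = {
          X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
          X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
          X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
          {}
  };
  MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);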
Cc: stable@vger.kernel.org # 5.15+
Suggested-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Suggested-by: Julian Andres Klode <julian.klode@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The 'tfm' parameter to aes_set_key_common() is never used, so remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
aes_set_key_common() performs runtime alignment of the void *raw_ctx
pointer. This facilitates consistent access to the 16-byte-aligned
address during key extension.
However, the alignment is already handled in the GCM-related setkey
functions before invoking the common function. Consequently, the
alignment in the common function is unnecessary for those functions.
To establish a consistent approach throughout the glue code, remove
the aes_ctx() call from its current location. Instead, place it at
each call site where the runtime alignment is currently absent.
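Schematically, a call site changes like this (a sketch, not the exact
diff):

  /* before: aes_set_key_common() aligned raw_ctx internally */
  err = aes_set_key_common(tfm, raw_ctx, in_key, key_len);

  /* after: the call site passes an already-aligned context */
  err = aes_set_key_common(tfm, aes_ctx(raw_ctx), in_key, key_len);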
Link: https://lore.kernel.org/lkml/20230605024623.GA4653@quark.localdomain/
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The GFNI routines in the AVX version of the ARIA implementation now use
explicit VMOVDQA instructions to load the constant input vectors, which
means they must be 16-byte aligned. So ensure that this is the case, by
dropping the section split and the incorrect .align 8 directive, and
emitting the constants into the 16-byte aligned section instead.
Note that the AVX2 version of this code deviates from this pattern and
does not require a similar fix, since it loads these constants as
8-byte memory operands, for which AVX2 permits any alignment.
Cc: Taehee Yoo <ap420073@gmail.com>
Fixes: 8b84475318 ("crypto: x86/aria-avx - Do not use avx2 instructions")
Reported-by: syzbot+a6abcf08bad8b18fd198@syzkaller.appspotmail.com
Tested-by: syzbot+a6abcf08bad8b18fd198@syzkaller.appspotmail.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The summary of the changes for this pull requests is:
- Song Liu's new struct module_memory replacement
- Nick Alcock's MODULE_LICENSE() removal for non-modules
- My cleanups and enhancements to reduce the areas where we vmalloc
module memory for duplicates, and the respective debug code which
proves the remaining vmalloc pressure comes from userspace.
Most of the changes have been in linux-next for quite some time except
the minor fixes I made to check if a module was already loaded prior
to allocating the final module memory with vmalloc and the respective
debug code it introduces to help clarify the issue. Although the
functional change is small it is rather safe as it can only *help*
reduce vmalloc space for duplicates and is confirmed to fix a bootup
issue with over 400 CPUs with KASAN enabled. I don't expect stable
kernels to pick up that fix as the cleanups would have also had to
have been picked up. Folks on larger CPU systems with modules will
want to just upgrade if vmalloc space has been an issue on bootup.
Given the size of this request, here's some more elaborate details:
The functional change in this pull request is the very first
patch from Song Liu which replaces the 'struct module_layout' with a
new 'struct module_memory'. The old data structure tried to put
together all types of supported module memory types in one data
structure, the new one abstracts the differences in memory types in a
module to allow each one to provide their own set of details. This
paves the way in the future so we can deal with them in a cleaner way.
If you look at changes they also provide a nice cleanup of how we
handle these different memory areas in a module. This change has been
in linux-next since before the merge window opened for v6.3 so to
provide more than a full kernel cycle of testing. It's a good thing as
quite a bit of fixes have been found for it.
Jason Baron then made dynamic debug a first class citizen module user
by using module notifier callbacks to allocate / remove module
specific dynamic debug information.
Nick Alcock has done quite a bit of work cross-tree to remove module
license tags from things which cannot possibly be modules, at my
request so as to:
a) help him with his longer term tooling goals which require a
deterministic evaluation of whether a piece of symbol code could ever
be part of a module or not. But quite recently it has been made
clear that tooling is not the only thing that would benefit.
Disambiguating symbols also helps efforts such as live patching,
kprobes and BPF, but for other reasons and R&D on this area is
active with no clear solution in sight.
b) help us inch closer to the now generally accepted long term goal
of automating all the MODULE_LICENSE() tags from SPDX license tags
In so far as a) is concerned, although module license tags are a no-op
for non-modules, tools which would want to create a mapping of possible
modules can only rely on the module license tag after the commit
8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf").
Nick has been working on this *for years* and AFAICT I was the only
one to suggest two alternatives to this approach for tooling. The
complexity in one of my suggested approaches lies in that we'd need a
possible-obj-m and a could-be-module which would check if the object
being built is part of any kconfig build which could ever lead to it
being part of a module, and if so define a new define
-DPOSSIBLE_MODULE [0].
A more obvious yet theoretical approach I've suggested would be to
have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
well but that means getting kconfig symbol names mapping to modules
always, and I don't think that's the case today. I am not aware of
Nick or anyone exploring either of these options. Quite recently Josh
Poimboeuf has pointed out that live patching, kprobes and BPF would
benefit from resolving some part of the disambiguation as well but for
other reasons. The function granularity KASLR (fgkaslr) patches were
mentioned but Joe Lawrence has clarified this effort has been dropped
with no clear solution in sight [1].
In the meantime removing module license tags from code which could
never be modules is welcomed for both objectives mentioned above. Some
developers have also welcomed these changes as it has helped clarify
when a module was never possible and they forgot to clean this up, and
so you'll see quite a bit of Nick's patches in other pull requests for
this merge window. I just picked up the stragglers after rc3. LWN has
good coverage on the motivation behind this work [2] and the typical
cross-tree issues he ran into along the way. The only concrete blocker
issue he ran into was that we should not remove the MODULE_LICENSE()
tags from files which have no SPDX tags yet, even if they can never be
modules. Nick ended up giving up on his efforts due to having to do
this vetting and backlash he ran into from folks who really did *not
understand* the core of the issue nor were providing any alternative /
guidance. I've gone through his changes and dropped the patches which
dropped the module license tags where an SPDX license tag was missing,
it only consisted of 11 drivers. To see if a pull request deals with a
file which lacks SPDX tags you can just use:
./scripts/spdxcheck.py -f \
$(git diff --name-only commit-id | xargs echo)
You'll see a core module file in this pull request for the above, but
that's not related to his changes. We just need to add the SPDX
license tag for the kernel/module/kmod.c file in the future, but it
demonstrates the effectiveness of the script.
Most of Nick's changes were spread out through different trees, and I
just picked up the slack after rc3 for the last kernel was out. Those
changes have been in linux-next for over two weeks.
The cleanups, debug code I added and final fix I added for modules
were motivated by David Hildenbrand's report of boot failing on a
system with over 400 CPUs when KASAN was enabled, due to running out
of virtual memory space. Although the functional change only consists
of 3 lines in the patch "module: avoid allocation if module is already
present and ready", proving that this was the best we can do on the
modules side took quite a bit of effort and new debug code.
The initial cleanups I did on the modules side of things have been in
linux-next since around rc3 of the last kernel; the actual final fix
and debug code, however, have only been in linux-next for about a
week or so but I think it is worth getting that code in for this merge
window as it does help fix / prove / evaluate the issues reported with
larger number of CPUs. Userspace is not yet fixed as it is taking a
bit of time for folks to understand the crux of the issue and find a
proper resolution. Worst come to worst, I have a kludge-of-concept [3]
of how to make kernel_read*() calls for modules unique / converge
them, but I'm currently inclined to just see if userspace can fix this
instead"
Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]
* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
module: add debugging auto-load duplicate module support
module: stats: fix invalid_mod_bytes typo
module: remove use of uninitialized variable len
module: fix building stats for 32-bit targets
module: stats: include uapi/linux/module.h
module: avoid allocation if module is already present and ready
module: add debug stats to help identify memory pressure
module: extract patient module check into helper
modules/kmod: replace implementation with a semaphore
Change DEFINE_SEMAPHORE() to take a number argument
module: fix kmemleak annotations for non init ELF sections
module: Ignore L0 and rename is_arm_mapping_symbol()
module: Move is_arm_mapping_symbol() to module_symbol.h
module: Sync code of is_arm_mapping_symbol()
scripts/gdb: use mem instead of core_layout to get the module address
interconnect: remove module-related code
interconnect: remove MODULE_LICENSE in non-modules
zswap: remove MODULE_LICENSE in non-modules
zpool: remove MODULE_LICENSE in non-modules
x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
...
Avoid cluttering up the kallsyms symbol table with entries that should
not end up in things like backtraces, as they have undescriptive and
generated identifiers.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Avoid cluttering up the kallsyms symbol table with entries that should
not end up in things like backtraces, as they have undescriptive and
generated identifiers.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Avoid cluttering up the kallsyms symbol table with entries that should
not end up in things like backtraces, as they have undescriptive and
generated identifiers.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Co-developed-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Co-developed-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Co-developed-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Co-developed-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups. In the GCM case, we can get rid of the
oversized permutation array entirely while at it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that this can no longer be built as a module, drop all remaining
module-related code as well.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.
So remove it in the files in this commit, none of which can be built as
modules.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
vpbroadcastb and vpbroadcastd are not AVX instructions, but the
aria-avx assembly code contains them. So a kernel panic will occur if
aria-avx runs on a CPU that doesn't support AVX2.
vbroadcastss and vpshufb are used to avoid vpbroadcastb in it.
Unfortunately, this change reduces performance by about 5%.
Also, vpbroadcastd is simply replaced by vmovdqa in it.
Fixes: ba3579e6e4 ("crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher")
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that we use the ECB/CBC macros, none of the asm functions in
blowfish-x86_64 are called indirectly. So we can safely use
SYM_FUNC_START instead of SYM_TYPED_FUNC_START with no functional
change, allowing us to remove an include.
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We can simplify the blowfish-x86_64 glue code by using the preexisting
ECB/CBC helper macros. Additionally, this allows for easier reuse of asm
functions in later x86 implementations of blowfish.
This involves:
1 - Modifying blowfish_dec_blk_4way() to xor outputs when a flag is
passed.
2 - Renaming blowfish_dec_blk_4way() to __blowfish_dec_blk_4way().
3 - Creating two wrapper functions around __blowfish_dec_blk_4way() for
use in the ECB/CBC macros (sketched after this list).
4 - Removing the custom ecb_encrypt() and cbc_encrypt() routines in
favor of macro-based routines.
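A sketch of the wrappers from step 3 (function names are hypothetical):

  static void blowfish_dec_ecb_4way(struct bf_ctx *ctx, u8 *dst,
                                    const u8 *src)
  {
          __blowfish_dec_blk_4way(ctx, dst, src, false); /* plain decrypt */
  }

  static void blowfish_dec_cbc_4way(struct bf_ctx *ctx, u8 *dst,
                                    const u8 *src)
  {
          __blowfish_dec_blk_4way(ctx, dst, src, true);  /* xor for CBC */
  }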
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The blowfish-x86_64 encryption functions have an unused argument. Remove
it.
This involves:
1 - Removing xor_block() macros.
2 - Removing handling of fourth argument from __blowfish_enc_blk{,_4way}()
functions.
3 - Renaming __blowfish_enc_blk{,_4way}() to blowfish_enc_blk{,_4way}().
4 - Removing the blowfish_enc_blk{,_4way}() wrappers from
blowfish_glue.c
5 - Temporarily using SYM_TYPED_FUNC_START for the now indirectly-callable
encryption functions.
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the ecb/cbc macros hold fpu context unnecessarily when using
scalar cipher routines (e.g. when handling odd sizes of blocks per walk).
Change the macros to drop fpu context as soon as the fpu is out of use.
No performance impact found (on Intel Haswell).
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The minimum version of binutils for building the kernel is currently
2.23, which doesn't support GFNI.
So the build of aria-avx512 fails when old binutils is used.
aria-avx512 requires GFNI, so it should not be allowed to build with
old binutils.
Add AS_AVX512 and AS_GFNI to Kconfig to disable building aria-avx512
when old binutils is used.
Fixes: c970d42001 ("crypto: x86/aria - implement aria-avx512")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The minimum version of binutils for building the kernel is currently
2.23, which doesn't support GFNI.
So the build of aria-avx2 fails when old binutils is used.
The code using GFNI is an optional part of aria-avx2, so disable the
GFNI part when old binutils is used.
Fixes: 37d8d3ae7a ("crypto: x86/aria - implement aria-avx2")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The minimum version of binutils for building the kernel is currently
2.23, which doesn't support GFNI.
So the build of aria-avx fails when old binutils is used.
The code using GFNI is an optional part of aria-avx, so disable the
GFNI part when old binutils is used.
In order to check whether the binutils in use supports GFNI or not,
AS_GFNI is added.
Fixes: ba3579e6e4 ("crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The aria-avx512 implementation uses AVX512 and GFNI. It supports
64-way parallel processing, so the byteslicing code is changed to
support 64-way parallelism, and it exports some aria-avx2 functions
such as encrypt() and decrypt() for reuse.
AVX and AVX2 have 16 registers, so they have to use memory to
store/load state because of the lack of registers. But AVX512 supports
32 registers, so it doesn't require store/load in the s-box layer,
which reduces that overhead. The code also becomes much simpler.
Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
ARIA-AVX512(128bit and 256bit)
testing speed of multibuffer ecb(aria) (ecb-aria-avx512) encryption
tcrypt: 1 operation in 1504 cycles (1024 bytes)
tcrypt: 1 operation in 4595 cycles (4096 bytes)
tcrypt: 1 operation in 1763 cycles (1024 bytes)
tcrypt: 1 operation in 5540 cycles (4096 bytes)
testing speed of multibuffer ecb(aria) (ecb-aria-avx512) decryption
tcrypt: 1 operation in 1502 cycles (1024 bytes)
tcrypt: 1 operation in 4615 cycles (4096 bytes)
tcrypt: 1 operation in 1759 cycles (1024 bytes)
tcrypt: 1 operation in 5554 cycles (4096 bytes)
ARIA-AVX2 with GFNI(128bit and 256bit)
testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The aria-avx2 implementation uses AVX2, AES-NI, and GFNI. It supports
32-way parallel processing, so the byteslicing code is changed to
support 32-way parallelism, and it exports some aria-avx functions
such as encrypt() and decrypt() for reuse.
There are two main pieces of logic, the s-box layer and the diffusion
layer. These are the same as in the aria-avx implementation, but some
instructions are exchanged because they don't support 256-bit
registers. In particular, AES-NI doesn't support 256-bit registers,
so aesenclast and aesdeclast are each used twice, like below:
vextracti128 $1, ymm0, xmm6;
vaesenclast xmm7, xmm0, xmm0;
vaesenclast xmm7, xmm6, xmm6;
vinserti128 $1, xmm6, ymm0, ymm0;
Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
ARIA-AVX2 with GFNI(128bit and 256bit)
testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)
ARIA-AVX with GFNI(128bit and 256bit)
testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
tcrypt: 1 operation in 2761 cycles (1024 bytes)
tcrypt: 1 operation in 9390 cycles (4096 bytes)
tcrypt: 1 operation in 3401 cycles (1024 bytes)
tcrypt: 1 operation in 11876 cycles (4096 bytes)
testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
tcrypt: 1 operation in 2735 cycles (1024 bytes)
tcrypt: 1 operation in 9424 cycles (4096 bytes)
tcrypt: 1 operation in 3369 cycles (1024 bytes)
tcrypt: 1 operation in 11954 cycles (4096 bytes)
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The aria-avx assembly code accesses members of struct aria_ctx using
magic-number offsets. If the layout of struct aria_ctx is changed
carelessly, aria-avx will not work.
So we need to ensure that members of aria_ctx are accessed with
correct offset values, not magic numbers.
Add ARIA_CTX_enc_key, ARIA_CTX_dec_key, and ARIA_CTX_rounds to
asm-offsets.c so that correct offset definitions are generated and the
aria-avx assembly code can access members of aria_ctx safely.
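In arch/x86/kernel/asm-offsets.c this looks roughly like (a sketch):

  /* offset definitions consumed by the aria-avx assembly code */
  OFFSET(ARIA_CTX_enc_key, aria_ctx, enc_key);
  OFFSET(ARIA_CTX_dec_key, aria_ctx, dec_key);
  OFFSET(ARIA_CTX_rounds, aria_ctx, rounds);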
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AVX-accelerated ARIA module used a local keystream array, but the
keystream array is too big for the stack. So put the keystream array
into the request ctx instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a comment that explains what ghash_setkey() is doing, as it's hard
to understand otherwise. Also fix a broken hyperlink.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The u128 struct type is going away, so make ghash-clmulni-intel use
le128 instead. Note that the field names a and b are swapped, as they
were backwards with u128. (a is meant to be high-order and b
low-order.)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The key can be unaligned, so use the unaligned memory access helpers.
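Concretely, the loads in ghash_setkey() become something like:

  #include <asm/unaligned.h>

          a = get_unaligned_be64(key);     /* instead of dereferencing */
          b = get_unaligned_be64(key + 8); /* a cast pointer           */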
Fixes: 8ceee72808 ("crypto: ghash-clmulni-intel - use C implementation for setkey()")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Add the call depth tracking mitigation for Retbleed which has been
long in the making. It is a lighter-weight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is align all kernel functions to a 16-byte
boundary and add 16 bytes of padding before each function; objtool
collects all function locations, and when the mitigation gets
applied, it patches in a call accounting thunk which is used to
track the call depth of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific
value for the Return Stack Buffer, the code stuffs that RSB and
avoids its underflow which could otherwise lead to the Intel variant
of Retbleed.
This software-only solution brings a lot of the lost performance
back, as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT
support where present to annotate and track indirect branches using a
hash to validate them
- Other misc fixes and cleanups
* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
x86/paravirt: Use common macro for creating simple asm paravirt functions
x86/paravirt: Remove clobber bitmask from .parainstructions
x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
x86/Kconfig: Enable kernel IBT by default
x86,pm: Force out-of-line memcpy()
objtool: Fix weak hole vs prefix symbol
objtool: Optimize elf_dirty_reloc_sym()
x86/cfi: Add boot time hash randomization
x86/cfi: Boot time selection of CFI scheme
x86/ibt: Implement FineIBT
objtool: Add --cfi to generate the .cfi_sites section
x86: Add prefix symbols for function padding
objtool: Add option to generate prefix symbols
objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
objtool: Slice up elf_create_section_symbol()
kallsyms: Revert "Take callthunks into account"
x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
x86/retpoline: Fix crash printing warning
x86/paravirt: Fix a !PARAVIRT build warning
...
The helper crypto_tfm_ctx is only used by the Crypto API algorithm
code and should really be in algapi.h. However, for historical
reasons many files relied on it being in crypto.h. This patch
changes those files to use algapi.h instead, in preparation for the
move.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
curve25519-x86_64.c fails to build when CONFIG_GCOV_KERNEL is enabled.
The error is "inline assembly requires more registers than available",
thrown from the `fsqr()` function. Therefore, exclude this file from
GCOV profiling until the issue is resolved, thereby allowing
CONFIG_GCOV_PROFILE_ALL to be enabled for x86.
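The exclusion itself is the standard one-line kbuild annotation; a
minimal sketch of the addition to arch/x86/crypto/Makefile (exact
placement assumed):

	# Disable GCOV profiling for this object only: instrumentation
	# exhausts the registers needed by the inline asm in fsqr().
	GCOV_PROFILE_curve25519-x86_64.o := n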
Signed-off-by: Joe Fradley <joefradley@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure.
(Or at least that should be the case. For some reason the CFI checks in
sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
are not always generated with current tip-of-tree clang. Regardless,
this patch is a good idea.)
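A hedged sketch of the shape of the change (the function bodies in
sm4-aesni-avx-asm_64.S are untouched; only the opening macro differs,
with SYM_TYPED_FUNC_START coming from <linux/cfi_types.h>):

	#include <linux/cfi_types.h>

	/* was: SYM_FUNC_START(sm4_aesni_avx_ctr_enc_blk8) */
	SYM_TYPED_FUNC_START(sm4_aesni_avx_ctr_enc_blk8)
		/* ...body unchanged; the macro emits the KCFI type hash
		 * (and ENDBR when IBT is enabled) before the entry... */
		RET
	SYM_FUNC_END(sm4_aesni_avx_ctr_enc_blk8)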
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sm3_transform_avx() is called via indirect function calls. Therefore it
needs to use SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause its
type hash to be emitted when the kernel is built with
CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a CFI failure (if
the compiler didn't happen to optimize out the indirect call).
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha512_transform_ssse3(), sha512_transform_avx(), and
sha512_transform_rorx() are called via indirect function calls.
Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure (if the compiler didn't happen to optimize out the indirect
calls).
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha256_transform_ssse3(), sha256_transform_avx(),
sha256_transform_rorx(), and sha256_ni_transform() are called via
indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure (if the compiler didn't
happen to optimize out the indirect calls).
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha1_transform_ssse3(), sha1_transform_avx(), and sha1_ni_transform()
(but not sha1_transform_avx2()) are called via indirect function calls.
Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure (if the compiler didn't happen to optimize out the indirect
calls).
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the CFI implementation now supports indirect calls to assembly
functions, take advantage of that rather than use wrapper functions.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
aria_aesni_avx_encrypt_16way(), aria_aesni_avx_decrypt_16way(),
aria_aesni_avx_ctr_crypt_16way(), aria_aesni_avx_gfni_encrypt_16way(),
aria_aesni_avx_gfni_decrypt_16way(), and
aria_aesni_avx_gfni_ctr_crypt_16way() are called via indirect function
calls. Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure.
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_aegis128_aesni_enc(), crypto_aegis128_aesni_enc_tail(),
crypto_aegis128_aesni_dec(), and crypto_aegis128_aesni_dec_tail() are
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure (if the compiler didn't
happen to optimize out the indirect calls).
Fixes: ccace936ee ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'v6.1-rc6' into x86/core, to resolve conflicts
Resolve conflicts between these commits in arch/x86/kernel/asm-offsets.c:
# upstream:
debc5a1ec0 ("KVM: x86: use a separate asm-offsets.c file")
# retbleed work in x86/core:
5d8213864a ("x86/retbleed: Add SKL return thunk")
... and these commits in include/linux/bpf.h:
# upstream:
18acb7fac2 ("bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")")
# x86/core commits:
931ab63664 ("x86/ibt: Implement FineIBT")
bea75b3389 ("x86/Kconfig: Introduce function padding")
The latter two modify BPF_DISPATCHER_ATTRIBUTES(), which was removed upstream.
Conflicts:
arch/x86/kernel/asm-offsets.c
include/linux/bpf.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
crypto_tfm::__crt_ctx is not guaranteed to be 16-byte aligned on x86-64.
This causes crashes due to movaps instructions in clmul_polyval_update.
Add logic to align polyval_tfm_ctx to 16 bytes.
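A minimal sketch of the approach (names assumed to mirror
arch/x86/crypto/polyval-clmulni_glue.c): over-allocate the context via
cra_ctxsize and re-align the pointer on every access:

	#define POLYVAL_ALIGN	16
	/* request enough extra space that an aligned struct always fits */
	#define POLYVAL_CTX_SIZE \
		(sizeof(struct polyval_tfm_ctx) + POLYVAL_ALIGN - 1)

	static struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm)
	{
		return PTR_ALIGN(crypto_shash_ctx(tfm), POLYVAL_ALIGN);
	}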
Cc: <stable@vger.kernel.org>
Fixes: 34f7f6c301 ("crypto: x86/polyval - Add PCLMULQDQ accelerated implementation of POLYVAL")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111145.073285765@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.971229477@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
( this code couldn't seem to make up its mind about what alignment it
actually wanted, randomly mixing 8 and 16 bytes )
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.868540856@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.766564176@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.662580589@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.558544791@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.456602381@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.353555711@infradead.org
SYM_FUNC_START*() and friends already imply alignment, remove custom
alignment hacks to make code consistent. This prepares for future
function call ABI changes.
Also, with having pushed the function alignment to 16 bytes, this
custom alignment is completely superfluous.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111144.248229966@infradead.org
With CONFIG_CFI_CLANG, assembly functions indirectly called
from C code must be annotated with type identifiers to pass CFI
checking. Define the __CFI_TYPE helper macro to match the compiler
generated function preamble, and ensure SYM_TYPED_FUNC_START also
emits ENDBR with IBT.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-21-samitolvanen@google.com
The implementation is based on the 32-bit implementation of ARIA.
Also, the aria-avx processing steps are similar to those of
camellia-avx:
1. Byteslice (16-way)
2. Add round key
3. S-box
4. Diffusion layer
Except for the s-box, all steps are the same as in the aria-generic
implementation. The s-box step is very similar to the camellia and
sm4 implementations.
There are two implementations of the s-box step. One uses AES-NI and
affine transformations, the same approach as camellia, sm4, and others.
The other uses GFNI. The GFNI implementation is faster than the AES-NI
one, so it is used if the running CPU supports GFNI.
There are four s-boxes in ARIA, and two of them are the same as AES's
s-boxes.
To calculate the first s-box, it just uses aesenclast and then inverts
shift_row. Nothing more is needed because the first s-box is the same
as the AES encryption s-box.
To calculate the second s-box (the inverse of s1), it just uses
aesdeclast and then inverts shift_row. Nothing more is needed because
the second s-box is the same as the AES decryption s-box.
To calculate the third s-box, it uses aesenclast followed by an affine
transformation that combines the AES inverse affine and ARIA S2.
To calculate the last s-box, it uses aesdeclast followed by an affine
transformation that combines X2 and the AES forward affine.
The optimized third and last s-box logic and the GFNI s-box logic were
implemented by Jussi Kivilinna.
The aria-generic implementation is based on a 32-bit implementation,
not an 8-bit one. The aria-avx diffusion layer implementation is based
on aria-generic, because the 8-bit implementation does not lend itself
to parallelization while the 32-bit one does.
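A hedged sketch of the first s-box trick described above (register
choices and the mask label are illustrative): aesenclast with an
all-zero round key performs AES ShiftRows then SubBytes, so a pshufb
with the inverse ShiftRows mask leaves just the AES S-box:

	vpxor		%xmm15, %xmm15, %xmm15	/* all-zero round key */
	vaesenclast	%xmm15, %xmm0, %xmm0	/* ShiftRows + SubBytes */
	vpshufb	.Linv_shift_row(%rip), %xmm0, %xmm0 /* undo ShiftRows */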
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support",
"hardware acceleration", or "optimized"
Simplify help text descriptions, update references, and ensure that
https references are still valid.
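For illustration only, a menu title following that convention might
read (hypothetical entry):

	config CRYPTO_SHA256_SSSE3
		tristate "Hash functions: SHA-224 and SHA-256 (SSSE3/AVX/AVX2/SHA-NI)"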
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support",
"hardware acceleration", or "optimized"
Simplify help text descriptions, update references, and ensure that
https references are still valid.
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support",
"hardware acceleration", or "optimized"
Simplify help text descriptions, update references, and ensure that
https references are still valid.
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support",
"hardware acceleration", or "optimized"
Simplify help text descriptions, update references, and ensure that
https references are still valid.
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support",
"hardware acceleration", or "optimized"
Simplify help text descriptions, update references, and ensure that
https references are still valid.
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move CPU-specific crypto/Kconfig entries to arch/xxx/crypto/Kconfig
and create a submenu for them under the Crypto API menu.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
x86 optimized crypto modules built as modules rather than built into
the kernel end up as .ko files in the filesystem, e.g., in
/usr/lib/modules. If the filesystem itself is a module, these might
not be available when the crypto API is initialized, resulting in
the generic implementation being used (e.g., sha512_transform rather
than sha512_transform_avx2).
In one test case, CPU utilization in the sha512 function dropped
from 15.34% to 7.18% after forcing loading of the optimized module.
Add module aliases for this x86 optimized crypto module based on CPU
feature bits so udev gets a chance to load them later in the boot
process when the filesystems are all running.
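A hedged sketch of the mechanism (the feature list is illustrative):
an x86cpu device table gives udev a modalias to match on once the CPU
feature bits are known:

	#include <asm/cpu_device_id.h>

	static const struct x86_cpu_id module_cpu_ids[] = {
		X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
		X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
		X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
		{}
	};
	MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);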
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
The variable nbytes is being assigned a value that is never read; it
is re-assigned in the next statement of the while-loop. The
assignment is redundant and can be removed.
Cleans up clang scan-build warnings, e.g.:
arch/x86/crypto/blowfish_glue.c:147:10: warning: Although the value
stored to 'nbytes' is used in the enclosing expression, the value
is never actually read from 'nbytes'
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Based on the normalized pattern:
gpl header start do not alter or remove copyright notices or this
file header this program is free software you can redistribute it
and/or modify it under the terms of the gnu general public license
version 2 only as published by the free software foundation this
program is distributed in the hope that it will be useful but without
any warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
version 2 for more details (a copy is included in the license file
that accompanied this code) you should have received a copy of the
gnu general public license version 2 along with this program if not
see http://www gnu org/licenses please visit http://www xyratex
com/contact if you need additional information or have any questions
gpl header end
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference.
Reviewed-by: Allison Randal <allison@lohutok.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BLAKE2s has no currently known use as an shash. Just remove all of this
unnecessary plumbing. Removing this shash was something we talked about
back when we were making BLAKE2s a built-in, but I simply never got
around to doing it. So this completes that project.
Importantly, this fixes a bug in which the lib code depends on
crypto_simd_disabled_for_test, causing linker errors.
Also add more alignment tests to the selftests and compare SIMD and
non-SIMD compression functions, to make up for what we lose from
testmgr.c.
Reported-by: gaochao <gaochao49@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add hardware accelerated version of POLYVAL for x86-64 CPUs with
PCLMULQDQ support.
This implementation is accelerated using PCLMULQDQ instructions to
perform the finite field computations. For added efficiency, 8 blocks
of the message are processed simultaneously by precomputing the first
8 powers of the key.
Schoolbook multiplication is used instead of Karatsuba multiplication
because it was found to be slightly faster on x86-64 machines.
Montgomery reduction must be used instead of Barrett reduction due to
the difference in modulus between POLYVAL's field and other finite
fields.
More information on POLYVAL can be found in the HCTR2 paper:
"Length-preserving encryption with HCTR2":
https://eprint.iacr.org/2021/1441.pdf
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add hardware accelerated version of XCTR for x86-64 CPUs with AESNI
support.
More information on XCTR can be found in the HCTR2 paper:
"Length-preserving encryption with HCTR2":
https://eprint.iacr.org/2021/1441.pdf
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.
Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names; a rename sketch follows the
examples below.
Example 1: (System.map)
ffffffff832fc78c t init
ffffffff832fc79e t init
ffffffff832fc8f8 t init
Example 2: (initcall_debug log)
calling init+0x0/0x12 @ 1
initcall init+0x0/0x12 returned 0 after 15 usecs
calling init+0x0/0x60 @ 1
initcall init+0x0/0x60 returned 0 after 2 usecs
calling init+0x0/0x9a @ 1
initcall init+0x0/0x9a returned 0 after 74 usecs
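A hedged sketch of the rename for one of the affected modules (the
registration call is illustrative):

	/* was: static int __init init(void) */
	static int __init blowfish_mod_init(void)
	{
		return crypto_register_alg(&bf_alg);
	}
	module_init(blowfish_mod_init);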
Fixes: 64b94ceae8 ("crypto: blowfish - add x86_64 assembly implementation")
Fixes: 676a38046f ("crypto: camellia-x86_64 - module init/exit functions should be static")
Fixes: 0b95ec56ae ("crypto: camellia - add assembler implementation for x86_64")
Fixes: 56d76c96a9 ("crypto: serpent - add AVX2/x86_64 assembler implementation of serpent cipher")
Fixes: b9f535ffe3 ("[CRYPTO] twofish: i586 assembly version")
Fixes: ff0a70fe05 ("crypto: twofish-x86_64-3way - module init/exit functions should be static")
Fixes: 8280daad43 ("crypto: twofish - add 3-way parallel x86_64 assembler implemention")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Cc: Joachim Fritschi <jfritschi@freenet.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Missing Kconfig dependency on arm that leads to boot failure
- x86 SLS fixes
- Reference leak in the stm32 driver
* tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/sm3 - Fixup SLS
crypto: x86/poly1305 - Fixup SLS
crypto: x86/chacha20 - Avoid spurious jumps to other functions
crypto: stm32 - fix reference leak in stm32_crc_remove
crypto: arm/aes-neonbs-cbc - Select generic cbc and aes
This missed the big asm update due to being merged through the crypto
tree.
Fixes: f94909ceb1 ("x86: Prepare asm files for straight-line-speculation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 CET-IBT (Control-Flow-Integrity) support from Peter Zijlstra:
"Add support for Intel CET-IBT, available since Tigerlake (11th gen),
which is a coarse grained, hardware based, forward edge
Control-Flow-Integrity mechanism where any indirect CALL/JMP must
target an ENDBR instruction or suffer #CP.
Additionally, since Alderlake (12th gen)/Sapphire-Rapids, speculation
is limited to 2 instructions (and typically fewer) on branch targets
not starting with ENDBR. CET-IBT also limits speculation of the next
sequential instruction after the indirect CALL/JMP [1].
CET-IBT is fundamentally incompatible with retpolines, but provides,
as described above, speculation limits itself"
[1] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html
* tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
kvm/emulate: Fix SETcc emulation for ENDBR
x86/Kconfig: Only allow CONFIG_X86_KERNEL_IBT with ld.lld >= 14.0.0
x86/Kconfig: Only enable CONFIG_CC_HAS_IBT for clang >= 14.0.0
kbuild: Fixup the IBT kbuild changes
x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy
x86: Remove toolchain check for X32 ABI capability
x86/alternative: Use .ibt_endbr_seal to seal indirect calls
objtool: Find unused ENDBR instructions
objtool: Validate IBT assumptions
objtool: Add IBT/ENDBR decoding
objtool: Read the NOENDBR annotation
x86: Annotate idtentry_df()
x86,objtool: Move the ASM_REACHABLE annotation to objtool.h
x86: Annotate call_on_stack()
objtool: Rework ASM_REACHABLE
x86: Mark __invalid_creds() __noreturn
exit: Mark do_group_exit() __noreturn
x86: Mark stop_this_cpu() __noreturn
objtool: Ignore extra-symbol code
objtool: Rename --duplicate to --lto
...
Being a perl-generated asm file, it was missed by the mass
conversion script.
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_init_x86_64()+0x3a: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_x86_64()+0xf2: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_emit_x86_64()+0x37: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: __poly1305_block()+0x6d: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: __poly1305_init_avx()+0x1e8: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx()+0x18a: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx()+0xaf8: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_emit_avx()+0x99: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx2()+0x18a: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx2()+0x776: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx512()+0x18a: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx512()+0x796: missing int3 after ret
arch/x86/crypto/poly1305-x86_64-cryptogams.o: warning: objtool: poly1305_blocks_avx512()+0x10bd: missing int3 after ret
Fixes: f94909ceb1 ("x86: Prepare asm files for straight-line-speculation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The chacha_Nblock_xor_avx512vl() functions each have their own,
identical, .LdoneN label; however, in one particular spot the {2,4}
variants jump to the 8 version's label instead of their own, resulting
in:
arch/x86/crypto/chacha-x86_64.o: warning: objtool: chacha_2block_xor_avx512vl() falls through to next function chacha_8block_xor_avx512vl()
arch/x86/crypto/chacha-x86_64.o: warning: objtool: chacha_4block_xor_avx512vl() falls through to next function chacha_8block_xor_avx512vl()
Make each function consistently use its own done label.
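A hedged sketch of the fix in the 2-block function (labels follow the
.LdoneN scheme above; the surrounding code is elided):

	/* was "jmp .Ldone8", which lands in the middle of
	 * chacha_8block_xor_avx512vl(); use the function's own label */
	jmp	.Ldone2
	...
.Ldone2:
	vzeroupper
	RET
SYM_FUNC_END(chacha_2block_xor_avx512vl)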
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The code does:
## branch into array
mov jump_table(,%rax,8), %bufp
JMP_NOSPEC bufp
resulting in needing to mark the jump-table entries with ENDBR.
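A hedged sketch (the target label is illustrative): under IBT, every
target reached through the JMP_NOSPEC above must open with an ENDBR:

	mov	jump_table(,%rax,8), %bufp
	JMP_NOSPEC bufp
	...
.Lcrc_1:
	ENDBR	/* mark as a valid indirect-branch target */
	/* per-stride CRC computation follows */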
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.110500806@infradead.org
This is unused after commit 768db5fee3 ("crypto: x86/des - drop CTR mode implementation")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is unused after commit c0a64926c5 ("crypto: x86/blowfish - drop CTR mode implementation")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those
to simplify the definition of function aliases across arch/x86.
For clarity, where there are multiple annotations such as
EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For
example, where a function has a name and an alias which are both
exported, this is organised as:
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
EXPORT_SYMBOL(func)
SYM_FUNC_ALIAS(alias, func)
EXPORT_SYMBOL(alias)
Where there are only aliases and no exports or other annotations, I have
not bothered with line spacing, e.g.
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
SYM_FUNC_ALIAS(alias, func)
The tools/perf/ copies of memcpy_64.S and memset_64.S are updated
likewise to avoid the build system complaining that these are mismatched:
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
| diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
| diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
blake2s_compress_generic is weakly aliased by blake2s_compress. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".
[ 0.000000][ T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[ 0.000000][ T0] Hardware name: MT6873 (DT)
[ 0.000000][ T0] Call trace:
[ 0.000000][ T0] dump_backtrace+0xfc/0x1dc
[ 0.000000][ T0] dump_stack_lvl+0xa8/0x11c
[ 0.000000][ T0] panic+0x194/0x464
[ 0.000000][ T0] __cfi_check_fail+0x54/0x58
[ 0.000000][ T0] __cfi_slowpath_diag+0x354/0x4b0
[ 0.000000][ T0] blake2s_update+0x14c/0x178
[ 0.000000][ T0] _extract_entropy+0xf4/0x29c
[ 0.000000][ T0] crng_initialize_primary+0x24/0x94
[ 0.000000][ T0] rand_initialize+0x2c/0x6c
[ 0.000000][ T0] start_kernel+0x2f8/0x65c
[ 0.000000][ T0] __primary_switched+0xc4/0x7be4
[ 0.000000][ T0] Rebooting in 5 seconds..
Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.
In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.
Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Link: https://github.com/ClangBuiltLinux/linux/issues/1567
Reported-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Use SPDX-License-Identifier instead of a verbose license text and
update external link.
Cc: James Guilford <james.guilford@intel.com>
Cc: Sean Gulley <sean.m.gulley@intel.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds AVX assembly accelerated implementation of SM3 secure
hash algorithm. From the benchmark data, compared to pure software
implementation sm3-generic, the performance increase is up to 38%.
The main algorithm implementation is based on the SM3 AES/BMI2
accelerated work by libgcrypt at:
https://gnupg.org/software/libgcrypt/index.html
Benchmark on an Intel i5-6200U 2.30GHz, comparing the performance of
the two implementations, pure software sm3-generic and sm3-avx
acceleration. The data comes from modes 326 and 422 of tcrypt. The
columns are different per-update lengths. The data is tabulated and
the unit is Mb/s:
update-size | 16 64 256 1024 2048 4096 8192
------------+-------------------------------------------------------
sm3-generic | 105.97 129.60 182.12 189.62 188.06 193.66 194.88
sm3-avx | 119.87 163.05 244.44 260.92 257.60 264.87 265.88
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Get rid of all the .fixup sections, because they generate
misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
LIVEPATCH, as the backtrace misses the function which is being fixed
up.
- Add Straight Line Speculation mitigation support which uses a new
compiler switch -mharden-sls= which sticks an INT3 after a RET or an
indirect branch in order to block speculation after them. Reportedly,
CPUs do speculate behind such insns.
- The usual set of cleanups and improvements
* tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
x86/entry_32: Fix segment exceptions
objtool: Remove .fixup handling
x86: Remove .fixup section
x86/word-at-a-time: Remove .fixup usage
x86/usercopy: Remove .fixup usage
x86/usercopy_32: Simplify __copy_user_intel_nocache()
x86/sgx: Remove .fixup usage
x86/checksum_32: Remove .fixup usage
x86/vmx: Remove .fixup usage
x86/kvm: Remove .fixup usage
x86/segment: Remove .fixup usage
x86/fpu: Remove .fixup usage
x86/xen: Remove .fixup usage
x86/uaccess: Remove .fixup usage
x86/futex: Remove .fixup usage
x86/msr: Remove .fixup usage
x86/extable: Extend extable functionality
x86/entry_32: Remove .fixup usage
x86/entry_64: Remove .fixup usage
x86/copy_mc_64: Remove .fixup usage
...
Pull crypto updates from Herbert Xu:
"Algorithms:
- Drop alignment requirement for data in aesni
- Use synchronous seeding from the /dev/random in DRBG
- Reseed nopr DRBGs every 5 minutes from /dev/random
- Add KDF algorithms currently used by security/DH
- Fix lack of entropy on some AMD CPUs with jitter RNG
Drivers:
- Add support for the D1 variant in sun8i-ce
- Add SEV_INIT_EX support in ccp
- PFVF support for GEN4 host driver in qat
- Compression support for GEN4 devices in qat
- Add cn10k random number generator support"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (145 commits)
crypto: af_alg - rewrite NULL pointer check
lib/mpi: Add the return value check of kcalloc()
crypto: qat - fix definition of ring reset results
crypto: hisilicon - cleanup warning in qm_get_qos_value()
crypto: kdf - select SHA-256 required for self-test
crypto: x86/aesni - don't require alignment of data
crypto: ccp - remove unneeded semicolon
crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
crypto: s390/sha512 - Use macros instead of direct IV numbers
crypto: sparc/sha - remove duplicate hash init function
crypto: powerpc/sha - remove duplicate hash init function
crypto: mips/sha - remove duplicate hash init function
crypto: sha256 - remove duplicate generic hash init function
crypto: jitter - add oversampling of noise source
MAINTAINERS: update SEC2 driver maintainers list
crypto: ux500 - Use platform_get_irq() to get the interrupt
crypto: hisilicon/qm - disable qm clock-gating
crypto: omap-aes - Fix broken pm_runtime_and_get() usage
MAINTAINERS: update caam crypto driver maintainers list
crypto: octeontx2 - prevent underflow in get_cores_bmap()
...
In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, we use weak symbols. And because ARM doesn't need the
generic implementation, we make the generic one default only if an arch
library doesn't need it already, and then have arch libraries that do
need it opt-in. So that the arch libraries can remain tristate rather
than bool, we then split the shash part from the glue code.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
x86 AES-NI routines can deal with unaligned data. The crypto context
(key, IV, etc.) has to be aligned, but we take care of that separately
by copying it onto the stack. We were feeding unaligned data into
crypto routines up until commit 83c83e6588 ("crypto: aesni -
refactor scatterlist processing") switched to use the full
skcipher API which uses cra_alignmask to decide data alignment.
This fixes 21% performance regression in kTLS.
Tested by booting with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
(and running through various kTLS packets).
CC: stable@vger.kernel.org # 5.15+
Fixes: 83c83e6588 ("crypto: aesni - refactor scatterlist processing")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rather than passing all variables to the inline asm as modified
(read-write) operands, pass the ones that are only read as input
operands. This helps with old gcc versions when
alternatives are additionally used, and lets gcc's codegen be a little
bit more efficient. This also syncs up with the latest Vale/EverCrypt
output.
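The distinction, as a minimal generic sketch (not the Vale/EverCrypt
code itself): operands the asm only reads belong in the input list,
where gcc need not assume they are clobbered:

	static inline unsigned long add2(unsigned long a, unsigned long b)
	{
		unsigned long r;

		/* before: a and b as "+r" read-write outputs; now they
		 * are plain "r" inputs, which gcc may keep live */
		asm("leaq (%1,%2), %0" : "=r"(r) : "r"(a), "r"(b));
		return r;
	}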
Reported-by: Mathias Krause <minipli@grsecurity.net>
Cc: Aymeric Fromherz <aymeric.fromherz@inria.fr>
Link: https://lore.kernel.org/wireguard/1554725710.1290070.1639240504281.JavaMail.zimbra@inria.fr/
Link: https://github.com/project-everest/hacl-star/pull/501
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The variable nbytes is being assigned a value that is never read; it
is re-assigned in the following statement. The assignment is redundant
and can be removed.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Replace all ret/retq instructions with RET in preparation of making
RET a macro. Since AS is case insensitive it's a big no-op without
RET defined.
find arch/x86/ -name \*.S | while read file
do
sed -i 's/\<ret[q]*\>/RET/' $file
done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org
Pull crypto updates from Herbert Xu:
"API:
- Delay boot-up self-test for built-in algorithms
Algorithms:
- Remove fallback path on arm64 as SIMD now runs with softirq off
Drivers:
- Add Keem Bay OCS ECC Driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (61 commits)
crypto: testmgr - fix wrong key length for pkcs1pad
crypto: pcrypt - Delay write to padata->info
crypto: ccp - Make use of the helper macro kthread_run()
crypto: sa2ul - Use the defined variable to clean code
crypto: s5p-sss - Add error handling in s5p_aes_probe()
crypto: keembay-ocs-ecc - Add Keem Bay OCS ECC Driver
dt-bindings: crypto: Add Keem Bay ECC bindings
crypto: ecc - Export additional helper functions
crypto: ecc - Move ecc.h to include/crypto/internal
crypto: engine - Add KPP Support to Crypto Engine
crypto: api - Do not create test larvals if manager is disabled
crypto: tcrypt - fix skcipher multi-buffer tests for 1420B blocks
hwrng: s390 - replace snprintf in show functions with sysfs_emit
crypto: octeontx2 - set assoclen in aead_do_fallback()
crypto: ccp - Fix whitespace in sev_cmd_buffer_len()
hwrng: mtk - Force runtime pm ops for sleep ops
crypto: testmgr - Only disable migration in crypto_disable_simd_for_test()
crypto: qat - share adf_enable_pf2vf_comms() from adf_pf2vf_msg.c
crypto: qat - extract send and wait from adf_vf2pf_request_version()
crypto: qat - add VF and PF wrappers to common send function
...
This fixes the following warning:
vmlinux.o: warning: objtool: elf_update: invalid section entry size
The size of the rodata section is 164 bytes; directly using an
entry_size of 164 bytes causes errors in some versions of the gcc
compiler, while using 16 bytes directly causes errors in the clang
compiler. This patch corrects it by padding the rodata size to a
16-byte boundary.
Fixes: a7ee22ee14 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Fixes: 5b2efa2bb8 ("crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Tested-by: Heyuan Shi <heyuan@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sm4_aesni_avx_crypt8() sets up the frame pointer (which includes pushing
RBP) before doing a conditional sibling call to sm4_aesni_avx_crypt4(),
which sets up an additional frame pointer. Things will not go well when
sm4_aesni_avx_crypt4() pops only the innermost single frame pointer and
then tries to return to the outermost frame pointer.
Sibling calls need to occur with an empty stack frame. Do the
conditional sibling call *before* setting up the stack frame.
This fixes the following warning:
arch/x86/crypto/sm4-aesni-avx-asm_64.o: warning: objtool: sm4_aesni_avx_crypt8()+0x8: sibling call from callable instruction with modified stack frame
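A hedged sketch of the reordering (the block-count threshold in %rcx
is assumed from the surrounding code):

SYM_FUNC_START(sm4_aesni_avx_crypt8)
	/* tail call while the stack frame is still empty */
	cmpq	$5, %rcx
	jb	sm4_aesni_avx_crypt4
	FRAME_BEGIN
	/* ... 8-block path ... */
	FRAME_END
	RET
SYM_FUNC_END(sm4_aesni_avx_crypt8)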
Fixes: a7ee22ee14 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the code for xts_crypt(), we check the err value returned by
skcipher_walk_virt() and return from the function if it is non-zero.
However, skcipher_walk_virt() can set walk.nbytes to 0, which would
cause us to call kernel_fpu_begin() and then skip the kernel_fpu_end()
call.
This patch checks for the walk.nbytes value instead, and returns if
walk.nbytes is 0. This prevents us from calling kernel_fpu_begin() in
the first place and also covers the case of having a non zero err value
returned from skcipher_walk_virt().
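A minimal sketch of the resulting control flow (simplified from the
xts_crypt() described above):

	err = skcipher_walk_virt(&walk, req, false);
	if (!walk.nbytes)
		return err;	/* covers errors and the empty walk */

	kernel_fpu_begin();
	/* ... process walk.nbytes bytes per iteration ... */
	kernel_fpu_end();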
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
xts_crypt() code doesn't call kernel_fpu_end() after calling
kernel_fpu_begin() if walk.nbytes is 0. The correct behavior is to
not call kernel_fpu_begin() at all when walk.nbytes is 0.
Reported-by: syzbot+20191dc583eff8602d2d@syzkaller.appspotmail.com
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Like the implementation of AESNI/AVX, this patch adds an accelerated
implementation of AESNI/AVX2. By reusing the AESNI/AVX mode-related
code, the amount of new code is greatly reduced. From the benchmark
data, it can be seen that at a block size of 1024, the performance
achieved by AVX2 is about 70% higher than with AVX acceleration, and
7.7 times that of the pure software implementation sm4-generic.
The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni
This optimization supports the four modes of SM4: ECB, CBC, CFB,
and CTR. Since CBC and CFB do not support multiple block parallel
encryption, the optimization effect is not obvious.
Benchmark on an Intel i5-6200U 2.30GHz, comparing the performance of
three implementations: pure software sm4-generic, aesni/avx
acceleration, and aesni/avx2 acceleration. The data comes from modes
218 and 518 of tcrypt. The columns are blocks of different lengths.
The data is tabulated and the unit is Mb/s:
block-size | 16 64 128 256 1024 1420 4096
sm4-generic
ECB enc | 60.94 70.41 72.27 73.02 73.87 73.58 73.59
ECB dec | 61.87 70.53 72.15 73.09 73.89 73.92 73.86
CBC enc | 56.71 66.31 68.05 69.84 70.02 70.12 70.24
CBC dec | 54.54 65.91 68.22 69.51 70.63 70.79 70.82
CFB enc | 57.21 67.24 69.10 70.25 70.73 70.52 71.42
CFB dec | 57.22 64.74 66.31 67.24 67.40 67.64 67.58
CTR enc | 59.47 68.64 69.91 71.02 71.86 71.61 71.95
CTR dec | 59.94 68.77 69.95 71.00 71.84 71.55 71.95
sm4-aesni-avx
ECB enc | 44.95 177.35 292.06 316.98 339.48 322.27 330.59
ECB dec | 45.28 178.66 292.31 317.52 339.59 322.52 331.16
CBC enc | 57.75 67.68 69.72 70.60 71.48 71.63 71.74
CBC dec | 44.32 176.83 284.32 307.24 328.61 312.61 325.82
CFB enc | 57.81 67.64 69.63 70.55 71.40 71.35 71.70
CFB dec | 43.14 167.78 282.03 307.20 328.35 318.24 325.95
CTR enc | 42.35 163.32 279.11 302.93 320.86 310.56 317.93
CTR dec | 42.39 162.81 278.49 302.37 321.11 310.33 318.37
sm4-aesni-avx2
ECB enc | 45.19 177.41 292.42 316.12 339.90 322.53 330.54
ECB dec | 44.83 178.90 291.45 317.31 339.85 322.55 331.07
CBC enc | 57.66 67.62 69.73 70.55 71.58 71.66 71.77
CBC dec | 44.34 176.86 286.10 501.68 559.58 483.87 527.46
CFB enc | 57.43 67.60 69.61 70.52 71.43 71.28 71.65
CFB dec | 43.12 167.75 268.09 499.33 558.35 490.36 524.73
CTR enc | 42.42 163.39 256.17 493.95 552.45 481.58 517.19
CTR dec | 42.49 163.11 256.36 493.34 552.62 481.49 516.83
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Export the reusable functions in the SM4 AESNI/AVX implementation,
mainly public functions, which are used to develop the SM4 AESNI/AVX2
implementation, and eliminate unnecessary duplication of code.
At the same time, minor fixes were added to make the public functions
more general.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4
block cipher. Through two affine transforms, we can use the AES S-Box
to simulate the SM4 S-Box to achieve the effect of instruction
acceleration.
The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni
This optimization supports the four modes of SM4: ECB, CBC, CFB, and
CTR. Since CBC and CFB do not support multiple block parallel
encryption, the optimization effect is not obvious.
Benchmark on an Intel Xeon Cascadelake; the data comes from modes 218
and 518 of tcrypt. The columns are blocks of different lengths. The
data is tabulated and the unit is Mb/s:
block-size  | 16          64     128     256    1024    1420    4096
sm4-generic
ECB enc | 40.99 46.50 48.05 48.41 49.20 49.25 49.28
ECB dec | 41.07 46.99 48.15 48.67 49.20 49.25 49.29
CBC enc | 37.71 45.28 46.77 47.60 48.32 48.37 48.40
CBC dec | 36.48 44.82 46.43 47.45 48.23 48.30 48.36
CFB enc | 37.94 44.84 46.12 46.94 47.57 47.46 47.68
CFB dec | 37.50 42.84 43.74 44.37 44.85 44.80 44.96
CTR enc | 39.20 45.63 46.75 47.49 48.09 47.85 48.08
CTR dec | 39.64 45.70 46.72 47.47 47.98 47.88 48.06
sm4-aesni-avx
ECB enc | 33.75 134.47 221.64 243.43 264.05 251.58 258.13
ECB dec | 34.02 134.92 223.11 245.14 264.12 251.04 258.33
CBC enc | 38.85 46.18 47.67 48.34 49.00 48.96 49.14
CBC dec | 33.54 131.29 223.88 245.27 265.50 252.41 263.78
CFB enc | 38.70 46.10 47.58 48.29 49.01 48.94 49.19
CFB dec | 32.79 128.40 223.23 244.87 265.77 253.31 262.79
CTR enc | 32.58 122.23 220.29 241.16 259.57 248.32 256.69
CTR dec | 32.81 122.47 218.99 241.54 258.42 248.58 256.61
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The updated XTS code fails to check the return code of skcipher_walk_virt,
which may lead to skcipher_walk_abort() or skcipher_walk_done() being called
while the walk argument is in an inconsistent state.
So check the return value after each such call, and bail on errors.
Fixes: 2481104fe9 ("crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Reported-by: syzbot <syzbot+5d1bad8042a8f0e8117a@syzkaller.appspotmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In curve25519_mod_init(), curve25519_alg is registered only when the
CPU has both X86_FEATURE_BMI2 and X86_FEATURE_ADX. But
curve25519_mod_exit() still checks (X86_FEATURE_BMI2 ||
X86_FEATURE_ADX) before unregistering. On a CPU that supports only one
of the two features, this triggers a BUG_ON in crypto_unregister_alg(),
since alg->cra_refcnt is 0 for an algorithm that was never registered.
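In code, the fix is simply to make the exit path mirror the init path
(sketch; module boilerplate and the definition of curve25519_alg
elided):

static int __init curve25519_mod_init(void)
{
        if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX))
                return crypto_register_alg(&curve25519_alg);
        return 0;
}

static void __exit curve25519_mod_exit(void)
{
        /* was "||", which could try to unregister an algorithm that
         * was never registered and hit the BUG_ON */
        if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX))
                crypto_unregister_alg(&curve25519_alg);
}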
Fixes: 07b586fe06 ("crypto: x86/curve25519 - replace with formally verified implementation")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- Standardize the crypto asm code so that it looks like compiler-generated
code to objtool - so that it can understand it. This enables unwinding
from crypto asm code - and also fixes the last known remaining objtool
warnings for LTO and more.
- x86 decoder fixes: clean up and fix the decoder, and also extend it a bit
- Misc fixes and cleanups
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Standardize the crypto asm code so that it looks like compiler-
generated code to objtool - so that it can understand it. This
enables unwinding from crypto asm code - and also fixes the last
known remaining objtool warnings for LTO and more.
- x86 decoder fixes: clean up and fix the decoder, and also extend it a
bit
- Misc fixes and cleanups
* tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/crypto: Enable objtool in crypto code
x86/crypto/sha512-ssse3: Standardize stack alignment prologue
x86/crypto/sha512-avx2: Standardize stack alignment prologue
x86/crypto/sha512-avx: Standardize stack alignment prologue
x86/crypto/sha256-avx2: Standardize stack alignment prologue
x86/crypto/sha1_avx2: Standardize stack alignment prologue
x86/crypto/sha_ni: Standardize stack alignment prologue
x86/crypto/crc32c-pcl-intel: Standardize jump table
x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
x86/crypto/aesni-intel_avx: Fix register usage comments
x86/crypto/aesni-intel_avx: Remove unused macros
objtool: Support asm jump tables
objtool: Parse options from OBJTOOL_ARGS
objtool: Collate parse_options() users
objtool: Add --backup
objtool,x86: More ModRM sugar
objtool,x86: Rewrite ADD/SUB/AND
objtool,x86: Support %riz encodings
objtool,x86: Simplify register decode
...
Merge tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 cleanups from Borislav Petkov:
"Trivial cleanups and fixes all over the place"
* tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Remove me from IDE/ATAPI section
x86/pat: Do not compile stubbed functions when X86_PAT is off
x86/asm: Ensure asm/proto.h can be included stand-alone
x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files
x86/msr: Make locally used functions static
x86/cacheinfo: Remove unneeded dead-store initialization
x86/process/64: Move cpu_current_top_of_stack out of TSS
tools/turbostat: Unmark non-kernel-doc comment
x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
x86/fpu/math-emu: Fix function cast warning
x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes
x86: Fix various typos in comments, take #2
x86: Remove unusual Unicode characters from comments
x86/kaslr: Return boolean values from a function returning bool
x86: Fix various typos in comments
x86/setup: Remove unused RESERVE_BRK_ARRAY()
stacktrace: Move documentation for arch_stack_walk_reliable() to header
x86: Remove duplicate TSC DEADLINE MSR definitions
Now that all the stack alignment prologues have been cleaned up in the
crypto code, enable objtool. Among other benefits, this will allow ORC
unwinding to work.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/fc2a1918c50e33e46ef0e9a5de02743f2f6e3639.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
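The shape being adopted is what compilers emit for a function with an
over-aligned stack local. A small C illustration (a hypothetical
function, not taken from the patch); compiling it at -O2 shows the
prologue objtool expects:

#include <stdint.h>
#include <string.h>

/* gcc -O2 emits approximately:
 *      push    %rbp            # save old stack pointer
 *      mov     %rsp, %rbp
 *      and     $-64, %rsp      # realign
 *      sub     $..., %rsp      # allocate the frame
 *      ...
 *      mov     %rbp, %rsp      # epilogue: restore and return
 *      pop     %rbp
 * The crypto .S files were changed to this pattern instead of stashing
 * the old %rsp in an arbitrary scratch register. */
uint64_t sum_aligned(const uint8_t *p, size_t n)
{
        uint8_t buf[128] __attribute__((aligned(64)));
        uint64_t s = 0;

        memset(buf, 0, sizeof(buf));
        memcpy(buf, p, n < sizeof(buf) ? n : sizeof(buf));
        for (size_t i = 0; i < sizeof(buf); i++)
                s += buf[i];
        return s;
}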
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/6ecaaac9f3828fbb903513bf90c34a08380a8e35.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/b1a7b29fcfc65d60a3b6e77ef75f4762a5b8488d.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/d36e9ea1c819d87fa89b3df3fa83e2a1ede18146.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/8048e7444c49a8137f05265262b83dc50f8fb7f3.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/fdaaf8670ed1f52f55ba9a6bbac98c1afddc1af6.1614182415.git.jpoimboe@redhat.com
Use a more standard prologue for saving the stack pointer before
realigning the stack.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/5033e1a79867dff1b18e1b4d0783c38897d3f223.1614182415.git.jpoimboe@redhat.com
Simplify the jump table code so that it resembles a compiler-generated
table.
This enables ORC unwinding by allowing objtool to follow all the
potential code paths.
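A dense C switch produces the table shape objtool knows how to follow
(illustrative only; the patch itself rewrites crc32c-pcl-intel's
hand-written table to match):

/* At -O2 a dense switch typically becomes a .rodata array of code
 * addresses plus an indirect jump such as:
 *      jmp     *table(, %rax, 8)
 * objtool can enumerate the table, so every destination gets ORC data. */
int dispatch(unsigned int op, int a, int b)
{
        switch (op) {
        case 0: return a + b;
        case 1: return a - b;
        case 2: return a * b;
        case 3: return a & b;
        case 4: return a ^ b;
        default: return 0;
        }
}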
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/5357a039def90b8ef6b5874ef12cda008ecf18ba.1614182415.git.jpoimboe@redhat.com
A conditional stack allocation violates traditional unwinding
requirements: a single instruction can then be reached with two
different stack layouts.
There's no benefit in allocating the stack buffer conditionally. Just
do it unconditionally.
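In C terms the problem and the fix look roughly like this (hypothetical
sizes; the patch itself touches the camellia-aesni-avx2 asm):

#include <stddef.h>
#include <string.h>

/* Before, the asm made room for the scratch area only on the
 * many-blocks path, so instructions past the branch could be reached
 * with two different stack layouts. Declaring the buffer
 * unconditionally fixes the layout once, at the cost of a slightly
 * larger constant in the "sub $N, %rsp". */
void crypt_blocks(unsigned char *dst, const unsigned char *src,
                  size_t nblocks)
{
        unsigned char scratch[32 * 16]; /* hypothetical 32-block buffer */

        if (nblocks >= 32)
                memcpy(scratch, src, sizeof(scratch)); /* bulk path only */

        /* ... every path now shares one fixed stack layout ... */
        memcpy(dst, src, nblocks * 16);
}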
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/85ac96613ee5784b6239c18d3f68b1f3c509caa3.1614182415.git.jpoimboe@redhat.com
Use RBP instead of R14 for saving the old stack pointer before
realignment. This resembles what compilers normally do.
This enables ORC unwinding by allowing objtool to understand the stack
realignment.
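Why RBP in particular: the saved-RBP chain is the convention unwinders
already understand, while a scratch register like R14 has no such
meaning. A toy frame walk (assumes -fno-omit-frame-pointer; an
illustration, not kernel code):

#include <stdio.h>

/* With frame pointers, the word at [rbp] is the caller's saved rbp,
 * forming a linked list of frames that any unwinder can follow. */
static __attribute__((noinline)) void show_frames(void)
{
        void *fp = __builtin_frame_address(0);

        for (int i = 0; fp && i < 3; i++) {
                printf("frame %d at %p\n", i, fp);
                fp = *(void **)fp;      /* follow the saved %rbp */
        }
}

static __attribute__((noinline)) void middle(void)
{
        show_frames();
}

int main(void)
{
        middle();
        return 0;
}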
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/02d00a0903a0959f4787e186e2a07d271e1f63d4.1614182415.git.jpoimboe@redhat.com