Commit Graph

18752 Commits

Author SHA1 Message Date
Linus Torvalds
43972cf2de - Do not parse the confidential computing blob on non-AMD hardware as it
leads to an EFI config table ending up unmapped
 
 - Use the correct segment selector in the 32-bit version of getcpu() in
   the vDSO
 
 - Make sure vDSO and VVAR regions are placed in the 47-bit VA range even
   on 5-level paging systems
 
 - Add models 0x90-0x91 to the range of AMD Zenbleed-affected CPUs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTXVWMACgkQEsHwGGHe
 VUrrpg//f0BFZV4dbANuYAX47A/hhrm8n5KwV9Mjq8JDeG/TumUN/vAK9+n8outO
 pLpPOoXRSCSPybXBSmI1ryVwLaDq6BJy0fYyq4kHOvBFXYKodfTNdO8Oec+le3V/
 DBGnY6TQ5x1PgWuAUE9WoeGQuTN1d3Dxm0V/pG2LU/qW3mr+GlBXsSKUVjwp9/OW
 JPNw1XQDFVuxT+heRLxQONPkdTxkwwKxZBDRwnSSj7chbJ/jSbnX9a5xinwBvMZY
 Q6nelt/AMwAgfO2oz38y1tnR0bfd8fM08SUUgoajWWghKemZK5uNAgZZJd4tPNq6
 lBonNc6jF9UGohzgQbNOAjfmDomtyxe4JYGl2SnyWcfwFzGANpcSKSbM9H91LMaI
 Sh7hKykZQNGmDctbLsxvlPFgIoxWFLK1DfNeM2yQRzlayqDF8CRlDIcWXHEtYOXf
 AOZFMJtOJBZjLbSeWIBW278atNG3ONWEb75kqiKRYxsX9QzvwAZYCYZH+NGe6sLO
 kkCm7g3NcAItf1qrPNGZd/k9fA/+3RXEWNdjYsmegMvU8vQBPY0w4NVwGtU9LCkq
 jspQxnNlVy1ayqr/TQXRhzn5+d7CQ1PLNwVsGh4S+diCEFu2aEdOhrBhS1uaujLv
 5iLpmyyh0yO9aHebK/u4cciAvwVB7WuzQqWYIGzdsolrc+lbY54=
 =zPFr
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.5_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Do not parse the confidential computing blob on non-AMD hardware as
   it leads to an EFI config table ending up unmapped

 - Use the correct segment selector in the 32-bit version of getcpu() in
   the vDSO

 - Make sure vDSO and VVAR regions are placed in the 47-bit VA range
   even on 5-level paging systems

 - Add models 0x90-0x91 to the range of AMD Zenbleed-affected CPUs

* tag 'x86_urgent_for_v6.5_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/amd: Enable Zenbleed fix for AMD Custom APU 0405
  x86/mm: Fix VDSO and VVAR placement on 5-level paging machines
  x86/linkage: Fix typo of BUILD_VDSO in asm/linkage.h
  x86/vdso: Choose the right GDT_ENTRY_CPUNODE for 32-bit getcpu() on 64-bit kernel
  x86/sev: Do not try to parse for the CC blob on non-AMD hardware
2023-08-12 08:47:01 -07:00
Linus Torvalds
272b86ba9d - A first series of cleanups/unifications and documentation improvements
to the SRSO and GDS mitigations code which got postponed to after the
   embargo date
 
 - Fix the SRSO aliasing addresses assertion so that the LLVM linker can
   parse it too
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTXULMACgkQEsHwGGHe
 VUqAKxAAlGm4YodsbEX+SQTVDLsECBw9lpJ3wU8lD0fBmIAmGiWFy9jqn4FYyAi8
 YlKNI8Ru/mME5cM5BBV5ZZ5PNIZPD8OmqbbaHkSjcjLJfdNg03D/RDUPELSyY7T5
 HviLaknJFjn6HwvLkSLdBIsAAkVqF5lYitP8x5OgBp6Lc9PO7/xWjSZoVhOUkrFe
 Pjc8sT0DgtC2PWkxbB66/uxhdUnFqpioWL+06akeSWuHweIQDQ+P7sxfCB8NZO0u
 YV4hmd/JfjoVc0DtvS3HOm14Ruhmru/oiKg/XcJO7uGPBKxuVK8xsHqeUyGMdTeS
 +sNXA0XjbvaUV9IihuvVHrX8nMirkW7u0NWMNlJCO9QF5eJPfc0I07VLpKJGEsph
 wKSNCN7F64GfjkRGl4jPo26tX+fXGMm32+gGgpqsCYnTBu+nrqprXck4DJQZBNl4
 6Le7sfUky2PSllbFh5MnKaylfeWKcqlOzfko7tjWtFm7raOHEGy31m92igKms0hM
 IlCyEe6mJUcMJ60QzYwaB9FJ+50jIZXeckRnud/mExgaAGQqe7RcVbwurQCCDtYq
 vd4sb9TV9vU07Uqz1NBxmzl6GbYM1ORV9hnlpj/eDnh/ArBzj44UwiGB1bVQ31Iy
 OMBJZ+RQtspa12xq7Zu++mjc+9XTeX9JK81PYg6UU+5ogQapdx4=
 =P0vQ
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_v6.5_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mitigation fixes from Borislav Petkov:
 "The first set of fallout fixes after the embargo madness. There will
  be another set next week too.

   - A first series of cleanups/unifications and documentation
     improvements to the SRSO and GDS mitigations code which got
     postponed to after the embargo date

   - Fix the SRSO aliasing addresses assertion so that the LLVM linker
     can parse it too"

* tag 'x86_bugs_for_v6.5_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  driver core: cpu: Fix the fallback cpu_show_gds() name
  x86: Move gds_ucode_mitigated() declaration to header
  x86/speculation: Add cpu_show_gds() prototype
  driver core: cpu: Make cpu_show_not_affected() static
  x86/srso: Fix build breakage with the LLVM linker
  Documentation/srso: Document IBPB aspect and fix formatting
  driver core: cpu: Unify redundant silly stubs
  Documentation/hw-vuln: Unify filename specification in index
2023-08-12 08:34:20 -07:00
Cristian Ciocaltea
6dbef74aeb x86/cpu/amd: Enable Zenbleed fix for AMD Custom APU 0405
Commit

  522b1d6921 ("x86/cpu/amd: Add a Zenbleed fix")

provided a fix for the Zen2 VZEROUPPER data corruption bug affecting
a range of CPU models, but the AMD Custom APU 0405 found on SteamDeck
was not listed, although it is clearly affected by the vulnerability.

Add this CPU variant to the Zenbleed erratum list, in order to
unconditionally enable the fallback fix until a proper microcode update
is available.

Fixes: 522b1d6921 ("x86/cpu/amd: Add a Zenbleed fix")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230811203705.1699914-1-cristian.ciocaltea@collabora.com
2023-08-11 22:52:29 +02:00
Linus Torvalds
29d99aae13 ACPI fixes for 6.5-rc6
Rework the handling of interrupt overrides on AMD Zen-based machines to
 avoid recently introduced regressions (Hans de Goede).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmTWgQMSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxaggP/iYft7/0saIW/gQEWUngTdwwYpDdL5c6
 cnym9fO7dNvZ7svGgv1oQj3Nge79hWWWKTOVB5F3ZKPEmYHlhAekSYb+SI+/YUcp
 fpIyVKO/y8nO+Cnw5t9EoTteJ65WJaadBNTuRUw28j4ETs/4AewWFFXSogzO2IIU
 x4UoWPHEXhhjYTnYL+ivRAKJG5iTIbfQS16muO7p2z630FneUYGDvUxx0XuIvbf0
 dXVjKi/rxgfozNyL8HmyeDE3LSjG1b7h7h6MS/ZrdM2LoE4V2pp/Z7qtxs53sIeG
 Y+dsBZEp8G/YReuMPush31Sgepf6l4OQpjKkq54vRMD6su36IMojMYXayLWHmWK7
 RP7n896takZ80tC7GH/pDJUUCoiyvE08/RUxnwg8QPCj5PCMmp2w0/DhhYi1FeFD
 4QxYx7UBYGGFy94Bx97LvBVZXM484F8CJnQreXm7csniy0BME5MeV6S6YNjXsXJH
 M+O0uwdlk24g4SH7E21jUQl6Ch+YtbcH0LRtivGnC6YkcItWkcNRccph3H27wNHH
 jMR43tJsWp6yjXeyWw+vrML9VZ0aK3yyUKPAqcXhC4aIp8DSgVaVuLdRwXHmBqK5
 Phx5+7zuxJ4UpOAQb1RtFOCgRAtfNN57QaXJ0hPuOuzopVOWPHjmcuw4fUdW5YCG
 esjxuO9VAyQZ
 =4kD6
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "Rework the handling of interrupt overrides on AMD Zen-based machines
  to avoid recently introduced regressions (Hans de Goede).

  Note that this is intended as a short-term mitigation for 6.5 and the
  long-term approach will be to attempt to use the configuration left by
  the BIOS, but it requires more investigation"

* tag 'acpi-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: resource: Add IRQ override quirk for PCSpecialist Elimina Pro 16 M
  ACPI: resource: Honor MADT INT_SRC_OVR settings for IRQ1 on AMD Zen
  ACPI: resource: Always use MADT override IRQ settings for all legacy non i8042 IRQs
  ACPI: resource: revert "Remove "Zen" specific match and quirks"
2023-08-11 12:30:00 -07:00
Nick Desaulniers
cbe8ded48b x86/srso: Fix build breakage with the LLVM linker
The assertion added to verify the difference in bits set of the
addresses of srso_untrain_ret_alias() and srso_safe_ret_alias() would fail
to link in LLVM's ld.lld linker with the following error:

  ld.lld: error: ./arch/x86/kernel/vmlinux.lds:210: at least one side of
  the expression must be absolute
  ld.lld: error: ./arch/x86/kernel/vmlinux.lds:211: at least one side of
  the expression must be absolute

Use ABSOLUTE to evaluate the expression referring to at least one of the
symbols so that LLD can evaluate the linker script.

Also, add linker version info to the comment about XOR being unsupported
in either ld.bfd or ld.lld until somewhat recently.

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Closes: https://lore.kernel.org/llvm/CA+G9fYsdUeNu-gwbs0+T6XHi4hYYk=Y9725-wFhZ7gJMspLDRA@mail.gmail.com/
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Daniel Kolesa <daniel@octaforge.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Sven Volkinsfeld <thyrc@gmx.net>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://github.com/ClangBuiltLinux/linux/issues/1907
Link: https://lore.kernel.org/r/20230809-gds-v1-1-eaac90b0cbcc@google.com
2023-08-10 11:03:12 +02:00
Hans de Goede
c6a1fd910d ACPI: resource: Honor MADT INT_SRC_OVR settings for IRQ1 on AMD Zen
On AMD Zen acpi_dev_irq_override() by default prefers the DSDT IRQ 1
settings over the MADT settings.

This causes the keyboard to malfunction on some laptop models
(see Links), all models from the Links have an INT_SRC_OVR MADT entry
for IRQ 1.

Fixes: a9c4a912b7 ("ACPI: resource: Remove "Zen" specific match and quirks")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217336
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217394
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217406
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-09 21:18:46 +02:00
Borislav Petkov (AMD)
77245f1c3c x86/CPU/AMD: Do not leak quotient data after a division by 0
Under certain circumstances, an integer division by 0 which faults, can
leave stale quotient data from a previous division operation on Zen1
microarchitectures.

Do a dummy division 0/1 before returning from the #DE exception handler
in order to avoid any leaks of potentially sensitive data.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-09 07:55:00 -07:00
Linus Torvalds
64094e7e31 Mitigate Gather Data Sampling issue
* Add Base GDS mitigation
  * Support GDS_NO under KVM
  * Fix a documentation typo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTJh5YACgkQaDWVMHDJ
 krAzAw/8DzjhAYEa7a1AodCBMNg8uNOPnLNoRPPNhaN5Iw6W3zXYDBDKT9PyjAIx
 RoIM0aHx/oY9nCpK441o25oCWAAyzk6E5/+q9hMa7B4aHUGKqiDUC6L9dC8UiiSN
 yvoBv4g7F81QnmyazwYI64S6vnbr4Cqe7K/mvVqQ/vbJiugD25zY8mflRV9YAuMk
 Oe7Ff/mCA+I/kqyKhJE3cf3qNhZ61FsFI886fOSvIE7g4THKqo5eGPpIQxR4mXiU
 Ri2JWffTaeHr2m0sAfFeLH4VTZxfAgBkNQUEWeG6f2kDGTEKibXFRsU4+zxjn3gl
 xug+9jfnKN1ceKyNlVeJJZKAfr2TiyUtrlSE5d+subIRKKBaAGgnCQDasaFAluzd
 aZkOYz30PCebhN+KTrR84FySHCaxnev04jqdtVGAQEDbTvyNagFUdZFGhWijJShV
 l2l4A0gFSYJmPfPVuuAwOJnnZtA1sRH9oz/Sny3+z9BKloZh+Nc/+Cu9zC8SLjaU
 BF3Qv2gU9HKTJ+MSy2JrGS52cONfpO5ngFHoOMilZ1KBHrfSb1eiy32PDT+vK60Y
 PFEmI8SWl7bmrO1snVUCfGaHBsHJSu5KMqwBGmM4xSRzJpyvRe493xC7+nFvqNLY
 vFOFc4jGeusOXgiLPpfGduppkTGcM7sy75UMLwTSLcQbDK99mus=
 =ZAPY
 -----END PGP SIGNATURE-----

Merge tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/gds fixes from Dave Hansen:
 "Mitigate Gather Data Sampling issue:

   - Add Base GDS mitigation

   - Support GDS_NO under KVM

   - Fix a documentation typo"

* tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Fix backwards on/off logic about YMM support
  KVM: Add GDS_NO support to KVM
  x86/speculation: Add Kconfig option for GDS
  x86/speculation: Add force option to GDS mitigation
  x86/speculation: Add Gather Data Sampling mitigation
2023-08-07 17:03:54 -07:00
Linus Torvalds
138bcddb86 Add a mitigation for the speculative RAS (Return Address Stack) overflow
vulnerability on AMD processors. In short, this is yet another issue
 where userspace poisons a microarchitectural structure which can then be
 used to leak privileged information through a side channel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTQs1gACgkQEsHwGGHe
 VUo1UA/8C34PwJveZDcerdkaxSF+WKx7AjOI/L2ws1qn9YVFA3ItFMgVuFTrlY6c
 1eYKYB3FS9fVN3KzGOXGyhho6seHqfY0+8cyYupR+PVLn9rSy7GqHaIMr37FdQ2z
 yb9xu26v+gsvuPEApazS6MxijYS98u71rHhmg97qsHCnUiMJ01+TaGucntukNJv8
 FfwjZJvgeUiBPQ/6IeA/O0413tPPJ9weawPyW+sV1w7NlXjaUVkNXwiq/Xxbt9uI
 sWwMBjFHpSnhBRaDK8W5Blee/ZfsS6qhJ4jyEKUlGtsElMnZLPHbnrbpxxqA9gyE
 K+3ZhoHf/W1hhvcZcALNoUHLx0CvVekn0o41urAhPfUutLIiwLQWVbApmuW80fgC
 DhPedEFu7Wp6Okj5+Bqi/XOsOOWN2WRDSzdAq10o1C+e+fzmkr6y4E6gskfz1zXU
 ssD9S4+uAJ5bccS5lck4zLffsaA03nAYTlvl1KRP4pOz5G9ln6eyO20ar1WwfGAV
 o5ZsTJVGQMyVA49QFkksj+kOI3chkmDswPYyGn2y8OfqYXU4Ip4eN+VkjorIAo10
 zIec3Z0bCGZ9UUMylUmdtH3KAm8q0wVNoFrUkMEmO8j6nn7ew2BhwLMn4uu+nOnw
 lX2AG6PNhRLVDVaNgDsWMwejaDsitQPoWRuCIAZ0kQhbeYuwfpM=
 =73JY
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/srso fixes from Borislav Petkov:
 "Add a mitigation for the speculative RAS (Return Address Stack)
  overflow vulnerability on AMD processors.

  In short, this is yet another issue where userspace poisons a
  microarchitectural structure which can then be used to leak privileged
  information through a side channel"

* tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/srso: Tie SBPB bit setting to microcode patch detection
  x86/srso: Add a forgotten NOENDBR annotation
  x86/srso: Fix return thunks in generated code
  x86/srso: Add IBPB on VMEXIT
  x86/srso: Add IBPB
  x86/srso: Add SRSO_NO support
  x86/srso: Add IBPB_BRTYPE support
  x86/srso: Add a Speculative RAS Overflow mitigation
  x86/bugs: Increase the x86 bugs vector size to two u32s
2023-08-07 16:35:44 -07:00
Borislav Petkov (AMD)
5a15d83488 x86/srso: Tie SBPB bit setting to microcode patch detection
The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode
patch has been applied so set X86_FEATURE_SBPB only then. Otherwise,
guests would attempt to set that bit and #GP on the MSR write.

While at it, make SMT detection more robust as some guests - depending
on how and what CPUID leafs their report - lead to cpu_smt_control
getting set to CPU_SMT_NOT_SUPPORTED but SRSO_NO should be set for any
guest incarnation where one simply cannot do SMT, for whatever reason.

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-08-07 10:53:08 +02:00
Linus Torvalds
d410b62e45 - AMD's automatic IBRS doesn't enable cross-thread branch target
injection protection (STIBP) for user processes. Enable STIBP on such
   systems.
 
 - Do not delete (but put the ref instead) of AMD MCE error thresholding
   sysfs kobjects when destroying them in order not to delete the kernfs
   pointer prematurely
 
 - Restore annotation in ret_from_fork_asm() in order to fix kthread
   stack unwinding from being marked as unreliable and thus breaking
   livepatching
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTGLFUACgkQEsHwGGHe
 VUpgDRAAm3uatlqiY2M1Gu9BMMmchTkjr2Fq06TmDQ53SGc6FqLKicltBCZsxbrm
 kOrAtmw0jYPTTzqiDy8llyAt+1BC200nAKWTABKhKBrgUiD2crIIC8Rr6YycZ4tm
 ueepk4CCxzh+ffcvGau2OuH05SHwQLeTNPr5Rgk4BlVPToaMdXAJChZA/JXsj4gR
 3EiWV5/UnC6znzmQKN5PG+BmDrrOlsyDCJXYBVH+vQFa0Udit/rx0YZQ5ZOcD8Tn
 D7Ix10pGQV/ESOsD+UFq/u1LPZvJSD2GDsMpWitrw65wnC2TF/XTxBc+pK0mbyKL
 3XmH2NPlp1igv3EZ3hltXUcw6Rv8u3hX7VE5S+eQ0FRXJGjxSwoLC9ndw28oPful
 FlMjrmI9SE5ojssZ6evLN0/dPXHEz8HvRgw5UTy5I+RqpelMWtML5iDIipaMwoUT
 yB9JNIsufY1CM1IHiZBVLZkqIl8X8RtllbJR/RWGfYEHuiXworumgMDp9MsEFY2C
 MHr9+/j9E1vU71CvjIYAaJCfWU1Ce+lYCUZ+1SxyDDe3watJKlduuAXbalmyYe0w
 ExE5Wt+3ghOzwgj4OtofUivXLWMXr4IgpKliO5TrZ3lGyS3LWQv1dJstCZUnknLZ
 A5D/qUSvIXkUdrJbkXrYLQJxtd0ambHc+6ymAIjtMBM8/HF0pR4=
 =49ii
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - AMD's automatic IBRS doesn't enable cross-thread branch target
   injection protection (STIBP) for user processes. Enable STIBP on such
   systems.

 - Do not delete (but put the ref instead) of AMD MCE error thresholding
   sysfs kobjects when destroying them in order not to delete the kernfs
   pointer prematurely

 - Restore annotation in ret_from_fork_asm() in order to fix kthread
   stack unwinding from being marked as unreliable and thus breaking
   livepatching

* tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
  x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
  x86: Fix kthread unwind
2023-07-30 11:05:35 -07:00
Josh Poimboeuf
238ec850b9 x86/srso: Fix return thunks in generated code
Set X86_FEATURE_RETHUNK when enabling the SRSO mitigation so that
generated code (e.g., ftrace, static call, eBPF) generates "jmp
__x86_return_thunk" instead of RET.

  [ bp: Add a comment. ]

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-29 14:15:19 +02:00
Borislav Petkov (AMD)
d893832d0e x86/srso: Add IBPB on VMEXIT
Add the option to flush IBPB only on VMEXIT in order to protect from
malicious guests but one otherwise trusts the software that runs on the
hypervisor.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:19 +02:00
Borislav Petkov (AMD)
233d6f68b9 x86/srso: Add IBPB
Add the option to mitigate using IBPB on a kernel entry. Pull in the
Retbleed alternative so that the IBPB call from there can be used. Also,
if Retbleed mitigation is done using IBPB, the same mitigation can and
must be used here.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:19 +02:00
Borislav Petkov (AMD)
1b5277c0ea x86/srso: Add SRSO_NO support
Add support for the CPUID flag which denotes that the CPU is not
affected by SRSO.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:19 +02:00
Borislav Petkov (AMD)
79113e4060 x86/srso: Add IBPB_BRTYPE support
Add support for the synthetic CPUID flag which "if this bit is 1,
it indicates that MSR 49h (PRED_CMD) bit 0 (IBPB) flushes all branch
type predictions from the CPU branch predictor."

This flag is there so that this capability in guests can be detected
easily (otherwise one would have to track microcode revisions which is
impossible for guests).

It is also needed only for Zen3 and -4. The other two (Zen1 and -2)
always flush branch type predictions by default.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:19 +02:00
Borislav Petkov (AMD)
fb3bd914b3 x86/srso: Add a Speculative RAS Overflow mitigation
Add a mitigation for the speculative return address stack overflow
vulnerability found on AMD processors.

The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence.  To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.

To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference.  In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.

In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to Retbleed one: srso_untrain_ret() and
srso_safe_ret().

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:14 +02:00
Kirill A. Shutemov
9f91164061 x86/traps: Fix load_unaligned_zeropad() handling for shared TDX memory
Commit c4e34dd99f ("x86: simplify load_unaligned_zeropad()
implementation") changes how exceptions around load_unaligned_zeropad()
handled.  The kernel now uses the fault_address in fixup_exception() to
verify the address calculations for the load_unaligned_zeropad().

It works fine for #PF, but breaks on #VE since no fault address is
passed down to fixup_exception().

Propagating ve_info.gla down to fixup_exception() resolves the issue.

See commit 1e7769653b ("x86/tdx: Handle load_unaligned_zeropad()
page-cross to a shared page") for more context.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Michael Kelley <mikelley@microsoft.com>
Fixes: c4e34dd99f ("x86: simplify load_unaligned_zeropad() implementation")
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-25 15:29:01 -07:00
Kim Phillips
fd470a8bee x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
Unlike Intel's Enhanced IBRS feature, AMD's Automatic IBRS does not
provide protection to processes running at CPL3/user mode, see section
"Extended Feature Enable Register (EFER)" in the APM v2 at
https://bugzilla.kernel.org/attachment.cgi?id=304652

Explicitly enable STIBP to protect against cross-thread CPL3
branch target injections on systems with Automatic IBRS enabled.

Also update the relevant documentation.

Fixes: e7862eda30 ("x86/cpu: Support AMD Automatic IBRS")
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230720194727.67022-1-kim.phillips@amd.com
2023-07-22 18:04:22 +02:00
Yazen Ghannam
3ba2e83334 x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
AMD systems from Family 10h to 16h share MCA bank 4 across multiple CPUs.
Therefore, the threshold_bank structure for bank 4, and its threshold_block
structures, will be initialized once at boot time. And the kobject for the
shared bank will be added to each of the CPUs that share it. Furthermore,
the threshold_blocks for the shared bank will be added again to the bank's
kobject. These additions will increase the refcount for the bank's kobject.

For example, a shared bank with two blocks and shared across two CPUs will
be set up like this:

  CPU0 init
    bank create and add; bank refcount = 1; threshold_create_bank()
      block 0 init and add; bank refcount = 2; allocate_threshold_blocks()
      block 1 init and add; bank refcount = 3; allocate_threshold_blocks()
  CPU1 init
    bank add; bank refcount = 3; threshold_create_bank()
      block 0 add; bank refcount = 4; __threshold_add_blocks()
      block 1 add; bank refcount = 5; __threshold_add_blocks()

Currently in threshold_remove_bank(), if the bank is shared then
__threshold_remove_blocks() is called. Here the shared bank's kobject and
the bank's blocks' kobjects are deleted. This is done on the first call
even while the structures are still shared. Subsequent calls from other
CPUs that share the structures will attempt to delete the kobjects.

During kobject_del(), kobject->sd is removed. If the kobject is not part of
a kset with default_groups, then subsequent kobject_del() calls seem safe
even with kobject->sd == NULL.

Originally, the AMD MCA thresholding structures did not use default_groups.
And so the above behavior was not apparent.

However, a recent change implemented default_groups for the thresholding
structures. Therefore, kobject_del() will go down the sysfs_remove_groups()
code path. In this case, the first kobject_del() may succeed and remove
kobject->sd. But subsequent kobject_del() calls will give a WARNing in
kernfs_remove_by_name_ns() since kobject->sd == NULL.

Use kobject_put() on the shared bank's kobject when "removing" blocks. This
decrements the bank's refcount while keeping kobjects enabled until the
bank is no longer shared. At that point, kobject_put() will be called on
the blocks which drives their refcount to 0 and deletes them and also
decrementing the bank's refcount. And finally kobject_put() will be called
on the bank driving its refcount to 0 and deleting it.

The same example above:

  CPU1 shutdown
    bank is shared; bank refcount = 5; threshold_remove_bank()
      block 0 put parent bank; bank refcount = 4; __threshold_remove_blocks()
      block 1 put parent bank; bank refcount = 3; __threshold_remove_blocks()
  CPU0 shutdown
    bank is no longer shared; bank refcount = 3; threshold_remove_bank()
      block 0 put block; bank refcount = 2; deallocate_threshold_blocks()
      block 1 put block; bank refcount = 1; deallocate_threshold_blocks()
    put bank; bank refcount = 0; threshold_remove_bank()

Fixes: 7f99cb5e60 ("x86/CPU/AMD: Use default_groups in kobj_type")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/alpine.LRH.2.02.2205301145540.25840@file01.intranet.prod.int.rdu2.redhat.com
2023-07-22 17:35:16 +02:00
Daniel Sneddon
81ac7e5d74 KVM: Add GDS_NO support to KVM
Gather Data Sampling (GDS) is a transient execution attack using
gather instructions from the AVX2 and AVX512 extensions. This attack
allows malicious code to infer data that was previously stored in
vector registers. Systems that are not vulnerable to GDS will set the
GDS_NO bit of the IA32_ARCH_CAPABILITIES MSR. This is useful for VM
guests that may think they are on vulnerable systems that are, in
fact, not affected. Guests that are running on affected hosts where
the mitigation is enabled are protected as if they were running
on an unaffected system.

On all hosts that are not affected or that are mitigated, set the
GDS_NO bit.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-07-21 13:02:35 -07:00
Daniel Sneddon
53cf5797f1 x86/speculation: Add Kconfig option for GDS
Gather Data Sampling (GDS) is mitigated in microcode. However, on
systems that haven't received the updated microcode, disabling AVX
can act as a mitigation. Add a Kconfig option that uses the microcode
mitigation if available and disables AVX otherwise. Setting this
option has no effect on systems not affected by GDS. This is the
equivalent of setting gather_data_sampling=force.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-07-21 13:02:35 -07:00
Daniel Sneddon
553a5c03e9 x86/speculation: Add force option to GDS mitigation
The Gather Data Sampling (GDS) vulnerability allows malicious software
to infer stale data previously stored in vector registers. This may
include sensitive data such as cryptographic keys. GDS is mitigated in
microcode, and systems with up-to-date microcode are protected by
default. However, any affected system that is running with older
microcode will still be vulnerable to GDS attacks.

Since the gather instructions used by the attacker are part of the
AVX2 and AVX512 extensions, disabling these extensions prevents gather
instructions from being executed, thereby mitigating the system from
GDS. Disabling AVX2 is sufficient, but we don't have the granularity
to do this. The XCR0[2] disables AVX, with no option to just disable
AVX2.

Add a kernel parameter gather_data_sampling=force that will enable the
microcode mitigation if available, otherwise it will disable AVX on
affected systems.

This option will be ignored if cmdline mitigations=off.

This is a *big* hammer.  It is known to break buggy userspace that
uses incomplete, buggy AVX enumeration.  Unfortunately, such userspace
does exist in the wild:

	https://www.mail-archive.com/bug-coreutils@gnu.org/msg33046.html

[ dhansen: add some more ominous warnings about disabling AVX ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-07-21 12:59:49 -07:00
Daniel Sneddon
8974eb5882 x86/speculation: Add Gather Data Sampling mitigation
Gather Data Sampling (GDS) is a hardware vulnerability which allows
unprivileged speculative access to data which was previously stored in
vector registers.

Intel processors that support AVX2 and AVX512 have gather instructions
that fetch non-contiguous data elements from memory. On vulnerable
hardware, when a gather instruction is transiently executed and
encounters a fault, stale data from architectural or internal vector
registers may get transiently stored to the destination vector
register allowing an attacker to infer the stale data using typical
side channel techniques like cache timing attacks.

This mitigation is different from many earlier ones for two reasons.
First, it is enabled by default and a bit must be set to *DISABLE* it.
This is the opposite of normal mitigation polarity. This means GDS can
be mitigated simply by updating microcode and leaving the new control
bit alone.

Second, GDS has a "lock" bit. This lock bit is there because the
mitigation affects the hardware security features KeyLocker and SGX.
It needs to be enabled and *STAY* enabled for these features to be
mitigated against GDS.

The mitigation is enabled in the microcode by default. Disable it by
setting gather_data_sampling=off or by disabling all mitigations with
mitigations=off. The mitigation status can be checked by reading:

    /sys/devices/system/cpu/vulnerabilities/gather_data_sampling

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-07-19 16:45:37 -07:00
Borislav Petkov (AMD)
522b1d6921 x86/cpu/amd: Add a Zenbleed fix
Add a fix for the Zen2 VZEROUPPER data corruption bug where under
certain circumstances executing VZEROUPPER can cause register
corruption or leak data.

The optimal fix is through microcode but in the case the proper
microcode revision has not been applied, enable a fallback fix using
a chicken bit.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-17 15:48:10 +02:00
Borislav Petkov (AMD)
8b6f687743 x86/cpu/amd: Move the errata checking functionality up
Avoid new and remove old forward declarations.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-17 15:47:46 +02:00
Linus Torvalds
b6e6cc1f78 Fix kCFI/FineIBT weaknesses
The primary bug Alyssa noticed was that with FineIBT enabled function
 prologues have a spurious ENDBR instruction:
 
   __cfi_foo:
 	endbr64
 	subl	$hash, %r10d
 	jz	1f
 	ud2
 	nop
   1:
   foo:
 	endbr64 <--- *sadface*
 
 This means that any indirect call that fails to target the __cfi symbol
 and instead targets (the regular old) foo+0, will succeed due to that
 second ENDBR.
 
 Fixing this lead to the discovery of a single indirect call that was
 still doing this: ret_from_fork(), since that's an assembly stub the
 compmiler would not generate the proper kCFI indirect call magic and it
 would not get patched.
 
 Brian came up with the most comprehensive fix -- convert the thing to C
 with only a very thin asm wrapper. This ensures the kernel thread
 boostrap is a proper kCFI call.
 
 While discussing all this, Kees noted that kCFI hashes could/should be
 poisoned to seal all functions whose address is never taken, further
 limiting the valid kCFI targets -- much like we already do for IBT.
 
 So what was a 'simple' observation and fix cascaded into a bunch of
 inter-related CFI infrastructure fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEv3OU3/byMaA0LqWJdkfhpEvA5LoFAmSxr64VHHBldGVyekBp
 bmZyYWRlYWQub3JnAAoJEHZH4aRLwOS6L7kQAIjDWbxqVtmiBiz+IBcWcsxt7BXX
 pRBaSe/eBp3KLhqgzYUY0mXIi0ua7y3CBtW4SdQUSPsAKtCgBUuq2JjQWToRghjN
 4ndCky4oxb9z8ADr/R/qfU8ZpSOwoX3kgBHqyjcQ0fQsg/DFKs3sWKqluwT0PtvU
 vLYAw2QKSv56NG/u3CujWPdcIWgzJ+M3214xuqIWCTwEcqdP+xkXmQstkXkyPQ6d
 XE0iG/wo9uiX4icfsRVp8JL0TkzNqGJfgr9Mv1rBKT4wbT64zKI6RyMJVlUS0yrk
 1jeDgNbVfx4ZpvtHmTsQn1jogWI3pqGkqoPwHqJSFg42Eer5OSodH/uVd3HK/0tD
 1nlhCfue6zc4smu480064s3fWAE7kC6ySdmijQXOJo3YWVGdagxVp/CSE4Ek0TFq
 y+CltNEA6bthKImWg8GFWxS8bMnuZv2joJ8yhgfpnG5sppVOYs2HJ3ipIks9sZjO
 o65auDeOkGg1+NhgDx+2uay6/fbxTNjbAyjV4HttkN70SO5kTTT4zWyh2PLwXaTy
 wv0B4i0laxTRU7boIA4nFJAKz5xKfyh9e2idxbmPlrV5FY4mEPA2oLeWsn8cS4VG
 0SWJ30ky7C4r7VWd9DWhGcCRcrlCvCM8LdjwzImZHXRQ2KweEuGMmrXYtHCrTRZn
 IMijS/9q653h9ws7
 =RhPI
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CFI fixes from Peter Zijlstra:
 "Fix kCFI/FineIBT weaknesses

  The primary bug Alyssa noticed was that with FineIBT enabled function
  prologues have a spurious ENDBR instruction:

    __cfi_foo:
	endbr64
	subl	$hash, %r10d
	jz	1f
	ud2
	nop
    1:
    foo:
	endbr64 <--- *sadface*

  This means that any indirect call that fails to target the __cfi
  symbol and instead targets (the regular old) foo+0, will succeed due
  to that second ENDBR.

  Fixing this led to the discovery of a single indirect call that was
  still doing this: ret_from_fork(). Since that's an assembly stub the
  compiler would not generate the proper kCFI indirect call magic and it
  would not get patched.

  Brian came up with the most comprehensive fix -- convert the thing to
  C with only a very thin asm wrapper. This ensures the kernel thread
  boostrap is a proper kCFI call.

  While discussing all this, Kees noted that kCFI hashes could/should be
  poisoned to seal all functions whose address is never taken, further
  limiting the valid kCFI targets -- much like we already do for IBT.

  So what was a 'simple' observation and fix cascaded into a bunch of
  inter-related CFI infrastructure fixes"

* tag 'x86_urgent_for_6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cfi: Only define poison_cfi() if CONFIG_X86_KERNEL_IBT=y
  x86/fineibt: Poison ENDBR at +0
  x86: Rewrite ret_from_fork() in C
  x86/32: Remove schedule_tail_wrapper()
  x86/cfi: Extend ENDBR sealing to kCFI
  x86/alternative: Rename apply_ibt_endbr()
  x86/cfi: Extend {JMP,CAKK}_NOSPEC comment
2023-07-14 20:19:25 -07:00
Ingo Molnar
535d0ae391 x86/cfi: Only define poison_cfi() if CONFIG_X86_KERNEL_IBT=y
poison_cfi() was introduced in:

  9831c6253a ("x86/cfi: Extend ENDBR sealing to kCFI")

... but it's only ever used under CONFIG_X86_KERNEL_IBT=y,
and if that option is disabled, we get:

  arch/x86/kernel/alternative.c:1243:13: error: ‘poison_cfi’ defined but not used [-Werror=unused-function]

Guard the definition with CONFIG_X86_KERNEL_IBT.

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-07-11 10:17:55 +02:00
YueHaibing
b599b06544 x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret()
This is now unused, so can remove it.

Link: https://lore.kernel.org/linux-trace-kernel/20230623091640.21952-1-yuehaibing@huawei.com

Cc: <mark.rutland@arm.com>
Cc: <tglx@linutronix.de>
Cc: <mingo@redhat.com>
Cc: <bp@alien8.de>
Cc: <dave.hansen@linux.intel.com>
Cc: <x86@kernel.org>
Cc: <hpa@zytor.com>
Cc: <peterz@infradead.org>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-10 21:38:13 -04:00
Peter Zijlstra
04505bbbbb x86/fineibt: Poison ENDBR at +0
Alyssa noticed that when building the kernel with CFI_CLANG+IBT and
booting on IBT enabled hardware to obtain FineIBT, the indirect
functions look like:

  __cfi_foo:
	endbr64
	subl	$hash, %r10d
	jz	1f
	ud2
	nop
  1:
  foo:
	endbr64

This is because the compiler generates code for kCFI+IBT. In that case
the caller does the hash check and will jump to +0, so there must be
an ENDBR there. The compiler doesn't know about FineIBT at all; also
it is possible to actually use kCFI+IBT when booting with 'cfi=kcfi'
on IBT enabled hardware.

Having this second ENDBR however makes it possible to elide the CFI
check. Therefore, we should poison this second ENDBR when switching to
FineIBT mode.

Fixes: 931ab63664 ("x86/ibt: Implement FineIBT")
Reported-by: "Milburn, Alyssa" <alyssa.milburn@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20230615193722.194131053@infradead.org
2023-07-10 09:52:25 +02:00
Brian Gerst
3aec4ecb3d x86: Rewrite ret_from_fork() in C
When kCFI is enabled, special handling is needed for the indirect call
to the kernel thread function.  Rewrite the ret_from_fork() function in
C so that the compiler can properly handle the indirect call.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lkml.kernel.org/r/20230623225529.34590-3-brgerst@gmail.com
2023-07-10 09:52:25 +02:00
Peter Zijlstra
9831c6253a x86/cfi: Extend ENDBR sealing to kCFI
Kees noted that IBT sealing could be extended to kCFI.

Fundamentally it is the list of functions that do not have their
address taken and are thus never called indirectly. It doesn't matter
that objtool uses IBT infrastructure to determine this list, once we
have it it can also be used to clobber kCFI hashes and avoid kCFI
indirect calls.

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lkml.kernel.org/r/20230622144321.494426891%40infradead.org
2023-07-10 09:52:24 +02:00
Peter Zijlstra
be0fffa5ca x86/alternative: Rename apply_ibt_endbr()
The current name doesn't reflect what it does very well.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lkml.kernel.org/r/20230622144321.427441595%40infradead.org
2023-07-10 09:52:23 +02:00
Linus Torvalds
e3da8db055 A single fix for the mechanism to park CPUs with an INIT IPI.
On shutdown or kexec, the kernel tries to park the non-boot CPUs with an
 INIT IPI. But the same code path is also used by the crash utility. If the
 CPU which panics is not the boot CPU then it sends an INIT IPI to the boot
 CPU which resets the machine. Prevent this by validating that the CPU which
 runs the stop mechanism is the boot CPU. If not, leave the other CPUs in
 HLT.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmSqoEcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYNSEACwo5zgibek27qeMvJGfNztm0qRa4mw
 wN0qV31yaNcEfhqL8bMU8n3wvEA+pZBqhaU5fyalY+yxc29jI/j9eda5zR+Fi9e5
 kVyFT2M0rVSDLFraoQeD+T/tSSK2MJtswF12ytY5mHzHMCb6Uy9fNCpUiQlB+i81
 AcnlKQk9ifZXFdMJPj5E+E6l776T8NZPoYEdFgJloxaYOGTdFJDWDlryx4LD7Urz
 Fx/ec8Ug/FYSPl2XzXHugvHjNefxKoomcZ3v3CSZonBcav7Gz6F06HAR5vVRWSHx
 4Dlh6zdy+60YKBmkvpb+RJIBMo8aXclwT+tntaoJvGHZ+PNASO6JVz9PvmoNgfWK
 Oy2n1K687qIOY6d+yxUZgbZpwXX5bG6kc0xbicUNigGagrYTfd83G5RAfwxNkqsY
 23Qw4Ue8uxve4M8iM/FfxKIShuDBiLCIDDIrWDjEkvIAnr1pd+NPUv7kqOTI7Kz/
 srNgcOwalypzuS93lgaN1yjRv1mmaPXhdhjy0DwGbC54bKgNzfq+7z75Ibn0dSFF
 JUPFVjztB+ymnM6PJ1dR77SvPi+xOi60nw7L+Qu9US4yKkW0NeGiIWVsggNorbU6
 UPFSE5gxwFD0w1EZ9W+IDeOZUNhjJUINZsn8txm+tb+oEqTIGRPHPOo0C1dBmLW9
 AmDIeHljj0iWIw==
 =DOCF
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
 "A single fix for the mechanism to park CPUs with an INIT IPI.

  On shutdown or kexec, the kernel tries to park the non-boot CPUs with
  an INIT IPI. But the same code path is also used by the crash utility.
  If the CPU which panics is not the boot CPU then it sends an INIT IPI
  to the boot CPU which resets the machine.

  Prevent this by validating that the CPU which runs the stop mechanism
  is the boot CPU. If not, leave the other CPUs in HLT"

* tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Don't send INIT to boot CPU
2023-07-09 10:08:38 -07:00
Thomas Gleixner
b1472a60a5 x86/smp: Don't send INIT to boot CPU
Parking CPUs in INIT works well, except for the crash case when the CPU
which invokes smp_park_other_cpus_in_init() is not the boot CPU. Sending
INIT to the boot CPU resets the whole machine.

Prevent this by validating that this runs on the boot CPU. If not fall back
and let CPUs hang in HLT.

Fixes: 45e34c8af5 ("x86/smp: Put CPUs into INIT on shutdown if possible")
Reported-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/87ttui91jo.ffs@tglx
2023-07-07 15:42:31 +02:00
Linus Torvalds
cccf0c2ee5 Tracing updates for 6.5:
- Add new feature to have function graph tracer record the return value.
   Adds a new option: funcgraph-retval ; when set, will show the return
   value of a function in the function graph tracer.
 
 - Also add the option: funcgraph-retval-hex where if it is not set, and
   the return value is an error code, then it will return the decimal of
   the error code, otherwise it still reports the hex value.
 
 - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd
   That when a application opens it, it becomes the task that the timer lat
   tracer traces. The application can also read this file to find out how
   it's being interrupted.
 
 - Add the file /sys/kernel/tracing/available_filter_functions_addrs
   that works just the same as available_filter_functions but also shows
   the addresses of the functions like kallsyms, except that it gives the
   address of where the fentry/mcount jump/nop is. This is used by BPF to
   make it easier to attach BPF programs to ftrace hooks.
 
 - Replace strlcpy with strscpy in the tracing boot code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZJy6ixQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnzRAPsEI2YgjaJSHnuPoGRHbrNil6pq66wY
 LYaLizGI4Jv9BwEAqdSdcYcMiWo1SFBAO8QxEDM++BX3zrRyVgW8ahaTNgs=
 =TF0C
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Add new feature to have function graph tracer record the return
   value. Adds a new option: funcgraph-retval ; when set, will show the
   return value of a function in the function graph tracer.

 - Also add the option: funcgraph-retval-hex where if it is not set, and
   the return value is an error code, then it will return the decimal of
   the error code, otherwise it still reports the hex value.

 - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd
   That when a application opens it, it becomes the task that the timer
   lat tracer traces. The application can also read this file to find
   out how it's being interrupted.

 - Add the file /sys/kernel/tracing/available_filter_functions_addrs
   that works just the same as available_filter_functions but also shows
   the addresses of the functions like kallsyms, except that it gives
   the address of where the fentry/mcount jump/nop is. This is used by
   BPF to make it easier to attach BPF programs to ftrace hooks.

 - Replace strlcpy with strscpy in the tracing boot code.

* tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix warnings when building htmldocs for function graph retval
  riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  tracing/boot: Replace strlcpy with strscpy
  tracing/timerlat: Add user-space interface
  tracing/osnoise: Skip running osnoise if all instances are off
  tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable
  ftrace: Show all functions with addresses in available_filter_functions_addrs
  selftests/ftrace: Add funcgraph-retval test case
  LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
  tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex
  function_graph: Support recording and printing the return value of function
  fgraph: Add declaration of "struct fgraph_ret_regs"
2023-06-30 10:33:17 -07:00
Linus Torvalds
6e17c6de3d - Yosry Ahmed brought back some cgroup v1 stats in OOM logs.
- Yosry has also eliminated cgroup's atomic rstat flushing.
 
 - Nhat Pham adds the new cachestat() syscall.  It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability.
 
 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning.
 
 - Lorenzo Stoakes has done some maintanance work on the get_user_pages()
   interface.
 
 - Liam Howlett continues with cleanups and maintenance work to the maple
   tree code.  Peng Zhang also does some work on maple tree.
 
 - Johannes Weiner has done some cleanup work on the compaction code.
 
 - David Hildenbrand has contributed additional selftests for
   get_user_pages().
 
 - Thomas Gleixner has contributed some maintenance and optimization work
   for the vmalloc code.
 
 - Baolin Wang has provided some compaction cleanups,
 
 - SeongJae Park continues maintenance work on the DAMON code.
 
 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting.
 
 - Christoph Hellwig has some cleanups for the filemap/directio code.
 
 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the provided
   APIs rather than open-coding accesses.
 
 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings.
 
 - John Hubbard has a series of fixes to the MM selftesting code.
 
 - ZhangPeng continues the folio conversion campaign.
 
 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock.
 
 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
   128 to 8.
 
 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management.
 
 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code.
 
 - Vishal Moola also has done some folio conversion work.
 
 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
 joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
 J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
 =B7yQ
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
2023-06-28 10:28:11 -07:00
Linus Torvalds
18eb3b6dff xen: branch for v6.5-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZJp4CgAKCRCAXGG7T9hj
 vmmpAP4gMe7T2QXWY9VvWgyf97z3AtBx2NdTzLAmArFySzPFtgEAgCHE3yy95bmR
 JAX4+q/2QPbFxp0TgJrrxlq5RDn5Ago=
 =2HjA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.5-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - three patches adding missing prototypes

 - a fix for finding the iBFT in a Xen dom0 for supporting diskless
   iSCSI boot

* tag 'for-linus-6.5-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86: xen: add missing prototypes
  x86/xen: add prototypes for paravirt mmu functions
  iscsi_ibft: Fix finding the iBFT under Xen Dom 0
  xen: xen_debug_interrupt prototype to global header
2023-06-27 16:03:20 -07:00
Linus Torvalds
6f612579be objtool changes for v6.5:
- Build footprint & performance improvements:
 
     - Reduce memory usage with CONFIG_DEBUG_INFO=y
 
       In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel, DWARF
       creates almost 200 million relocations, ballooning objtool's peak heap
       usage to 53GB.  These patches reduce that to 25GB.
 
       On a distro-type kernel with kernel IBT enabled, they reduce objtool's
       peak heap usage from 4.2GB to 2.8GB.
 
       These changes also improve the runtime significantly.
 
 - Debuggability improvements:
 
     - Add the unwind_debug command-line option, for more extend unwinding
       debugging output.
     - Limit unreachable warnings to once per function
     - Add verbose option for disassembling affected functions
     - Include backtrace in verbose mode
     - Detect missing __noreturn annotations
     - Ignore exc_double_fault() __noreturn warnings
     - Remove superfluous global_noreturns entries
     - Move noreturn function list to separate file
     - Add __kunit_abort() to noreturns
 
 - Unwinder improvements:
 
     - Allow stack operations in UNWIND_HINT_UNDEFINED regions
     - drm/vmwgfx: Add unwind hints around RBP clobber
 
 - Cleanups:
 
     - Move the x86 entry thunk restore code into thunk functions
     - x86/unwind/orc: Use swap() instead of open coding it
     - Remove unnecessary/unused variables
 
 - Fixes for modern stack canary handling
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSaxcoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ht5w//f8mBoABct29pS4ib6pDwRZQDoG8fCA7M
 +KWjFD1AhX7RsJVEbM4uBUXdSWZD61xxIa8p8LO2jjzE5RyhM+EuNaisKujKqmfj
 uQTSnRhIRHMPqqVGK/gQxy1v4+3+12O32XFIJhAPYCp/dpbZJ2yKDsiHjapzZTDy
 BM+86hbIyHFmSl5uJcBFHEv6EGhoxwdrrrOxhpao1CqfAUi+uVgamHGwVqx+NtTY
 MvOmcy3/0ukHwDLON0MIMu9MSwvnXorD7+RSkYstwAM/k6ao/k78iJ31sOcynpRn
 ri0gmfygJsh2bxL4JUlY4ZeTs7PLWkj3i60deePc5u6EyV4JDJ2borUibs5oGoF6
 pN0AwbtubLHHhUI/v74B3E6K6ZGvLiEn9dsNTuXsJffD+qU2REb+WLhr4ut+E1Wi
 IKWrYh811yBLyOqFEW3XudZTiXSJlgi3eYiCxspEsKw2RIFFt2g6vYcwrIb0Hatw
 8R4/jCWk1nc6Wa3RQYsVnhkglAECSKQdDfS7p2e1hNUTjZuess4EEJjSLs8upIQ9
 D1bmuUxEzRxVwAZtXYNh0NKe7OtyOrqgsVTQuqxvWXq2CpC7Hqj8piVJWHdBWgHO
 0o2OQqjwSrzAtevpAIaYQv9zhPs1hV7CpBgzzqWGXrwJ3vM6YoSRLf0bg+5OkN8I
 O4U2xq2OVa8=
 =uNnc
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molar:
 "Build footprint & performance improvements:

   - Reduce memory usage with CONFIG_DEBUG_INFO=y

     In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel,
     DWARF creates almost 200 million relocations, ballooning objtool's
     peak heap usage to 53GB. These patches reduce that to 25GB.

     On a distro-type kernel with kernel IBT enabled, they reduce
     objtool's peak heap usage from 4.2GB to 2.8GB.

     These changes also improve the runtime significantly.

  Debuggability improvements:

   - Add the unwind_debug command-line option, for more extend unwinding
     debugging output
   - Limit unreachable warnings to once per function
   - Add verbose option for disassembling affected functions
   - Include backtrace in verbose mode
   - Detect missing __noreturn annotations
   - Ignore exc_double_fault() __noreturn warnings
   - Remove superfluous global_noreturns entries
   - Move noreturn function list to separate file
   - Add __kunit_abort() to noreturns

  Unwinder improvements:

   - Allow stack operations in UNWIND_HINT_UNDEFINED regions
   - drm/vmwgfx: Add unwind hints around RBP clobber

  Cleanups:

   - Move the x86 entry thunk restore code into thunk functions
   - x86/unwind/orc: Use swap() instead of open coding it
   - Remove unnecessary/unused variables

  Fixes for modern stack canary handling"

* tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/orc: Make the is_callthunk() definition depend on CONFIG_BPF_JIT=y
  objtool: Skip reading DWARF section data
  objtool: Free insns when done
  objtool: Get rid of reloc->rel[a]
  objtool: Shrink elf hash nodes
  objtool: Shrink reloc->sym_reloc_entry
  objtool: Get rid of reloc->jump_table_start
  objtool: Get rid of reloc->addend
  objtool: Get rid of reloc->type
  objtool: Get rid of reloc->offset
  objtool: Get rid of reloc->idx
  objtool: Get rid of reloc->list
  objtool: Allocate relocs in advance for new rela sections
  objtool: Add for_each_reloc()
  objtool: Don't free memory in elf_close()
  objtool: Keep GElf_Rel[a] structs synced
  objtool: Add elf_create_section_pair()
  objtool: Add mark_sec_changed()
  objtool: Fix reloc_hash size
  objtool: Consolidate rel/rela handling
  ...
2023-06-27 15:05:41 -07:00
Linus Torvalds
bc6cb4d5bc Locking changes for v6.5:
- Introduce cmpxchg128() -- aka. the demise of cmpxchg_double().
 
   The cmpxchg128() family of functions is basically & functionally
   the same as cmpxchg_double(), but with a saner interface: instead
   of a 6-parameter horror that forced u128 - u64/u64-halves layout
   details on the interface and exposed users to complexity,
   fragility & bugs, use a natural 3-parameter interface with u128 types.
 
 - Restructure the generated atomic headers, and add
   kerneldoc comments for all of the generic atomic{,64,_long}_t
   operations. Generated definitions are much cleaner now,
   and come with documentation.
 
 - Implement lock_set_cmp_fn() on lockdep, for defining an ordering
   when taking multiple locks of the same type. This gets rid of
   one use of lockdep_set_novalidate_class() in the bcache code.
 
 - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended
   variable shadowing generating garbage code on Clang on certain
   ARM builds.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSav3wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gDyxAAjCHQjpolrre7fRpyiTDwqzIKT27H04vQ
 zrQVlVc42WBnn9pe8LthGy43/RvYvqlZvLoLONA4fMkuYriM6nSMsoZjeUmE+6Rs
 QAElQC74P5YvEBOa67VNY3/M7sj22ftDe7ODtVV8OrnPjMk1sQNRvaK025Cs3yig
 8MAI//hHGNmyVAp1dPYZMJNqxGCvluReLZ4SaUJFCMrg7YgUXgCBj/5Gi07TlKxn
 sT8BFCssoEW/B9FXkh59B1t6FBCZoSy4XSZfsZe0uVAUJ4XDEOO+zBgaWFCedNQT
 wP323ryBgMrkzUKA8j2/o5d3QnMA1GcBfHNNlvAl/fOfrxWXzDZnOEY26YcaLMa0
 YIuRF/JNbPZlt6DCUVBUEvMPpfNYi18dFN0rat1a6xL2L4w+tm55y3mFtSsg76Ka
 r7L2nWlRrAGXnuA+VEPqkqbSWRUSWOv5hT2Mcyb5BqqZRsxBETn6G8GVAzIO6j6v
 giyfUdA8Z9wmMZ7NtB6usxe3p1lXtnZ/shCE7ZHXm6xstyZrSXaHgOSgAnB9DcuJ
 7KpGIhhSODQSwC/h/J0KEpb9Pr/5jCWmXAQ2DWnZK6ndt1jUfFi8pfK58wm0AuAM
 o9t8Mx3o8wZjbMdt6up9OIM1HyFiMx2BSaZK+8f/bWemHQ0xwez5g4k5O5AwVOaC
 x9Nt+Tp0Ze4=
 =DsYj
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()

   The cmpxchg128() family of functions is basically & functionally the
   same as cmpxchg_double(), but with a saner interface.

   Instead of a 6-parameter horror that forced u128 - u64/u64-halves
   layout details on the interface and exposed users to complexity,
   fragility & bugs, use a natural 3-parameter interface with u128
   types.

 - Restructure the generated atomic headers, and add kerneldoc comments
   for all of the generic atomic{,64,_long}_t operations.

   The generated definitions are much cleaner now, and come with
   documentation.

 - Implement lock_set_cmp_fn() on lockdep, for defining an ordering when
   taking multiple locks of the same type.

   This gets rid of one use of lockdep_set_novalidate_class() in the
   bcache code.

 - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable
   shadowing generating garbage code on Clang on certain ARM builds.

* tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc
  percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()
  locking/atomic: treewide: delete arch_atomic_*() kerneldoc
  locking/atomic: docs: Add atomic operations to the driver basic API documentation
  locking/atomic: scripts: generate kerneldoc comments
  docs: scripts: kernel-doc: accept bitwise negation like ~@var
  locking/atomic: scripts: simplify raw_atomic*() definitions
  locking/atomic: scripts: simplify raw_atomic_long*() definitions
  locking/atomic: scripts: split pfx/name/sfx/order
  locking/atomic: scripts: restructure fallback ifdeffery
  locking/atomic: scripts: build raw_atomic_long*() directly
  locking/atomic: treewide: use raw_atomic*_<op>()
  locking/atomic: scripts: add trivial raw_atomic*_<op>()
  locking/atomic: scripts: factor out order template generation
  locking/atomic: scripts: remove leftover "${mult}"
  locking/atomic: scripts: remove bogus order parameter
  locking/atomic: xtensa: add preprocessor symbols
  locking/atomic: x86: add preprocessor symbols
  locking/atomic: sparc: add preprocessor symbols
  locking/atomic: sh: add preprocessor symbols
  ...
2023-06-27 14:14:30 -07:00
Linus Torvalds
ed3b7923a8 Scheduler changes for v6.5:
- Scheduler SMP load-balancer improvements:
 
     - Avoid unnecessary migrations within SMT domains on hybrid systems.
 
       Problem:
 
         On hybrid CPU systems, (processors with a mixture of higher-frequency
 	SMT cores and lower-frequency non-SMT cores), under the old code
 	lower-priority CPUs pulled tasks from the higher-priority cores if
 	more than one SMT sibling was busy - resulting in many unnecessary
 	task migrations.
 
       Solution:
 
         The new code improves the load balancer to recognize SMT cores with more
         than one busy sibling and allows lower-priority CPUs to pull tasks, which
         avoids superfluous migrations and lets lower-priority cores inspect all SMT
         siblings for the busiest queue.
 
     - Implement the 'runnable boosting' feature in the EAS balancer: consider CPU
       contention in frequency, EAS max util & load-balance busiest CPU selection.
 
       This improves CPU utilization for certain workloads, while leaves other key
       workloads unchanged.
 
 - Scheduler infrastructure improvements:
 
     - Rewrite the scheduler topology setup code by consolidating it
       into the build_sched_topology() helper function and building
       it dynamically on the fly.
 
     - Resolve the local_clock() vs. noinstr complications by rewriting
       the code: provide separate sched_clock_noinstr() and
       local_clock_noinstr() functions to be used in instrumentation code,
       and make sure it is all instrumentation-safe.
 
 - Fixes:
 
     - Fix a kthread_park() race with wait_woken()
 
     - Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
        - Fix UP PREEMPT bug by unifying the SMP and UP implementations.
        - Fix task_struct::saved_state handling.
 
     - Fix various rq clock update bugs, unearthed by turning on the rq clock
       debugging code.
 
     - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger by
       creating enough cgroups, by removing the warnign and restricting
       window size triggers to PSI file write-permission or CAP_SYS_RESOURCE.
 
     - Propagate SMT flags in the topology when removing degenerate domain
 
     - Fix grub_reclaim() calculation bug in the deadline scheduler code
 
     - Avoid resetting the min update period when it is unnecessary, in
       psi_trigger_destroy().
 
     - Don't balance a task to its current running CPU in load_balance(),
       which was possible on certain NUMA topologies with overlapping
       groups.
 
     - Fix the sched-debug printing of rq->nr_uninterruptible
 
 - Cleanups:
 
     - Address various -Wmissing-prototype warnings, as a preparation
       to (maybe) enable this warning in the future.
 
     - Remove unused code
 
     - Mark more functions __init
 
     - Fix shadow-variable warnings
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSatWQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j62xAAuGOx1LcDfRGC6WGQzp1zOdlsVQtnDvlS
 qL58zYSHgizprpVQ3j87SBaG4CHCdvd2Bo36yW0lNZS4nd203qdq7fkrMb3hPP/w
 egUQUzMegf5fF6BWldKeMjuHSt+twFQz/ZAKK8iSbAir6CHNAqbNst1oL0i/+Tyk
 o33hBs1hT5tnbFb1NSVZkX4k+qT3LzTW4K2QgjjGtkScr6yHh2BdEVefyigWOjdo
 9s02d00ll9a2r+F5txlN7Dnw6TN7rmTXGMOJU5bZvBE90/anNiAorMXHJdEKCyUR
 u9+JtBdJWiCplGa/tSRcxT16ZW1VdtTnd9q66TDhXREd2UNDFqBEyg5Wl77K4Tlf
 vKFajmj/to+cTbuv6m6TVR+zyXpdEpdL6F04P44U3qiJvDobBqeDNKHHIqpmbHXl
 AXUXcPWTVAzXX1Ce5M+BeAgTBQ1T7C5tELILrTNQHJvO1s9VVBRFZ/l65Ps4vu7T
 wIZ781IFuopk0zWqHovNvgKrJ7oFmOQQZFttQEe8n6nafkjI7u+IZ8FayiGaUMRr
 4GawFGUCEdYh8z9qyslGKe8Q/Rphfk6hxMFRYUJpDmubQ0PkMeDjDGq77jDGl1PF
 VqwSDEyOaBJs7Gqf/mem00JtzBmXhkhm1SEjggHMI2IQbr/eeBXoLQOn3CDapO/N
 PiDbtX760ic=
 =EWQA
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Scheduler SMP load-balancer improvements:

   - Avoid unnecessary migrations within SMT domains on hybrid systems.

     Problem:

        On hybrid CPU systems, (processors with a mixture of
        higher-frequency SMT cores and lower-frequency non-SMT cores),
        under the old code lower-priority CPUs pulled tasks from the
        higher-priority cores if more than one SMT sibling was busy -
        resulting in many unnecessary task migrations.

     Solution:

        The new code improves the load balancer to recognize SMT cores
        with more than one busy sibling and allows lower-priority CPUs
        to pull tasks, which avoids superfluous migrations and lets
        lower-priority cores inspect all SMT siblings for the busiest
        queue.

   - Implement the 'runnable boosting' feature in the EAS balancer:
     consider CPU contention in frequency, EAS max util & load-balance
     busiest CPU selection.

     This improves CPU utilization for certain workloads, while leaves
     other key workloads unchanged.

  Scheduler infrastructure improvements:

   - Rewrite the scheduler topology setup code by consolidating it into
     the build_sched_topology() helper function and building it
     dynamically on the fly.

   - Resolve the local_clock() vs. noinstr complications by rewriting
     the code: provide separate sched_clock_noinstr() and
     local_clock_noinstr() functions to be used in instrumentation code,
     and make sure it is all instrumentation-safe.

  Fixes:

   - Fix a kthread_park() race with wait_woken()

   - Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
       - Fix UP PREEMPT bug by unifying the SMP and UP implementations
       - Fix task_struct::saved_state handling

   - Fix various rq clock update bugs, unearthed by turning on the rq
     clock debugging code.

   - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger
     by creating enough cgroups, by removing the warnign and restricting
     window size triggers to PSI file write-permission or
     CAP_SYS_RESOURCE.

   - Propagate SMT flags in the topology when removing degenerate domain

   - Fix grub_reclaim() calculation bug in the deadline scheduler code

   - Avoid resetting the min update period when it is unnecessary, in
     psi_trigger_destroy().

   - Don't balance a task to its current running CPU in load_balance(),
     which was possible on certain NUMA topologies with overlapping
     groups.

   - Fix the sched-debug printing of rq->nr_uninterruptible

  Cleanups:

   - Address various -Wmissing-prototype warnings, as a preparation to
     (maybe) enable this warning in the future.

   - Remove unused code

   - Mark more functions __init

   - Fix shadow-variable warnings"

* tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()
  sched/core: Avoid double calling update_rq_clock() in __balance_push_cpu_stop()
  sched/core: Fixed missing rq clock update before calling set_rq_offline()
  sched/deadline: Update GRUB description in the documentation
  sched/deadline: Fix bandwidth reclaim equation in GRUB
  sched/wait: Fix a kthread_park race with wait_woken()
  sched/topology: Mark set_sched_topology() __init
  sched/fair: Rename variable cpu_util eff_util
  arm64/arch_timer: Fix MMIO byteswap
  sched/fair, cpufreq: Introduce 'runnable boosting'
  sched/fair: Refactor CPU utilization functions
  cpuidle: Use local_clock_noinstr()
  sched/clock: Provide local_clock_noinstr()
  x86/tsc: Provide sched_clock_noinstr()
  clocksource: hyper-v: Provide noinstr sched_clock()
  clocksource: hyper-v: Adjust hv_read_tsc_page_tsc() to avoid special casing U64_MAX
  x86/vdso: Fix gettimeofday masking
  math64: Always inline u128 version of mul_u64_u64_shr()
  s390/time: Provide sched_clock_noinstr()
  loongarch: Provide noinstr sched_clock_read()
  ...
2023-06-27 14:03:21 -07:00
Linus Torvalds
e8f75c0270 - A fix to avoid using a list iterator variable after the loop it is
used in
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSa9coACgkQEsHwGGHe
 VUoKuA/6ApbRnAYvnwv6/o6Jnw9Jafzv1RpN6zIH9sjBX/4ghap5OI03tOO2maag
 qQRnYZrjx5jCIctc/i4XS/is51b0Gnv6Bu6uesvuC8xynYnS0jIO4ONqvous/3Jj
 7BsduLzFptvmwQV4jf17hv3OD4CqzMDKyKFDIX7zBeq6xdc66AqB+ba+oBmvNVDI
 wHCkESdPnzSsjqsSQvfhxTasbBVV/exBpQst6oPT1WBiDscxgV/ArMsF7ZgzQjuF
 +WHmyc6KfEgY41iPxBJ6FkUOGBM0fpJl2QmkgS3+WAHxFF3QDK/oKrwo+OPtih8P
 Lec3y/JUncvfaKHcmqNEkI4GCfaZEOqQaP4/MUluJasetbzeLrDr+NEbtBPEXSTi
 cUfKMdwoaALoGE0YJL8wIr9TxfgmU9kONstQIm9Crl3XJ4e/zHMyRtSoMMv4vOCx
 E6amvwaBMsEbs/BRPjP/5f5aYfVI2e81b6QFOlXIjhjogN5AYlBms9sY+CfCw7Fm
 LBP1h6OH6FZrAKfDHNSpLNLFllplmN5sVImrFotSdtCJe37WGbMXNEULJJklp7dZ
 rEJCdxC0B66YOiYhbSIxQOUcEI9db56qXRfDHq6udXEovenGSV2Ke8SUTyq/QIDP
 YWdUKuyxGtW6eKVMo4ineOQTrZ5e5kNpeSt2NOinSiCOl4RAUbs=
 =haGW
 -----END PGP SIGNATURE-----

Merge tag 'x86_sgx_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SGX update from Borislav Petkov:

 - A fix to avoid using a list iterator variable after the loop it is
   used in

* tag 'x86_sgx_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Avoid using iterator after loop in sgx_mmu_notifier_release()
2023-06-27 13:49:33 -07:00
Linus Torvalds
12dc010071 - Some SEV and CC platform helpers cleanup and simplifications now that
the usage patterns are becoming apparent
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSa9CoACgkQEsHwGGHe
 VUr1NhAAjmOq/T41u3FCSU6fZ8gXo5UkIUT13a6cx+6Omx9waJn5G0xdf/380vQN
 RPRTcc4cfQHCdnIeHgiz1YtCh1ljxXswOSbexHgWHjEcqadgxkZTlKaBEbqyEwLI
 lnRRsowfk7J/8RsYqtzuBvGaWNliiszWE8iayruI1IL+FoEDLlLx1GNYqusP5WIs
 0KYm919Zozl8FEZjP47nH4bab1RcE+HGmLG7UEBmR0zHl4cc7iN3wpv2o/vDxVzR
 /KP8a2G7J/xjllGW+OP81dFCS7iklHpNuaxQS73fDIL7ll2VDqNemh4ivykCrplo
 93twODBwKboKmZhnKc0M2axm5JGGx7IC3KTqEUHzb2Wo4bZCYnrj+9Utzxsa3FxB
 m0BSUcmBqzZCsHCbu62N66l1NlB32EnMO80/45NrgGi62YiGP8qQNhy3TiceUNle
 NHFkQmRZwLyW5YC1ntSK8fwSu4GrMG1MG/eRfMPDmmsYogiZUm2KIj7XKy3dKXR6
 maqifh/raPk3rL+7cl9BQleCDLjnkHNxFxFa329P9K4wrtqn7Ley4izWYAUYhNrl
 VOjLs+thTwEmPPgpo/K1wAbR/PjLdmSQ/fQR6w7eUrzNVnm5ndXzhTUcIawycG3T
 haVry+xPMIRlLCIg+dkrwwNcTW1y/X3K4SmgnjLLF57lZFn9UJc=
 =Aj6p
 -----END PGP SIGNATURE-----

Merge tag 'x86_sev_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:

 - Some SEV and CC platform helpers cleanup and simplifications now that
   the usage patterns are becoming apparent

[ I'm sure I'm the only one that has gets confused by all the TLAs, but
  in case there are others: here SEV is AMD's "Secure Encrypted
  Virtualization" and CC is generic "Confidential Computing".

  There's also Intel SGX (Software Guard Extensions) and TDX (Trust
  Domain Extensions), along with all the vendor memory encryption
  extensions (SME, TSME, TME, and WTF).

  And then we have arm64 with RMA and CCA, and I probably forgot another
  dozen or so related acronyms    - Linus ]

* tag 'x86_sev_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/coco: Get rid of accessor functions
  x86/sev: Get rid of special sev_es_enable_key
  x86/coco: Mark cc_platform_has() and descendants noinstr
2023-06-27 13:26:30 -07:00
Linus Torvalds
dc43fc753b - A serious scrubbing of the MTRR code including adding a new map
mechanism in order to look up the memory type of a region easily. Also
   address memory range lookup issues like returning an invalid memory
   type. Furthermore, this handles the decoupling of PAT from MTRR more
   naturally. All work by Juergen Gross
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSazOIACgkQEsHwGGHe
 VUqltQ/8D1oA4LrgnbFO25J/27U/MwKo7ZI3hN6/OkH2FfdBgqeOlOV4TnndDL88
 l/UrzOfWJQxpVLTO3SLMtDla0VrT24B4HZ4hvzDEdJZ8f1DLZ+gLN7sKOMjoIcO9
 fvBZ5+/gFtVxSquwZmWvM0qKiCkKxmznJfpOx1/lt9UtKyKmpPSMVdrpqOeufL7k
 xxWqGRh2s104ZKwfMOj4dgvCVK9ZUlsPqqiARzkqc0bCg7SeIyPea/S2eljhTl15
 BTOA/wW/lcVQ9yWmDD8inzxrZI4EHEohEaNMfof3AqFyYCOU4RzvE9tpAFEK3GXp
 NilxYkZ+JbEljq2QiEt0Ll8XEVKedi7YC1oN3ciiy9RS6+rWSPIvuMFV9tgPRjr1
 AbWYmDoiLz+5ePI+0fckStRRntWKiao+hOaXb5RbEcg+85hkDHZZC7b0tCAUvnh7
 OwuQfbzAqipn2G1hg+LThHDSjI4qHfHJlpeuPcsAxWef1diJbe15StdVWm+ttRE0
 MTXSn3J9qT9MoY5y6m4KSybp0c1nSFlCK/ZkNvzwWHmkAG6M7wuFmBn3pVzEaCew
 fneGZcX9Ija4MY8Ygajp8GI1aQ4mBNif+uVE7UUY17hH9qAf8vI8Joqs+4L35u8h
 SZl/IqJO9ziEmVLdy9ajgm1xW04AFE1RYRfa6aH6K6tRaIoh8bE=
 =Dmx5
 -----END PGP SIGNATURE-----

Merge tag 'x86_mtrr_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mtrr updates from Borislav Petkov:
 "A serious scrubbing of the MTRR code including adding a new map
  mechanism in order to look up the memory type of a region easily.

  Also address memory range lookup issues like returning an invalid
  memory type. Furthermore, this handles the decoupling of PAT from MTRR
  more naturally.

  All work by Juergen Gross"

* tag 'x86_mtrr_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Set default memory type for PV guests to WB
  x86/mtrr: Unify debugging printing
  x86/mtrr: Remove unused code
  x86/mm: Only check uniform after calling mtrr_type_lookup()
  x86/mtrr: Don't let mtrr_type_lookup() return MTRR_TYPE_INVALID
  x86/mtrr: Use new cache_map in mtrr_type_lookup()
  x86/mtrr: Add mtrr=debug command line option
  x86/mtrr: Construct a memory map with cache modes
  x86/mtrr: Add get_effective_type() service function
  x86/mtrr: Allocate mtrr_value array dynamically
  x86/mtrr: Move 32-bit code from mtrr.c to legacy.c
  x86/mtrr: Have only one set_mtrr() variant
  x86/mtrr: Replace vendor tests in MTRR code
  x86/xen: Set MTRR state when running as Xen PV initial domain
  x86/hyperv: Set MTRR state when running as SEV-SNP Hyper-V guest
  x86/mtrr: Support setting MTRR state for software defined MTRRs
  x86/mtrr: Replace size_or_mask and size_and_mask with a much easier concept
  x86/mtrr: Remove physical address size calculation
2023-06-27 13:11:32 -07:00
Linus Torvalds
4aacacee86 - Load late on both SMT threads on AMD, just like it is being done in
the early loading procedure
 
 - Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSap2cACgkQEsHwGGHe
 VUobhA/7B78dVDsm4yoOwx3TiPP1Md/i9vazbR6fyZ0EJQBMqPeujtmuKAd6L0hQ
 u+LU3j5FN4MBmdy3elqpTKW0NXTQkPdu/Syc9gwIEMtrsBj/9XbgjiHz70FgkPog
 nTkWyIqxiHC0krlXLQfD3yIeDWqSKMZbpIE33ckSdJs6xCvh78uc8cbXGjpG1+dl
 QWPjxXNWUVtz5eUqI52tQK2DN5jxzUZVb5/G4kB8icQUFLj2ji+5JB/4zJkgq59r
 nNNZ7E9kDAifPH5qS+NRBgoCrZw51l9EBClRQb2xKXaejloMVT+rK9rQrOStEgct
 gIEXRdLJeCDzW8OGrDM+FzPZqHx+IEpHduBDoOqyDLxgTpPSxbaeT3zJiAeh/+ox
 BDDh2+OrHrMRTSkEGUoILVz5Hr/UwYM7FQkscuweex9mJWKBnXC4jJggXA93w+Ej
 USJloYEyMl0sOkMKpzrNsfACl2rpH29Yefg1CJ6aJiX6lkbqSNntyHo8XIrB8jFG
 0a3r79kRhIG6AflZ5PK2DGr/KKVBi70K61+9h1vEeSNWjr31eOfyCS82T9ukcs4b
 Gmmj5lKHcJMjoD6IGjV7CZA/F6SPXwZjdoz4Py5GzDDf4uVfvr0P92h2KUhULAP6
 GlfkSX9D5pLYF0Q0TEllKAzBBFf/0GTYmXq5Pz7BAxr8+srxBu8=
 =2i55
 -----END PGP SIGNATURE-----

Merge tag 'x86_microcode_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode loader updates from Borislav Petkov:

 - Load late on both SMT threads on AMD, just like it is being done in
   the early loading procedure

  - Cleanups

* tag 'x86_microcode_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Load late on both threads too
  x86/microcode/amd: Remove unneeded pointer arithmetic
  x86/microcode/AMD: Get rid of __find_equiv_id()
2023-06-27 12:03:44 -07:00
Linus Torvalds
19300488c9 - Address -Wmissing-prototype warnings
- Remove repeated 'the' in comments
  - Remove unused current_untag_mask()
  - Document urgent tip branch timing
  - Clean up MSR kernel-doc notation
  - Clean up paravirt_ops doc
  - Update Srivatsa S. Bhat's maintained areas
  - Remove unused extern declaration acpi_copy_wakeup_routine()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmSZ6xIACgkQaDWVMHDJ
 krB9aQ/+NjB4CiWLbrnOYj9QYG6p1GE7lfu2dzIDdmcNuiai8htopXys54Igy3Rq
 BbIoW4E0SGK5E2OD7nLe4fBA/LpsYZTwDhGUu3SiovxLOoC5qkF0Q+6aVypPJE5o
 q7kn0Eo9IDL1dO0EbJptFDJRjk3K5caEoyXJRelarjIfPRbDEhUFaybVRykMZN9I
 4AOxrlb9WFggT4gUE4+N0kWyEqdgI9/aguavmasaG4lBHZ5JAHNQPNIa8bkVSAPL
 wULAzsrGp96V3tVxdjDCzD9aumk4xlJq7gk+v7mfx013dg7Cjs074Xoi2Y+TmaC7
 fdIZiGPJIkNToW+nENVO7BYtACSQhXeVTGxLQO/HNTDc//ZWiIUoJT2U4qu/6e6F
 aAIGoLwv68H4BghS2qx6Gz+BTIfl35mcPUb75MQhu+D84QZoZWrdamCYhsvHeZzc
 uC3nojrb6PBOth9nJsRae+j1zpRe/DT2LvHSWPJgK6EygOAi05ZfYUll/6sb0vze
 IXkUrVV1BvDDVpY9/HnE8RpDCDolP0/ezK9zsw48arZtkc+Qmw2WlD/2D98E+pSb
 MJPelbVmpzWTaoR4jDzXJCXkWe7CQJ5uPQj5azAE9l7YvnxgCQP5xnm5sLU9eyLu
 RsOwRzss0+3z44x5rJi9nSxQJ0LHfTAzW8/ZmNSZGHzi0ClszK0=
 =N82i
 -----END PGP SIGNATURE-----

Merge tag 'x86_cleanups_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Dave Hansen:
 "As usual, these are all over the map. The biggest cluster is work from
  Arnd to eliminate -Wmissing-prototype warnings:

   - Address -Wmissing-prototype warnings

   - Remove repeated 'the' in comments

   - Remove unused current_untag_mask()

   - Document urgent tip branch timing

   - Clean up MSR kernel-doc notation

   - Clean up paravirt_ops doc

   - Update Srivatsa S. Bhat's maintained areas

   - Remove unused extern declaration acpi_copy_wakeup_routine()"

* tag 'x86_cleanups_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  x86/acpi: Remove unused extern declaration acpi_copy_wakeup_routine()
  Documentation: virt: Clean up paravirt_ops doc
  x86/mm: Remove unused current_untag_mask()
  x86/mm: Remove repeated word in comments
  x86/lib/msr: Clean up kernel-doc notation
  x86/platform: Avoid missing-prototype warnings for OLPC
  x86/mm: Add early_memremap_pgprot_adjust() prototype
  x86/usercopy: Include arch_wb_cache_pmem() declaration
  x86/vdso: Include vdso/processor.h
  x86/mce: Add copy_mc_fragile_handle_tail() prototype
  x86/fbdev: Include asm/fb.h as needed
  x86/hibernate: Declare global functions in suspend.h
  x86/entry: Add do_SYSENTER_32() prototype
  x86/quirks: Include linux/pnp.h for arch_pnpbios_disabled()
  x86/mm: Include asm/numa.h for set_highmem_pages_init()
  x86: Avoid missing-prototype warnings for doublefault code
  x86/fpu: Include asm/fpu/regset.h
  x86: Add dummy prototype for mk_early_pgtbl_32()
  x86/pci: Mark local functions as 'static'
  x86/ftrace: Move prepare_ftrace_return prototype to header
  ...
2023-06-26 16:43:54 -07:00
Linus Torvalds
5dfe7a7e52 - Fix a race window where load_unaligned_zeropad() could cause
a fatal shutdown during TDX private<=>shared conversion
  - Annotate sites where VM "exit reasons" are reused as hypercall
    numbers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmSZ50cACgkQaDWVMHDJ
 krAPEg//bAR0SzrjIir2eiQ7p3ktr4L7iae0odzFkW/XHam5ZJP+v9cCMLzY6zNO
 44x9Z85jZ9w34GcZ4D7D0OmTHbcDpcxckPXSFco/dK4IyYeLzUImYXEYo41YJEx9
 O4sSQMBqIyjMXej/oKBhgKHSWaV60XimvQvTvhpjXGD/45bt9sx4ZNVTi8+xVMbw
 jpktMsQcsjHctcIY2D2eUR61Ma/Vg9t6Qih51YMtbq6Nqcyhw2IKvDwIx3kQuW34
 qSW7wsyn+RfHQDpwjPgDG/6OE815Pbtzlxz+y6tB9pN88IWkA1H5Jh2CQRlMBud2
 2nVQRpqPgr9uOIeNnNI7FFd1LgTIc/v7lDPfpUH9KelOs7cGWvaRymkuhPSvWxRI
 tmjlMdFq8XcjrOPieA9WpxYKXinqj4wNXtnYGyaM+Ur/P3qWaj18PMCYMbeN6pJC
 eNYEJVk2Mt8GmiPL55aYG5+Z1F8sciLKbz8TFq5ya2z0EnSbyVvR+DReqd7zRzh6
 Bmbmx9isAzN6wWNszNt7f8XSgRPV2Ri1tvb1vixk3JLxyx2iVCUL6KJ0cZOUNy0x
 nQqy7/zMtBsFGZ/Ca8f2kpVaGgxkUFy7n1rI4psXTGBOVlnJyMz3WSi9N8F/uJg0
 Ca5W4493+txdyHSAmWBQAQuZp3RJOlhTkXe5dfjukv6Rnnw1MZU=
 =Hr3D
 -----END PGP SIGNATURE-----

Merge tag 'x86_tdx_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 tdx updates from Dave Hansen:

 - Fix a race window where load_unaligned_zeropad() could cause a fatal
   shutdown during TDX private<=>shared conversion

   The race has never been observed in practice but might allow
   load_unaligned_zeropad() to catch a TDX page in the middle of its
   conversion process which would lead to a fatal and unrecoverable
   guest shutdown.

 - Annotate sites where VM "exit reasons" are reused as hypercall
   numbers.

* tag 'x86_tdx_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix enc_status_change_finish_noop()
  x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()
  x86/mm: Allow guest.enc_status_change_prepare() to fail
  x86/tdx: Wrap exit reason with hcall_func()
2023-06-26 16:32:47 -07:00
Linus Torvalds
36db314440 Add UV platform support for sub-NUMA clustering
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmSZ5QIACgkQaDWVMHDJ
 krDAMBAAqAgqY963jaLMMRZ79IjR6hK7oqP7H2DJplWwiA9R/a67ECcEORJf5o33
 DFS4W9CWAnBsStmR3AwCOGLOA/LB8n7mbo7w4/V9X4HeMV3u3rZfr+FV7e4EkCf9
 1YLwjRzuQeNPXzgM49Tn0ncsL2LsSGX6zdedvpEJfAfHrnKQJqNAx3xnmWBnBqV5
 Wrp7qVfHgyxPo2V0+8O8eXrSbVPHnzDb5YwqF5dqPpDuZooGxWbMm6MYHXbObqmN
 wyU2TjShNXXigBzZvSdpxu7Kdke0Rp2xPmsmxBjBfqYn4I1UcjHiXGLMxGxspMSt
 RF+PzpYAybfgJbHMFojPULqI+XVfMv98U+QZ8l7DvSIB6S82DWLNyPrEeBf/QrA1
 keW4/AK8pRRtvmQcV667CbzXJ/4vb0Ox/5jAGVTSBwfs9RDyr1YMFitHSstD+L5+
 PNuFN59JW8FR+TF4wUNUMGe/XIcf06IpaQttwYfv+OsM4D0O5a8SSXbvpn8PHPnl
 o71z91W6PYglufpLj9yI04e/61oE7q2u8qskH0UuEZbk+seK0tLTFgQbO82C0hBP
 Th9Rg9fHVe1QhpAkSf3DKUi07WJ9s7JAv6LNd5qD8jOv1ynCaXV4w2uPCq8HNTdv
 h1h8GftBsdNq7C0BOaQ1zlJ0YWM7LeFyZhFhLHRUygdgtLjx0iw=
 =17jK
 -----END PGP SIGNATURE-----

Merge tag 'x86_platform_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Dave Hansen:
 "Allow CPUs in SGX/HPE Ultraviolet to start using Sub-NUMA clustering
  (SNC) mode. SNC has been around outside the UV world for a while but
  evidently never worked on UV systems.

  SNC is rather notorious for breaking bad assumptions of a 1:1
  relationship between physical sockets and NUMA nodes. The UV code was
  rather prolific with these assumptions and took quite a bit of
  refactoring to remove them"

* tag 'x86_platform_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Update UV[23] platform code for SNC
  x86/platform/uv: Remove remaining BUG_ON() and BUG() calls
  x86/platform/uv: UV support for sub-NUMA clustering
  x86/platform/uv: Helper functions for allocating and freeing conversion tables
  x86/platform/uv: When searching for minimums, start at INT_MAX not 99999
  x86/platform/uv: Fix printed information in calc_mmioh_map
  x86/platform/uv: Introduce helper function uv_pnode_to_socket.
  x86/platform/uv: Add platform resolving #defines for misc GAM_MMIOH_REDIRECT*
2023-06-26 16:26:44 -07:00
Linus Torvalds
a3d763f0b3 Add Hyper-V interrupts to /proc/stat
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmSZ2QoACgkQaDWVMHDJ
 krAx4g/8D2B9vjL+1QoyUyCF0VUU8BZeYhWXe/wDQw0v8Zhdv4ksSYIDMZtWkP2z
 Opake9E+bfbM1edrplFOPhz3J2avzrILhc4pmhv2XvKctuj3HoVa0icEIFIK+CWA
 wBbxDP2byG0h42n+qfAA25NHh3VNrVYk0Tzs+tmZ3NL4LxC8HuNCMwhdYjVk89r3
 iFIRH4dPjkOyiBTFygmGT8sSwNJddQHZuscNCPXl5VdNh9S/OnFHwkcUNMpb+CSE
 QoaSgvYbNEHjgGb2pBBQm37Ia27a/qFMx12is1V8+z1HGZhpIYefWDqg9nPj9YTQ
 Tiih3O+F4C8GxmT2egtQ9+O7jh034iIvIMkK3zkIO707liFsTQtBc6JbjvCvoA9X
 QAcG3ngjFxb0DY/5TTaFApld4gPDvGKE/ekYKCcoFtuNXznVbgmt4OWAHAeA8twZ
 RoRFpkSAEqeueYpGq3KzBv9nZU1Jk/wuJs6MFOrbqcvtiG+M8x1lcx9DcUk/QGnR
 AHxaP+IbzdxaTP1L1J5BYCce1n0zV7GF7sfWslC9jlbdevR5/Yk0pVxFczwzQjri
 X7bhBT/IwkkPkdQM1ANSlp0CaVcC1vZI0MJmk+6y4X0nh5gkde9ZXhHUGTEkqgXC
 cQPJv+/BJZ7Tz666X7znQBbMX66GjuEeQhK2pxXkWaW1gR4Zioo=
 =yy3F
 -----END PGP SIGNATURE-----

Merge tag 'x86_irq_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq updates from Dave Hansen:
 "Add Hyper-V interrupts to /proc/stat"

* tag 'x86_irq_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Add hardcoded hypervisor interrupts to /proc/stat
2023-06-26 16:24:40 -07:00
Linus Torvalds
941d77c773 - Compute the purposeful misalignment of zen_untrain_ret automatically
and assert __x86_return_thunk's alignment so that future changes to
   the symbol macros do not accidentally break them.
 
 - Remove CONFIG_X86_FEATURE_NAMES Kconfig option as its existence is
   pointless
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZ1wgACgkQEsHwGGHe
 VUrXlRAAhIonFM1suIHo6w085jY5YA1XnsziJr/bT3e16FdHrF1i3RBEX4ml0m3O
 ADwa9dMsC9UJIa+/TKRNFfQvfRcLE/rsUKlS1Rluf/IRIxuSt/Oa4bFQHGXFRwnV
 eSlnWTNiaWrRs/vJEYAnMOe98oRyElHWa9kZ7K5FC+Ksfn/WO1U1RQ2NWg2A2wkN
 8MHJiS41w2piOrLU/nfUoI7+esHgHNlib222LoptDGHuaY8V2kBugFooxAEnTwS3
 PCzWUqCTgahs393vbx6JimoIqgJDa7bVdUMB0kOUHxtpbBiNdYYVy6e7UKnV1yjB
 qP3v9jQW4+xIyRmlFiErJXEZx7DjAIP5nulGRrUMzRfWEGF8mdRZ+ugGqFMHCeC8
 vXI+Ixp2vvsfhG3N/algsJUdkjlpt3hBpElRZCfR08M253KAbAmUNMOr4sx4RPi5
 ymC+pLIHd1K0G9jiZaFnOMaY71gAzWizwxwjFKLQMo44q+lpNJvsVO00cr+9RBYj
 LQL2APkONVEzHPMYR/LrXCslYaW//DrfLdRQjNbzUTonxFadkTO2Eu8J90B/5SFZ
 CqC1NYKMQPVFeg4XuGWCgZEH+jokCGhl8vvmXClAMcOEOZt0/s4H89EKFkmziyon
 L1ZrA/U72gWV8EwD7GLtuFJmnV4Ayl/hlek2j0qNKaj6UUgTFg8=
 =LcUq
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Borislav Petkov:

 - Compute the purposeful misalignment of zen_untrain_ret automatically
   and assert __x86_return_thunk's alignment so that future changes to
   the symbol macros do not accidentally break them.

 - Remove CONFIG_X86_FEATURE_NAMES Kconfig option as its existence is
   pointless

* tag 'x86_cpu_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/retbleed: Add __x86_return_thunk alignment checks
  x86/cpu: Remove X86_FEATURE_NAMES
  x86/Kconfig: Make X86_FEATURE_NAMES non-configurable in prompt
2023-06-26 15:42:34 -07:00