linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-08-28 09:22:08 +00:00

Author	SHA1	Message	Date
Ard Biesheuvel	ad3d09b547	ARM: memset: clean up unwind annotations The memset implementation carves up the code in different sections, each covered with their own unwind info. In this case, it is done in a way similar to how the compiler might do it, to disambiguate between parts where the return address is in LR and the SP is unmodified, and parts where a stack frame is live, and the unwinder needs to know the size of the stack frame and the location of the return address within it. Only the placement of the unwind directives is slightly odd: the stack pushes are placed in the wrong sections, which may confuse the unwinder when attempting to unwind with PC pointing at the stack push in question. So let's fix this up, by reordering the directives and instructions as appropriate. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:32 +01:00
Ard Biesheuvel	ccb81601ac	ARM: memmove: use frame pointer as unwind anchor The memmove routine is a bit unusual in the way it manages the stack pointer: depending on the execution path through the function, the SP assumes different values as different subsets of the register file are preserved and restored again. This is problematic when it comes to EHABI unwind info, as it is not instruction accurate, and does not allow tracking the SP value as it changes. Commit `207a6cb069` ("ARM: 8224/1: Add unwinding support for memmove function") addressed this by carving up the function in different chunks as far as the unwinder is concerned, and keeping a set of unwind directives for each of them, each corresponding with the state of the stack pointer during execution of the chunk in question. This not only duplicates unwind info unnecessarily, but it also complicates unwinding the stack upon overflow. Instead, let's do what the compiler does when the SP is updated halfway through a function, which is to use a frame pointer and emit the appropriate unwind directives to communicate this to the unwinder. Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's avoid touching R7 in the body of the function, so that Thumb-2 can use it as the frame pointer. R11 was not modified in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:32 +01:00
Ard Biesheuvel	ba999a0402	ARM: memcpy: use frame pointer as unwind anchor The memcpy template is a bit unusual in the way it manages the stack pointer: depending on the execution path through the function, the SP assumes different values as different subsets of the register file are preserved and restored again. This is problematic when it comes to EHABI unwind info, as it is not instruction accurate, and does not allow tracking the SP value as it changes. Commit `279f487e0b` ("ARM: 8225/1: Add unwinding support for memory copy functions") addressed this by carving up the function in different chunks as far as the unwinder is concerned, and keeping a set of unwind directives for each of them, each corresponding with the state of the stack pointer during execution of the chunk in question. This not only duplicates unwind info unnecessarily, but it also complicates unwinding the stack upon overflow. Instead, let's do what the compiler does when the SP is updated halfway through a function, which is to use a frame pointer and emit the appropriate unwind directives to communicate this to the unwinder. Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's avoid touching R7 in the body of the template, so that Thumb-2 can use it as the frame pointer. R11 was not modified in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Keith Packard <keithpac@amazon.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:32 +01:00
Ard Biesheuvel	0b78f2e92d	ARM: call_with_stack: add unwind support Restructure the code and add the unwind annotations so that both the frame pointer unwinder as well as the EHABI unwind info based unwinder will be able to follow the call stack through call_with_stack(). Since GCC and Clang use different formats for the stack frame, two methods are implemented: a GCC version that pushes fp, sp, lr and pc for compatibility with the frame pointer unwinder, and a second version that works with Clang, as well as with the EHABI unwinder both in ARM and Thumb2 modes. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Keith Packard <keithpac@amazon.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:32 +01:00
Ard Biesheuvel	d4664b6c98	ARM: implement IRQ stacks Now that we no longer rely on the stack pointer to access the current task struct or thread info, we can implement support for IRQ stacks cleanly as well. Define a per-CPU IRQ stack and switch to this stack when taking an IRQ, provided that we were not already using that stack in the interrupted context. This is never the case for IRQs taken from user space, but ones taken while running in the kernel could fire while one taken from user space has not completed yet. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Keith Packard <keithpac@amazon.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:31 +01:00
Ard Biesheuvel	eae9523fdd	ARM: backtrace-clang: avoid crash on bogus frame pointer The Clang backtrace code dereferences the link register value pulled from the stack to decide whether the caller was a branch-and-link instruction, in order to subsequently decode the offset to find the start of the calling function. Unlike other loads in this routine, this one is not protected by a fixup, and may therefore cause a crash if the address in question is bogus. So let's fix this, by treating the fault as a failure to decode the 'bl' instruction. To avoid a label renum, reuse a fixup label that guards an instruction that cannot fault to begin with. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M	2021-12-03 15:11:31 +01:00
Linus Torvalds	35776f1051	ARM development updates for 5.15: - Rename "mod_init" and "mod_exit" so that initcall debug output is actually useful (Randy Dunlap) - Update maintainers entries for linux-arm-kernel to indicate it is moderated for non-subscribers (Randy Dunlap) - Move install rules to arch/arm/Makefile (Masahiro Yamada) - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij) - Don't warn about atags_to_fdt() stack size (David Heidelberg) - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann) - Get rid of set_fs() usage (Arnd Bergmann) - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmE6GkAACgkQ9OeQG+St rGS7HhAAokcdC80ZOJJ+vT/J4sqpTdfTnJmImhkKOKgcw9yBFt7JBuA/6mp6/EV0 2Jd2RpeKG3S8PRlMWE4hGmyIla94r0olDvdh57+4AB/xrSfPO7l7EiaW2xLi0i3F KMysXxxKgbfckoNqPtiYF71cKkUKbZa169t8PyiiW5XYVQncnVGIbmEy69MJCg9n 08NUtkKoDgHkS6hXDVDLoFsGJX5P7X5IDPx6og233qBWRzWgcn1NURfJKD0F7/l+ UPnftUAF8JZp0rhtF2RH1IOu2v2MOVUsrK7D5OjzUEdMSleTN2oX3hmF4HPsG8eJ LeTKJfxoiX3JdWRlmUjomRU6eDqLAIMKsZ0wWoupQTaCq3WHs/mnxEOKY9n/UYGk eQdgb/EQQ5gDUok2WQOxG+Q85s29d14isQnoNa1D0O2YzTK7JiQ6YrASkZWVNLnT Zuw5vDtKk+7NV7QczTl9nHnPWIsRaZr40MXbTIROUO+aPoTxt6lPkv/dqUltrbEg 6Ix/8XsbtAgz8/UEDNz69RYA2DyzDBTO5VLdJutDsXliTAkY+HkqcORTFd72BvWX JEO/xg037a8x5vGpu/t0s+nmDgfy79Yi21u7i3MSjf2FiH09bOUhf7tiuhHVzb97 3po8S/YRiIsJWC1NpMpYFBYeCtJonMJycM05ff6MrLyvLYU2xbs= =Tx+y -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM development updates from Russell King: - Rename "mod_init" and "mod_exit" so that initcall debug output is actually useful (Randy Dunlap) - Update maintainers entries for linux-arm-kernel to indicate it is moderated for non-subscribers (Randy Dunlap) - Move install rules to arch/arm/Makefile (Masahiro Yamada) - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij) - Don't warn about atags_to_fdt() stack size (David Heidelberg) - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann) - Get rid of set_fs() usage (Arnd Bergmann) - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven) * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9118/1: div64: Remove always-true __div64_const32_is_OK() duplicate ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK() ARM: 9116/1: unified: Remove check for gcc < 4 ARM: 9110/1: oabi-compat: fix oabi epoll sparse warning ARM: 9113/1: uaccess: remove set_fs() implementation ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault ARM: 9111/1: oabi-compat: rework fcntl64() emulation ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation ARM: 9107/1: syscall: always store thread_info->abi_syscall ARM: 9109/1: oabi-compat: add epoll_pwait handler ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs() ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault ARM: 9105/1: atags_to_fdt: don't warn about stack size ARM: 9103/1: Drop ARCH_NR_GPIOS definition ARM: 9102/1: move theinstall rules to arch/arm/Makefile ARM: 9100/1: MAINTAINERS: mark all linux-arm-kernel@infradead list as moderated ARM: 9099/1: crypto: rename 'mod_init' & 'mod_exit' functions to be module-specific	2021-09-09 13:25:49 -07:00
Arnd Bergmann	8ac6f5d7f8	ARM: 9113/1: uaccess: remove set_fs() implementation There are no remaining callers of set_fs(), so just remove it along with all associated code that operates on thread_info->addr_limit. There are still further optimizations that can be done: - In get_user(), the address check could be moved entirely into the out of line code, rather than passing a constant as an argument, - I assume the DACR handling can be simplified as we now only change it during user access when CONFIG_CPU_SW_DOMAIN_PAN is set, but not during set_fs(). Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2021-08-20 11:39:27 +01:00
Chris Down	3370155737	printk: Userspace format indexing support We have a number of systems industry-wide that have a subset of their functionality that works as follows: 1. Receive a message from local kmsg, serial console, or netconsole; 2. Apply a set of rules to classify the message; 3. Do something based on this classification (like scheduling a remediation for the machine), rinse, and repeat. As a couple of examples of places we have this implemented just inside Facebook, although this isn't a Facebook-specific problem, we have this inside our netconsole processing (for alarm classification), and as part of our machine health checking. We use these messages to determine fairly important metrics around production health, and it's important that we get them right. While for some kinds of issues we have counters, tracepoints, or metrics with a stable interface which can reliably indicate the issue, in order to react to production issues quickly we need to work with the interface which most kernel developers naturally use when developing: printk. Most production issues come from unexpected phenomena, and as such usually the code in question doesn't have easily usable tracepoints or other counters available for the specific problem being mitigated. We have a number of lines of monitoring defence against problems in production (host metrics, process metrics, service metrics, etc), and where it's not feasible to reliably monitor at another level, this kind of pragmatic netconsole monitoring is essential. As one would expect, monitoring using printk is rather brittle for a number of reasons -- most notably that the message might disappear entirely in a new version of the kernel, or that the message may change in some way that the regex or other classification methods start to silently fail. One factor that makes this even harder is that, under normal operation, many of these messages are never expected to be hit. For example, there may be a rare hardware bug which one wants to detect if it was to ever happen again, but its recurrence is not likely or anticipated. This precludes using something like checking whether the printk in question was printed somewhere fleetwide recently to determine whether the message in question is still present or not, since we don't anticipate that it should be printed anywhere, but still need to monitor for its future presence in the long-term. This class of issue has happened on a number of occasions, causing unhealthy machines with hardware issues to remain in production for longer than ideal. As a recent example, some monitoring around blk_update_request fell out of date and caused semi-broken machines to remain in production for longer than would be desirable. Searching through the codebase to find the message is also extremely fragile, because many of the messages are further constructed beyond their callsite (eg. btrfs_printk and other module-specific wrappers, each with their own functionality). Even if they aren't, guessing the format and formulation of the underlying message based on the aesthetics of the message emitted is not a recipe for success at scale, and our previous issues with fleetwide machine health checking demonstrate as much. This provides a solution to the issue of silently changed or deleted printks: we record pointers to all printk format strings known at compile time into a new .printk_index section, both in vmlinux and modules. At runtime, this can then be iterated by looking at <debugfs>/printk/index/<module>, which emits the following format, both readable by humans and able to be parsed by machines: $ head -1 vmlinux; shuf -n 5 vmlinux # <level[,flags]> filename:line function "format" <5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n" <4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n" <6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n" <6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n" <6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n" This mitigates the majority of cases where we have a highly-specific printk which we want to match on, as we can now enumerate and check whether the format changed or the printk callsite disappeared entirely in userspace. This allows us to catch changes to printks we monitor earlier and decide what to do about it before it becomes problematic. There is no additional runtime cost for printk callers or printk itself, and the assembly generated is exactly the same. Signed-off-by: Chris Down <chris@chrisdown.name> Cc: Petr Mladek <pmladek@suse.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h} Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name	2021-07-19 11:57:48 +02:00
Fangrui Song	735e8d93dc	ARM: 9022/1: Change arch/arm/lib/mem.S to use WEAK instead of .weak Commit `d6d51a96c7` ("ARM: 9014/2: Replace string mem functions for KASan") add .weak directives to memcpy/memmove/memset to avoid collision with KASAN interceptors. This does not work with LLVM's integrated assembler (the assembly snippet `.weak memcpy ... .globl memcpy` produces a STB_GLOBAL memcpy while GNU as produces a STB_WEAK memcpy). LLVM 12 (since https://reviews.llvm.org/D90108) will error on such an overridden symbol binding. Use the appropriate WEAK macro instead. Link: https://github.com/ClangBuiltLinux/linux/issues/1190 -- Fixes: `d6d51a96c7` ("ARM: 9014/2: Replace string mem* functions for KASan") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Fangrui Song <maskray@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2020-11-12 14:53:19 +00:00
Linus Walleij	d6d51a96c7	ARM: 9014/2: Replace string mem* functions for KASan Functions like memset()/memmove()/memcpy() do a lot of memory accesses. If a bad pointer is passed to one of these functions it is important to catch this. Compiler instrumentation cannot do this since these functions are written in assembly. KASan replaces these memory functions with instrumented variants. The original functions are declared as weak symbols so that the strong definitions in mm/kasan/kasan.c can replace them. The original functions have aliases with a '__' prefix in their name, so we can call the non-instrumented variant if needed. We must use __memcpy()/__memset() in place of memcpy()/memset() when we copy .data to RAM and when we clear .bss, because kasan_early_init cannot be called before the initialization of .data and .bss. For the kernel compression and EFI libstub's custom string libraries we need a special quirk: even if these are built without KASan enabled, they rely on the global headers for their custom string libraries, which means that e.g. memcpy() will be defined to __memcpy() and we get link failures. Since these implementations are written i C rather than assembly we use e.g. __alias(memcpy) to redirected any users back to the local implementation. Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: kasan-dev@googlegroups.com Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # QEMU/KVM/mach-virt/LPAE/8G Tested-by: Florian Fainelli <f.fainelli@gmail.com> # Brahma SoCs Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> # i.MX6Q Reported-by: Russell King - ARM Linux <rmk+kernel@armlinux.org.uk> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Abbott Liu <liuwenliang@huawei.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2020-10-27 12:11:06 +00:00
Al Viro	1d60be3c25	arm: propagate the calling convention changes down to csum_partial_copy_from_user() ... and get rid of the "clean the destination on error" crap. Simplifies the fault handlers and the function itself... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-08-20 15:45:16 -04:00
Al Viro	cc44c17baf	csum_partial_copy_nocheck(): drop the last argument It's always 0. Note that we theoretically could use ~0U as well - result will be the same modulo 0xffff, _if_ the damn thing did the right thing for any value of initial sum; later we'll make use of that when convenient. However, unlike csum_and_copy_..._user(), there are instances that did not work for arbitrary initial sums; c6x is one such. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-08-20 15:45:14 -04:00
Michel Lespinasse	d8ed45c5dc	mmap locking API: use coccinelle to convert mmap_sem rwsem call sites This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock \| -down_write +mmap_write_lock \| -down_write_killable +mmap_write_lock_killable \| -down_write_trylock +mmap_write_trylock \| -up_write +mmap_write_unlock \| -downgrade_write +mmap_write_downgrade \| -down_read +mmap_read_lock \| -down_read_killable +mmap_read_lock_killable \| -down_read_trylock +mmap_read_trylock \| -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:14 -07:00
Dmitry Safonov	5489ab50c2	arm/asm: add loglvl to c_backtrace() Currently, the log-level of show_stack() depends on a platform realization. It creates situations where the headers are printed with lower log level or higher than the stacktrace (depending on a platform or user). Furthermore, it forces the logic decision from user to an architecture side. In result, some users as sysrq/kdb/etc are doing tricks with temporary rising console_loglevel while printing their messages. And in result it not only may print unwanted messages from other CPUs, but also omit printing at all in the unlucky case where the printk() was deferred. Introducing log-level parameter and KERN_UNSUPPRESSED [1] seems an easier approach than introducing more printk buffers. Also, it will consolidate printings with headers. Add log level argument to c_backtrace() as a preparation for introducing show_stack_loglvl(). [1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/T/#u Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200418201944.482088-5-dima@arista.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:10 -07:00
Mike Rapoport	84e6ffb2c4	arm: add support for folded p4d page tables Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate, and remove __ARCH_USE_5LEVEL_HACK. [rppt@linux.ibm.com: fix kexec] Link: http://lkml.kernel.org/r/20200508174232.GA759899@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: James Morse <james.morse@arm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200414153455.21744-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-04 19:06:21 -07:00
Kees Cook	f87b1c49bc	ARM: 8958/1: rename missed uaccess .fixup section When the uaccess .fixup section was renamed to .text.fixup, one case was missed. Under ld.bfd, the orphaned section was moved close to .text (since they share the "ax" bits), so things would work normally on uaccess faults. Under ld.lld, the orphaned section was placed outside the .text section, making it unreachable. Link: https://github.com/ClangBuiltLinux/linux/issues/282 Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1020633#c44 Link: https://lore.kernel.org/r/nycvar.YSQ.7.76.1912032147340.17114@knanqh.ubzr Link: https://lore.kernel.org/lkml/202002071754.F5F073F1D@keescook/ Fixes: `c4a84ae39b` ("ARM: 8322/1: keep .text and .fixup regions closer together") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2020-02-21 17:03:21 +00:00
Linus Torvalds	8808cf8cbc	ARM updates for 5.4-rc1: - fix various clang build and cppcheck issues - switch ARM to use new common outgoing-CPU-notification code - add some additional explanation about the boot code - kbuild "make clean" fixes - get rid of another "(____ptrval____)", this time for the VDSO code - avoid treating cache maintenance faults as a write - add a frame pointer unwinder implementation for clang - add EDAC support for Aurora L2 cache - improve robustness of adjust_lowmem_bounds() finding the bounds of lowmem. - add reset control for AMBA primecell devices -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAXYZCdvTnkBvkraxkAQK7vQ//UO0XJ1InSLnWzPYuNwJGcCmzHIg6p40A VxnvDTVxZH6UKDhBg8xx+gpPOhwZElGyc0H563p5jgmjzbIesESS5Xy3hUUMkQ9y A6Ta9Nk+NhL+j9O9VtcOk90oQJsLuVyYtHTfk6Wl9xaVLjM1OALWNzCSDqXIPTjF qEhTRahlv9Nc9aisFJAPduf/zQx9ULaZVvDzTo6clXSD7ieSy0MZRiRbcH3MJwiY Q5AbImF49NGcNtlknPh8Gnz/4P3q+bxQDmrzki9d4Fcy2brko845q9Ca5PC+iXro fZHvs8q2+8xz4PuOddBrYebqPIIv+3W6uPlJAPjO0MQrxPTUxRBxqAkYXxwTZBx/ A79AQsbnmUSyOV4EI2lk9USmN/GF2QwGOusRoiA/XMbSVfqnVZWH5mE98dr+2vn+ rUnTq9yvSz2y6QH7+UI+7Q7T8jg4QFBBmPDfCP+yTOWqPb8u070h+VgLBr28g1JL t6VtzOeI4wyl7E/WInmoM/d8SXnjv/1yNzLBcCdvgBV94fUQAV5EP+cDGJ0hv1SJ TGywm8adf3zAa7ZUAOhBoAK3gkNqjJB28ynsH4QmBUmsKkozxoKwwb4jjbGgcoUY rYII4VyoQB/0eX5/i8u69krA+3QNRhehLWC/zM4ZK5lKfFRCnNDvLgiIEM5b59JW nBywRtpyw2I= =Evmc -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: - fix various clang build and cppcheck issues - switch ARM to use new common outgoing-CPU-notification code - add some additional explanation about the boot code - kbuild "make clean" fixes - get rid of another "(____ptrval____)", this time for the VDSO code - avoid treating cache maintenance faults as a write - add a frame pointer unwinder implementation for clang - add EDAC support for Aurora L2 cache - improve robustness of adjust_lowmem_bounds() finding the bounds of lowmem. - add reset control for AMBA primecell devices * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (24 commits) ARM: 8906/1: drivers/amba: add reset control to amba bus probe ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer ARM: 8904/1: skip nomap memblocks while finding the lowmem/highmem boundary ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address ARM: 8891/1: EDAC: armada_xp: Add support for more SoCs ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper ARM: 8890/1: l2x0: add marvell,ecc-enable property for aurora ARM: 8889/1: dt-bindings: document marvell,ecc-enable binding ARM: 8886/1: l2x0: support parity-enable/disable on aurora ARM: 8885/1: aurora-l2: add defines for parity and ECC registers ARM: 8887/1: aurora-l2: add prefix to MAX_RANGE_SIZE ARM: 8902/1: l2c: move cache-aurora-l2.h to asm/hardware ARM: 8900/1: UNWINDER_FRAME_POINTER implementation for Clang ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes ARM: 8896/1: VDSO: Don't leak kernel addresses ARM: 8895/1: visit mach-* and plat-* directories when cleaning ARM: 8894/1: boot: Replace open-coded nop with macro ARM: 8893/1: boot: Explain the 8 nops ARM: 8876/1: fix O= building with CONFIG_FPE_FASTFPE ...	2019-09-22 09:39:09 -07:00
Nathan Huckleberry	6dc5fd93b2	ARM: 8900/1: UNWINDER_FRAME_POINTER implementation for Clang The stackframe setup when compiled with clang is different. Since the stack unwinder expects the gcc stackframe setup it fails to print backtraces. This patch adds support for the clang stackframe setup. Link: https://github.com/ClangBuiltLinux/linux/issues/35 Cc: clang-built-linux@googlegroups.com Suggested-by: Tri Vo <trong@google.com> Signed-off-by: Nathan Huckleberry <nhuck@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-08-29 07:58:01 +01:00
Lvqiang Huang	6938983717	ARM: 8897/1: check stmfd instruction using right shift In the commit `ef41b5c924` ("ARM: make kernel oops easier to read"), - .word 0xe92d0000 >> 10 @ stmfd sp!, {} + .word 0xe92d0000 >> 11 @ stmfd sp!, {} then the shift need to change to 11. Signed-off-by: Lvqiang Huang <Lvqiang.Huang@unisoc.com> Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-08-23 11:32:37 +01:00
Linus Torvalds	24e44913aa	ARM: SoC platform updates SoC platform changes. Main theme this merge window: - The Netx platform (Netx 100/500) platform is removed by Linus Walleij-- the SoC doesn't have active maintainers with hardware, and in discussions with the vendor the agreement was that it's OK to remove. - Russell King has a series of patches that cleans up and refactors SA1101 and RiscPC support. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl0yKOgPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3SNEP/iJsMeeunX0P7Ym3zNFjykhspkkUmo7sEKuz NBcexnQpkm+OLgjfwT7j3kXvOs4mzMzH56J6h7dEDSHbQP1MDIgpMw6OEzMMsQTV XL1AWz1IO7Sq/mG17daPs9c75o6NYQ7pSEd/ncbjKuJQOpGsi4DyrVrhk9WdzYl2 hcs4XOzOMZgDTsXHVdWkfpHazpWxEXPCD7v5bt6ueU0YnT3csUbzOTTvw+55JxRV fYz0lg4wTMRYMQMOejpx1HXwdmbVOHLUYkCxcLUaVqMnm88q/IddJVklBbPGWAU5 Z4gFpL+FxcFhEZtu28CoubPYzf/mHDk8Ry2UWwBiRwiGoKfblomI1fpnbyrX53aE lpO5p7MfOVVV2WNxpbUND+ilbgKOREfRHd314GLPUjwudp2sTuDRZ1GAbt3JwsIM L1HesyjCtb6itCSwGsmmGsX2Wvu+WT7slpsYwHs2qklE/X1zQq0R4Jf2xUNpwqPb FqGZAtc6CCQtyF/Mcpp6OQd+cV0tgQVIw7teKol/xR1dSzN/+1zO1J9UHk9/dWUU sb5lGa/AtBrIbWxS1qLuA5bgyDqxXYDZi0y/Bu1qMHYebRW37z9kvomtzBiMNX2o SAxvr9iGPlTxTjGjRCyBVFmsbCMYLabNoL9tuuXvo+DnjFoOilTef+qePOv7ZYZX kwUyS2eu =FX6e -----END PGP SIGNATURE----- Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC platform updates from Olof Johansson: "SoC platform changes. Main theme this merge window: - The Netx platform (Netx 100/500) platform is removed by Linus Walleij-- the SoC doesn't have active maintainers with hardware, and in discussions with the vendor the agreement was that it's OK to remove. - Russell King has a series of patches that cleans up and refactors SA1101 and RiscPC support" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (47 commits) ARM: stm32: use "depends on" instead of "if" after prompt ARM: sa1100: convert to common clock framework ARM: exynos: Cleanup cppcheck shifting warning ARM: pxa/lubbock: remove lubbock_set_misc_wr() from global view ARM: exynos: Only build MCPM support if used arm: add missing include platform-data/atmel.h ARM: davinci: Use GPIO lookup table for DA850 LEDs ARM: OMAP2: drop explicit assembler architecture ARM: use arch_extension directive instead of arch argument ARM: imx: Switch imx7d to imx-cpufreq-dt for speed-grading ARM: bcm: Enable PINCTRL for ARCH_BRCMSTB ARM: bcm: Enable ARCH_HAS_RESET_CONTROLLER for ARCH_BRCMSTB ARM: riscpc: enable chained scatterlist support ARM: riscpc: reduce IRQ handling code ARM: riscpc: move RiscPC assembly files from arch/arm/lib to mach-rpc ARM: riscpc: parse video information from tagged list ARM: riscpc: add ecard quirk for Atomwide 3port serial card MAINTAINERS: mvebu: Add git entry soc: ti: pm33xx: Add a print while entering RTC only mode with DDR in self-refresh ARM: OMAP2+: Make some variables static ...	2019-07-19 17:05:08 -07:00
Olof Johansson	7cba7cacee	Merge branch 'for-arm-soc' of git://git.armlinux.org.uk/~rmk/linux-arm into arm/soc * 'for-arm-soc' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits) ARM: sa1100: convert to common clock framework ARM: riscpc: enable chained scatterlist support ARM: riscpc: reduce IRQ handling code ARM: riscpc: move RiscPC assembly files from arch/arm/lib to mach-rpc ARM: riscpc: parse video information from tagged list ARM: riscpc: add ecard quirk for Atomwide 3port serial card ARM: sa1100/neponset: convert serial to use gpiod APIs ARM: sa1100/hackkit: remove empty serial mctrl functions ARM: sa1100/badge4: remove commented out modem control initialisers ARM: sa1100/h3xxx: convert serial to gpiod APIs ARM: sa1100/assabet: convert serial to gpiod APIs serial: sa1100: add note about modem control signals serial: sa1100: add support for mctrl gpios ARM: riscpc: dma: use __iomem pointers for writing DMA ARM: riscpc: dma: improve address/length writing ARM: riscpc: dma: make state a local variable ARM: riscpc: dma: eliminate "cur_sg" scatterlist usage ARM: riscpc: fix DMA ARM: riscpc: fix ecard printing ARM: riscpc: fix lack of keyboard interrupts after irq conversion ... Signed-off-by: Olof Johansson <olof@lixom.net>	2019-07-15 17:29:45 -07:00
Thomas Gleixner	d2912cb15b	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:55 +02:00
Russell King	12290cc462	ARM: riscpc: move RiscPC assembly files from arch/arm/lib to mach-rpc Move the assembly files for RiscPC from arch/arm/lib to mach-rpc so that we contain RiscPC bits in one subdirectory. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-06-11 17:42:36 +01:00
Thomas Gleixner	4505153954	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 136 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:06 +02:00
Stefan Agner	e44fc38818	ARM: 8844/1: use unified assembler in assembly files Use unified assembler syntax (UAL) in assembly files. Divided syntax is considered deprecated. This will also allow to build the kernel using LLVM's integrated assembler. Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-02-26 11:26:07 +00:00
Stefan Agner	c001899a5d	ARM: 8843/1: use unified assembler in headers Use unified assembler syntax (UAL) in headers. Divided syntax is considered deprecated. This will also allow to build the kernel using LLVM's integrated assembler. Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-02-26 11:26:06 +00:00
Stefan Agner	a216376add	ARM: 8841/1: use unified assembler in macros Use unified assembler syntax (UAL) in macros. Divided syntax is considered deprecated. This will also allow to build the kernel using LLVM's integrated assembler. Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-02-26 11:25:43 +00:00
Nathan Chancellor	de9c0d49d8	ARM: 8833/1: Ensure that NEON code always compiles with Clang While building arm32 allyesconfig, I ran into the following errors: arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with '-mfloat-abi=softfp -mfpu=neon' In file included from lib/raid6/neon1.c:27: /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2: error: "NEON support not enabled" Building V=1 showed NEON_FLAGS getting passed along to Clang but __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang only defining __ARM_NEON__ when targeting armv7, rather than armv6k, which is the '-march' value for allyesconfig. >From lib/Basic/Targets/ARM.cpp in the Clang source: // This only gets set when Neon instructions are actually available, unlike // the VFP define, hence the soft float and arch check. This is subtly // different from gcc, we follow the intent which was that it should be set // when Neon instructions are actually available. if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) { Builder.defineMacro("__ARM_NEON", "1"); Builder.defineMacro("__ARM_NEON__"); // current AArch32 NEON implementations do not support double-precision // floating-point even when it is present in VFP. Builder.defineMacro("__ARM_NEON_FP", "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP)); } Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets definined by Clang. This doesn't functionally change anything because that code will only run where NEON is supported, which is implicitly armv7. Link: https://github.com/ClangBuiltLinux/linux/issues/287 Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-02-12 15:20:09 +00:00
Stefan Agner	baf2df8e15	ARM: 8827/1: fix argument count to match macro definition The macro str8w takes 10 arguments, abort being the 10th. In this particular instantiation the abort argument is passed as 11th argument leading to an error when using LLVM's integrated assembler: <instantiation>:46:47: error: too many positional arguments str8w r0, r3, r4, r5, r6, r7, r8, r9, ip, , abort=19f ^ arch/arm/lib/copy_template.S:277:5: note: while in macro instantiation 18: forward_copy_shift pull=24 push=8 ^ The argument is not used in the macro hence this does not change code generation. Signed-off-by: Stefan Agner <stefan@agner.ch> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-02-01 21:44:13 +00:00
Vincent Whitchurch	344eb5539a	ARM: 8813/1: Make aligned 2-byte getuser()/putuser() atomic on ARMv6+ getuser() and putuser() (and there underscored variants) use two strb[t]/ldrb[t] instructions when they are asked to get/put 16-bits. This means that the read/write is not atomic even when performed to a 16-bit-aligned address. This leads to problems with vhost: vhost uses __getuser() to read the vring's 16-bit avail.index field, and if it happens to observe a partial update of the index, wrong descriptors will be used which will lead to a breakdown of the virtio communication. A similar problem exists for __putuser() which is used to write to the vring's used.index field. The reason these functions use strb[t]/ldrb[t] is because strht/ldrht instructions did not exist until ARMv6T2/ARMv7. So we should be easily able to fix this on ARMv7. Also, since all ARMv6 processors also don't actually use the unprivileged instructions anymore for uaccess (since CONFIG_CPU_USE_DOMAINS is not used) we can easily fix them too. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-11-12 10:52:04 +00:00
Vincent Whitchurch	f441882a52	ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS ARMv6+ processors do not use CONFIG_CPU_USE_DOMAINS and use privileged ldr/str instructions in copy_{from/to}_user. They are currently unnecessarily using single ldr/str instructions and can use ldm/stm instructions instead like memcpy does (but with appropriate fixup tables). This speeds up a "dd if=foo of=bar bs=32k" on a tmpfs filesystem by about 4% on my Cortex-A9. before:134217728 bytes (128.0MB) copied, 0.543848 seconds, 235.4MB/s before:134217728 bytes (128.0MB) copied, 0.538610 seconds, 237.6MB/s before:134217728 bytes (128.0MB) copied, 0.544356 seconds, 235.1MB/s before:134217728 bytes (128.0MB) copied, 0.544364 seconds, 235.1MB/s before:134217728 bytes (128.0MB) copied, 0.537130 seconds, 238.3MB/s before:134217728 bytes (128.0MB) copied, 0.533443 seconds, 240.0MB/s before:134217728 bytes (128.0MB) copied, 0.545691 seconds, 234.6MB/s before:134217728 bytes (128.0MB) copied, 0.534695 seconds, 239.4MB/s before:134217728 bytes (128.0MB) copied, 0.540561 seconds, 236.8MB/s before:134217728 bytes (128.0MB) copied, 0.541025 seconds, 236.6MB/s after:134217728 bytes (128.0MB) copied, 0.520445 seconds, 245.9MB/s after:134217728 bytes (128.0MB) copied, 0.527846 seconds, 242.5MB/s after:134217728 bytes (128.0MB) copied, 0.519510 seconds, 246.4MB/s after:134217728 bytes (128.0MB) copied, 0.527231 seconds, 242.8MB/s after:134217728 bytes (128.0MB) copied, 0.525030 seconds, 243.8MB/s after:134217728 bytes (128.0MB) copied, 0.524236 seconds, 244.2MB/s after:134217728 bytes (128.0MB) copied, 0.523659 seconds, 244.4MB/s after:134217728 bytes (128.0MB) copied, 0.525018 seconds, 243.8MB/s after:134217728 bytes (128.0MB) copied, 0.519249 seconds, 246.5MB/s after:134217728 bytes (128.0MB) copied, 0.518527 seconds, 246.9MB/s Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-11-12 10:51:59 +00:00
Russell King	3e98d24098	Merge branches 'fixes', 'misc' and 'spectre' into for-next	2018-10-10 13:53:33 +01:00
Julien Thierry	a1d09e0742	ARM: 8797/1: spectre-v1.1: harden __copy_to_user Sanitize user pointer given to __copy_to_user, both for standard version and memcopy version of the user accessor. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-10-05 10:51:15 +01:00
Julien Thierry	afaf6838f4	ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization Introduce C and asm helpers to sanitize user address, taking the address range they target into account. Use asm helper for existing sanitization in __copy_from_user(). Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-10-05 10:51:15 +01:00
Russell King	c61b466d4f	Merge branches 'fixes', 'misc' and 'spectre' into for-linus Conflicts: arch/arm/include/asm/uaccess.h Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-08-13 16:28:50 +01:00
Russell King	a3c0f84765	ARM: spectre-v1: mitigate user accesses Spectre variant 1 attacks are about this sequence of pseudo-code: index = load(user-manipulated pointer); access(base + index * stride); In order for the cache side-channel to work, the access() must me made to memory which userspace can detect whether cache lines have been loaded. On 32-bit ARM, this must be either user accessible memory, or a kernel mapping of that same user accessible memory. The problem occurs when the load() speculatively loads privileged data, and the subsequent access() is made to user accessible memory. Any load() which makes use of a user-maniplated pointer is a potential problem if the data it has loaded is used in a subsequent access. This also applies for the access() if the data loaded by that access is used by a subsequent access. Harden the get_user() accessors against Spectre attacks by forcing out of bounds addresses to a NULL pointer. This prevents get_user() being used as the load() step above. As a side effect, put_user() will also be affected even though it isn't implicated. Also harden copy_from_user() by redoing the bounds check within the arm_copy_from_user() code, and NULLing the pointer if out of bounds. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-08-02 17:41:38 +01:00
Masami Hiramatsu	0d73c3f8e7	ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions Since do_undefinstr() uses get_user to get the undefined instruction, it can be called before kprobes processes recursive check. This can cause an infinit recursive exception. Prohibit probing on get_user functions. Fixes: `24ba613c9d` ("ARM kprobes: core code") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-05-19 11:35:56 +01:00
Russell King	3a175cdf43	Merge branches 'fixes', 'misc', 'sa1111' and 'sa1100-for-next' into for-next	2018-01-21 15:38:10 +00:00
Nicolas Pitre	ff5fdafc9e	ARM: 8745/1: get rid of __memzero() The __memzero assembly code is almost identical to memset's except for two orr instructions. The runtime performance of __memset(p, n) and memset(p, 0, n) is accordingly almost identical. However, the memset() macro used to guard against a zero length and to call __memzero at compile time when the fill value is a constant zero interferes with compiler optimizations. Arnd found tha the test against a zero length brings up some new warnings with gcc v8: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82103 And successively rremoving the test against a zero length and the call to __memzero optimization produces the following kernel sizes for defconfig with gcc 6: text data bss dec hex filename 12248142 6278960 413588 18940690 1210312 vmlinux.orig 12244474 6278960 413588 18937022 120f4be vmlinux.no_zero_test 12239160 6278960 413588 18931708 120dffc vmlinux.no_memzero So it is probably not worth keeping __memzero around given that the compiler can do a better job at inlining trivial memset(p,0,n) on its own. And the memset code already handles a zero length just fine. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2018-01-21 15:37:56 +00:00
Chunyan Zhang	36b0cb84ee	ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch An additional 'ip' will be pushed to the stack, for restoring the DACR later, if CONFIG_CPU_SW_DOMAIN_PAN defined. However, the fixup still get the err_ptr by add #8*4 to sp, which results in the fact that the code area pointed by the LR will be overwritten, or the kernel will crash if CONFIG_DEBUG_RODATA is enabled. This patch fixes the stack mismatch. Fixes: `a5e090acbf` ("ARM: software-based priviledged-no-access support") Signed-off-by: Lvqiang Huang <Lvqiang.Huang@spreadtrum.com> Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2017-12-17 22:20:39 +00:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Matthew Wilcox	fd1d362600	ARM: implement memset32 & memset64 Reuse the existing optimised memset implementation to implement an optimised memset32 and memset64. Link: http://lkml.kernel.org/r/20170720184539.31609-5-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Minchan Kim <minchan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-08 18:26:48 -07:00
Al Viro	db68ce10c4	new helper: uaccess_kernel() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-03-28 16:43:25 -04:00
Kees Cook	9e34404818	ARM: 8658/1: uaccess: fix zeroing of 64-bit get_user() The 64-bit get_user() wasn't clearing the high word due to a typo in the error handler. The exception handler entry was already correct, though. Noticed during recent usercopy test additions in lib/test_user_copy.c. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2017-02-16 15:58:32 +00:00
Russell King	41884629fe	Merge branches 'clkdev', 'fixes', 'misc' and 'sa1100-base' into for-linus	2016-12-14 11:13:46 +00:00
Russell King	8478132a87	Revert "arm: move exports to definitions" This reverts commit `4dd1837d75`. Moving the exports for assembly code into the assembly files breaks KSYM trimming, but also breaks modversions. While fixing the KSYM trimming is trivial, fixing modversions brings us to a technically worse position that we had prior to the above change: - We end up with the prototype definitions divorsed from everything else, which means that adding or removing assembly level ksyms become more fragile: * if adding a new assembly ksyms export, a missed prototype in asm-prototypes.h results in a successful build if no module in the selected configuration makes use of the symbol. * when removing a ksyms export, asm-prototypes.h will get forgotten, with armksyms.c, you'll get a build error if you forget to touch the file. - We end up with the same amount of include files and prototypes, they're just in a header file instead of a .c file with their exports. As for lines of code, we don't get much of a size reduction: (original commit) 47 files changed, 131 insertions(+), 208 deletions(-) (fix for ksyms trimming) 7 files changed, 18 insertions(+), 5 deletions(-) (two fixes for modversions) 1 file changed, 34 insertions(+) 3 files changed, 7 insertions(+), 2 deletions(-) which results in a net total of only 25 lines deleted. As there does not seem to be much benefit from this change of approach, revert the change. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2016-11-23 10:00:03 +00:00
Russell King	24c66dfd56	ARM: fix backtrace Recent kernels have changed their behaviour to be more inconsistent when handling printk continuations. With todays kernels, the output looks sane on the console, but dmesg splits individual printk()s which do not have the KERN_CONT prefix into separate lines. Since the assembly code is not trivial to add the KERN_CONT, and we ideally want to avoid using KERN_CONT (as multiple printk()s can race between different threads), convert the assembly dumping the register values to C code, and have the C code build the output a line at a time before dumping to the console. This avoids the KERN_CONT issue, and also avoids situations where the output is intermixed with other console activity. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2016-11-15 15:25:39 +00:00
Nicolas Pitre	207b1150c0	ARM: 8619/1: udelay: document the various constants Explain where the value for UDELAY_MULT and UDELAY_SHIFT come from. Also fix/clarify some comments pertaining to their usage in the assembly code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-10-19 10:52:36 +01:00
Linus Torvalds	b26b5ef5ec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more misc uaccess and vfs updates from Al Viro: "The rest of the stuff from -next (more uaccess work) + assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: score: traps: Add missing include file to fix build error fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths fs/super.c: fix race between freeze_super() and thaw_super() overlayfs: Fix setting IOP_XATTR flag iov_iter: kernel-doc import_iovec() and rw_copy_check_uvector() blackfin: no access_ok() for __copy_{to,from}_user() arm64: don't zero in __copy_from_user{,_inatomic} arm: don't zero in __copy_from_user_inatomic()/__copy_from_user() arc: don't leak bits of kernel stack into coredump alpha: get rid of tail-zeroing in __copy_user()	2016-10-14 18:19:05 -07:00
Al Viro	2692a71bbd	Merge branch 'work.uaccess' into for-linus	2016-10-14 20:42:44 -04:00
Linus Torvalds	84d69848c9	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - EXPORT_SYMBOL for asm source by Al Viro. This does bring a regression, because genksyms no longer generates checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is working on a patch to fix this. Plus, we are talking about functions like strcpy(), which rarely change prototypes. - Fixes for PPC fallout of the above by Stephen Rothwell and Nick Piggin - fixdep speedup by Alexey Dobriyan. - preparatory work by Nick Piggin to allow architectures to build with -ffunction-sections, -fdata-sections and --gc-sections - CONFIG_THIN_ARCHIVES support by Stephen Rothwell - fix for filenames with colons in the initramfs source by me. * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits) initramfs: Escape colons in depfile ppc: there is no clear_pages to export powerpc/64: whitelist unresolved modversions CRCs kbuild: -ffunction-sections fix for archs with conflicting sections kbuild: add arch specific post-link Makefile kbuild: allow archs to select link dead code/data elimination kbuild: allow architectures to use thin archives instead of ld -r kbuild: Regenerate genksyms lexer kbuild: genksyms fix for typeof handling fixdep: faster CONFIG_ search ia64: move exports to definitions sparc32: debride memcpy.S a bit [sparc] unify 32bit and 64bit string.h sparc: move exports to definitions ppc: move exports to definitions arm: move exports to definitions s390: move exports to definitions m68k: move exports to definitions alpha: move exports to actual definitions x86: move exports to actual definitions ...	2016-10-14 14:26:58 -07:00
Al Viro	91344493b7	arm: don't zero in __copy_from_user_inatomic()/__copy_from_user() adjust copy_from_user(), obviously Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-09-15 19:51:56 -04:00
Kees Cook	7619751f8c	ARM: 8595/2: apply more __ro_after_init Guided by grsecurity's analogous __read_only markings in arch/arm, this applies several uses of __ro_after_init to structures that are only updated during __init. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-08-12 16:47:06 +01:00
Al Viro	4dd1837d75	arm: move exports to definitions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-08-07 23:47:21 -04:00
Nicolas Pitre	215e362daf	ARM: 8306/1: loop_udelay: remove bogomips value limitation Now that we don't support ARMv3 anymore, the loop based delay code can convert microsecs into number of loops using a 64-bit multiplication and more precision. This allows us to lift the hard limit of 3355 on the bogomips value as loops_per_jiffy may now safely span the full 32-bit range. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-06-22 19:55:12 +01:00
Kirill A. Shutemov	0ebd744615	arm, thp: remove infrastructure for handling splitting PMDs With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. [arnd@arndb.de: fix unterminated ifdef in header file] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Russell King	598bcc6ea6	Merge branches 'misc' and 'misc-rc6' into for-linus	2016-01-05 11:07:28 +00:00
Nicolas Pitre	42f25bddd0	ARM: 8477/1: runtime patch udiv/sdiv instructions into __aeabi_{u}idiv() The ARM compiler inserts calls to __aeabi_idiv() and __aeabi_uidiv() when it needs to perform division on signed and unsigned integers. If a processor has support for the sdiv and udiv instructions, the kernel may overwrite the beginning of those functions with those instructions and a "bx lr" to get better performance. To ensure that those functions are aligned to a 32-bit word for easier patching (which might not always be the case in Thumb mode) and that the two patched instructions end up in the same cache line, a 8-byte alignment is enforced when ARM_PATCH_IDIV is selected. This was heavily inspired by a previous patch from Stephen Boyd. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-12-17 10:29:01 +00:00
Russell King	c014953d84	ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN The uaccess_with_memcpy() code is currently incompatible with the SW PAN code: it takes locks within the region that we've changed the DACR, potentially sleeping as a result. As we do not save and restore the DACR across co-operative sleep events, can lead to an incorrect DACR value later in this code path. Reported-by: Peter Rosin <peda@axentia.se> Tested-by: Peter Rosin <peda@axentia.se> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-12-15 11:51:02 +00:00
Stephen Boyd	6f56a68d0b	ARM: 8438/1: Add unwinding to __clear_user_std() Add unwinding annotations so that unwinding from this function works properly. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-10-03 16:36:45 +01:00
Russell King	40d3f02851	Merge branches 'cleanup', 'fixes', 'misc', 'omap-barrier' and 'uaccess' into for-linus	2015-09-03 15:28:37 +01:00
Russell King	a5e090acbf	ARM: software-based priviledged-no-access support Provide a software-based implementation of the priviledged no access support found in ARMv8.1. Userspace pages are mapped using a different domain number from the kernel and IO mappings. If we switch the user domain to "no access" when we enter the kernel, we can prevent the kernel from touching userspace. However, the kernel needs to be able to access userspace via the various user accessor functions. With the wrapping in the previous patch, we can temporarily enable access when the kernel needs user access, and re-disable it afterwards. This allows us to trap non-intended accesses to userspace, eg, caused by an inadvertent dereference of the LIST_POISON* values, which, with appropriate user mappings setup, can be made to succeed. This in turn can allow use-after-free bugs to be further exploited than would otherwise be possible. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-08-26 20:34:24 +01:00
Russell King	3fba7e23f7	ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() Provide uaccess_save_and_enable() and uaccess_restore() to permit control of userspace visibility to the kernel, and hook these into the appropriate places in the kernel where we need to access userspace. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-08-25 16:14:43 +01:00
Nicolas Pitre	0f64b247e6	ARM: 8414/1: __copy_to_user_memcpy: fix mmap semaphore usage The mmap semaphore should not be taken when page faults are disabled. Since pagefault_disable() no longer disables preemption, we now need to use faulthandler_disabled() in place of in_atomic(). Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-08-18 13:59:59 +01:00
Russell King	1bd46782d0	ARM: avoid unwanted GCC memset()/memcpy() optimisations for IO variants We don't want GCC optimising our memset_io(), memcpy_fromio() or memcpy_toio() variants, so we must not call one of the standard functions. Provide a separate name for our assembly memcpy() and memset() functions, and use that instead, thereby bypassing GCC's ability to optimise these operations. GCCs optimisation may introduce unaligned accesses which are invalid for device mappings. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-07-03 20:46:15 +01:00
Linus Torvalds	e8a0b37d28	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: "Bigger items included in this update are: - A series of updates from Arnd for ARM randconfig build failures - Updates from Dmitry for StrongARM SA-1100 to move IRQ handling to drivers/irqchip/ - Move ARMs SP804 timer to drivers/clocksource/ - Perf updates from Mark Rutland in preparation to move the ARM perf code into drivers/ so it can be shared with ARM64. - MCPM updates from Nicolas - Add support for taking platform serial number from DT - Re-implement Keystone2 physical address space switch to conform to architecture requirements - Clean up ARMv7 LPAE code, which goes in hand with the Keystone2 changes. - L2C cleanups to avoid unlocking caches if we're prevented by the secure support to unlock. - Avoid cleaning a potentially dirty cache containing stale data on CPU initialisation - Add ARM-only entry point for secondary startup (for machines that can only call into a Thumb kernel in ARM mode). Same thing is also done for the resume entry point. - Provide arch_irqs_disabled via asm-generic - Enlarge ARMv7M vector table - Always use BFD linker for VDSO, as gold doesn't accept some of the options we need. - Fix an incorrect BSYM (for Thumb symbols) usage, and convert all BSYM compiler macros to a "badr" (for branch address). - Shut up compiler warnings provoked by our cmpxchg() implementation. - Ensure bad xchg sizes fail to link" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (75 commits) ARM: Fix build if CLKDEV_LOOKUP is not configured ARM: fix new BSYM() usage introduced via for-arm-soc branch ARM: 8383/1: nommu: avoid deprecated source register on mov ARM: 8391/1: l2c: add options to overwrite prefetching behavior ARM: 8390/1: irqflags: Get arch_irqs_disabled from asm-generic ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap ARM: 8388/1: tcm: Don't crash when TCM banks are protected by TrustZone ARM: 8384/1: VDSO: force use of BFD linker ARM: 8385/1: VDSO: group link options ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations ARM: remove __bad_xchg definition ARM: 8369/1: ARMv7M: define size of vector table for Vybrid ARM: 8382/1: clocksource: make ARM_TIMER_SP804 depend on GENERIC_SCHED_CLOCK ARM: 8366/1: move Dual-Timer SP804 driver to drivers/clocksource ARM: 8365/1: introduce sp804_timer_disable and remove arm_timer.h inclusion ARM: 8364/1: fix BE32 module loading ARM: 8360/1: add secondary_startup_arm prototype in header file ARM: 8359/1: correct secondary_startup_arm mode ARM: proc-v7: sanitise and document registers around errata ARM: proc-v7: clean up MIDR access ...	2015-06-26 12:20:00 -07:00
Linus Torvalds	cb8a4deaf9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "As usual, mostly comment, kerneldoc and printk() fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: lpfc: Grammar s/an negative/a negative/ ARM: lib/lib1funcs.S: fix typo s/substractions/subtractions/ cx25821: cx25821-medusa-reg.h: fix 0x0x prefix lib: crc-itu-t.[ch] fix 0x0x prefix in integer constants rapidio: Fix kerneldoc and comment qla4xxx: Fix printk() in qla4_83xx_read_reset_template() and qla4_83xx_pre_loopback_config() treewide: Kconfig: fix wording / spelling usb/serial: fix grammar in Kconfig help text for FTDI_SIO megaraid_sas: fix kerneldoc netfilter: ebtables: fix comment grammar drm/radeon: fix comment isdn: fix grammar in comment ARM: KVM: fix comment	2015-06-23 14:08:54 -07:00
Antonio Ospite	82350ab167	ARM: lib/lib1funcs.S: fix typo s/substractions/subtractions/ Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: Russell King <linux@arm.linux.org.uk> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2015-05-26 15:28:34 +02:00
Russell King	14327c6628	ARM: replace BSYM() with badr assembly macro BSYM() was invented to allow us to work around a problem with the assembler, where local symbols resolved by the assembler for the 'adr' instruction did not take account of their ISA. Since we don't want BSYM() used elsewhere, replace BSYM() with a new macro 'badr', which is like the 'adr' pseudo-op, but with the BSYM() mechanics integrated into it. This ensures that the BSYM()-ification is only used in conjunction with 'adr'. Acked-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-05-08 17:33:50 +01:00
Russell King	57ca654bef	ARM: ensure delay timer has sufficient accuracy for delays We have recently had an example of someone wanting to use a 90kHz timer for the software delay loop. udelay() needs to have at least microsecond resolution to allow drivers access to a delay mechanism with a reasonable chance of delaying the period they requested within at least a 50% marging of error, especially for small delays. Discussion about the udelay() accuracy can be found at: https://lkml.org/lkml/2011/1/9/37 Reject timers which are unable to supply this level of resolution. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-04-14 22:28:07 +01:00
Ard Biesheuvel	c4a84ae39b	ARM: 8322/1: keep .text and .fixup regions closer together This moves all fixup snippets to the .text.fixup section, which is a special section that gets emitted along with the .text section for each input object file, i.e., the snippets are kept much closer to the code they refer to, which helps prevent linker failure on large kernels. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-03-29 23:11:56 +01:00
Nicolas Pitre	c25630381c	ARM: 8285/1: remove ARMv3 user access code again This code was restored with commit `080fc66fb5` ("ARM: Bring back ARMv3 IO and user access code") because the RiscPC memory bus does not understand half-word load/stores. However only the IO code needed restoring since the alternative user access code contains no half-word accesses, is already used when CONFIG_PREEMPT is set and runs faster on a StrongARM. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-01-16 14:49:08 +00:00
Lin Yongting	279f487e0b	ARM: 8225/1: Add unwinding support for memory copy functions The memory copy functions(memcpy, __copy_from_user, __copy_to_user) never had unwinding annotations added. Currently, when accessing invalid pointer by these functions occurs the backtrace shown will stop at these functions or some completely unrelated function. Add unwinding annotations in hopes of getting a more useful backtrace in following cases: 1. die on accessing invalid pointer by these functions 2. kprobe trapped at any instruction within these functions 3. interrupted at any instruction within these functions Signed-off-by: Lin Yongting <linyongting@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-11-27 16:00:25 +00:00
Lin Yongting	207a6cb069	ARM: 8224/1: Add unwinding support for memmove function The memmove function never had unwinding annotations added. Currently, when accessing invalid pointer by memmove occurs the backtrace shown will stop at memmove or some completely unrelated function. Add unwinding annotations in hopes of getting a more useful backtrace in following cases: 1. die on accessing invalid pointer by memmove 2. kprobe trapped at any instruction within memmove 3. interrupted at any instruction within memmove Signed-off-by: Lin Yongting <linyongting@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-11-27 16:00:24 +00:00
Lin Yongting	20cb6abfe0	ARM: 8223/1: Add unwinding support for __memzero function The __memzero function never had unwinding annotations added. Currently, when accessing invalid pointer by __memzero occurs the backtrace shown will stop at __memzero or some completely unrelated function. Add unwinding annotations in hopes of getting a more useful backtrace in following cases: 1. die on accessing invalid pointer by __memzero 2. kprobe trapped at any instruction within __memzero 3. interrupted at any instruction within __memzero Signed-off-by: Lin Yongting <linyongting@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-11-27 16:00:23 +00:00
Lin Yongting	c2459d35f5	ARM: 8204/1: Add unwinding support for memset function The memset function never had unwinding annotations added. Currently, when accessing NULL pointer by memset occurs the backtrace shown will stop at memset or some completely unrelated function. Add unwinding annotations in hopes of getting a more useful backtrace when accessing NULL pointer by memset, kprobe or interrupt. Signed-off-by: Lin Yongting <linyongting@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-11-21 15:24:49 +00:00
Victor Kamensky	d9981380b4	ARM: 8137/1: fix get_user BE behavior for target variable with size of 8 bytes `e38361d` 'ARM: 8091/2: add get_user() support for 8 byte types' commit broke V7 BE get_user call when target var size is 64 bit, but '*ptr' size is 32 bit or smaller. `e38361d` changed type of __r2 from 'register unsigned long' to 'register typeof(x) __r2 asm("r2")' i.e before the change even when target variable size was 64 bit, __r2 was still 32 bit. But after `e38361d` commit, for target var of 64 bit size, __r2 became 64 bit and now it should occupy 2 registers r2, and r3. The issue in BE case that r3 register is least significant word of __r2 and r2 register is most significant word of __r2. But __get_user_4 still copies result into r2 (most significant word of __r2). Subsequent code copies from __r2 into x, but for situation described it will pick up only garbage from r3 register. Special __get_user_64t_(124) functions are introduced. They are similar to corresponding __get_user_(124) function but result stored in r3 register (lsw in case of 64 bit __r2 in BE image). Those function are used by get_user macro in case of BE and target var size is 64bit. Also changed __get_user_lo8 name into __get_user_32t_8 to get consistent naming accross all cases. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Suggested-by: Daniel Thompson <daniel.thompson@linaro.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-09-12 17:38:59 +01:00
Linus Torvalds	b3345d7c57	ARM: SoC platform changes for 3.17 This is the bulk of new SoC enablement and other platform changes for 3.17: * Samsung S5PV210 has been converted to DT and multiplatform * Clock drivers and bindings for some of the lower-end i.MX 1/2 platforms * Kirkwood, one of the popular Marvell platforms, is folded into the mvebu platform code, removing mach-kirkwood. * Hwmod data for TI AM43xx and DRA7 platforms. * More additions of Renesas shmobile platform support * Removal of plat-samsung contents that can be removed with S5PV210 being multiplatform/DT-enabled and the other two old platforms being removed. New platforms (most with only basic support right now): * Hisilicon X5HD2 settop box chipset is introduced * Mediatek MT6589 (mobile chipset) is introduced * Broadcom BCM7xxx settop box chipset is introduced + as usual a lot other pieces all over the platform code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJT5Dp+AAoJEIwa5zzehBx3w1sP/0vjT/LQOmC8Lv8RW2Ley2ua hNu3HcNPnT/N40JEdU9YNv3q0fdxGgcfKj011CNN+49zPSUf1xduk2wfCAk9yV50 8Sbt1PfDGm1YyUugGN420CzI431pPoM1OGXHZHkAmg+2J286RtUi3NckB//QDbCY QhEjhpYc9SXhAOCGwmB4ab7thOljOFSPzKTLMTu3+PNI5zRPRgkDkt6w9XlsAYmB nuR271BnzsROkMzAjycwaJ3kdim7wqrMRfk8g96o0jHSF5qf4zsT5uWYYAjTxdUQ 8Ajz6zjeHe4+95TwTDcq+lCX6rDLZgwkvCAc6hFbeg0uR7Dyek0h6XMEYtwdjaiU KNPwOENrYdENNDAGRpkFp1x4h/rY9Plfru0bBo5o6t7aPBvmNeCDzRtlTtLiUNDV dG8sfDMtrS/wFHVjylDSQ60Mb+wuW0XneC8D7chY/iRhIllUYi6YXXvt+/tH5C20 oYDOWqqcDFSb0sJhE5pn4KBV82ZaHx9jMBWGLl+erg2sDX/SK8SxOkLqKYZKtKB5 0leOGE3Y+C70xt3G9HftLz2sAvvt+C8UPsApPT+dHNE401TWJOYx6LphPkQKjeeK P1iwKi+It3l+FaBypgJy/LeMQRy7EyvDBK2I5WoVL/R2qq14EmP1ui3Tthjj0bhq tBBof6P9c8OnRVj1Lz3R =5TJ6 -----END PGP SIGNATURE----- Merge tag 'soc-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform changes from Olof Johansson: "This is the bulk of new SoC enablement and other platform changes for 3.17: - Samsung S5PV210 has been converted to DT and multiplatform - Clock drivers and bindings for some of the lower-end i.MX 1/2 platforms - Kirkwood, one of the popular Marvell platforms, is folded into the mvebu platform code, removing mach-kirkwood - Hwmod data for TI AM43xx and DRA7 platforms - More additions of Renesas shmobile platform support - Removal of plat-samsung contents that can be removed with S5PV210 being multiplatform/DT-enabled and the other two old platforms being removed New platforms (most with only basic support right now): - Hisilicon X5HD2 settop box chipset is introduced - Mediatek MT6589 (mobile chipset) is introduced - Broadcom BCM7xxx settop box chipset is introduced + as usual a lot other pieces all over the platform code" * tag 'soc-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (240 commits) ARM: hisi: remove smp from machine descriptor power: reset: move hisilicon reboot code ARM: dts: Add hix5hd2-dkb dts file. ARM: debug: Rename Hi3716 to HIX5HD2 ARM: hisi: enable hix5hd2 SoC ARM: hisi: add ARCH_HISI MAINTAINERS: add entry for Broadcom ARM STB architecture ARM: brcmstb: select GISB arbiter and interrupt drivers ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs ARM: configs: enable SMP in bcm_defconfig ARM: add SMP support for Broadcom mobile SoCs Documentation: arm: misc updates to Marvell EBU SoC status Documentation: arm: add URLs to public datasheets for the Marvell Armada XP SoC ARM: mvebu: fix build without platforms selected ARM: mvebu: add cpuidle support for Armada 38x ARM: mvebu: add cpuidle support for Armada 370 cpuidle: mvebu: add Armada 38x support cpuidle: mvebu: add Armada 370 support cpuidle: mvebu: rename the driver from armada-370-xp to mvebu-v7 ARM: mvebu: export the SCU address ...	2014-08-08 11:14:29 -07:00
Daniel Thompson	e38361d032	ARM: 8091/2: add get_user() support for 8 byte types Recent contributions, including to DRM and binder, introduce 64-bit values in their interfaces. A common motivation for this is to allow the same ABI for 32- and 64-bit userspaces (and therefore also a shared ABI for 32/64 hybrid userspaces). Anyhow, the developers would like to avoid gotchas like having to use copy_from_user(). This feature is already implemented on x86-32 and the majority of other 32-bit architectures. The current list of get_user_8 hold out architectures are: arm, avr32, blackfin, m32r, metag, microblaze, mn10300, sh. Credit: My name sits rather uneasily at the top of this patch. The v1 and v2 versions of the patch were written by Rob Clark and to produce v4 I mostly copied code from Russell King and H. Peter Anvin. However I have mangled the patch sufficiently that blame is rightfully mine even if credit should more widely shared. Changelog: v5: updated to use the ret macro (requested by Russell King) v4: remove an inlined add on big endian systems (spotted by Russell King), used __ARMEB__ rather than BIG_ENDIAN (to match rest of file), cleared r3 on EFAULT during __get_user_8. v3: fix a couple of checkpatch issues v2: pass correct size to check_uaccess, and better handling of narrowing double word read with __get_user_xb() (Russell King's suggestion) v1: original Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-18 12:29:34 +01:00
Russell King	6ebbf2ce43	ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-18 12:29:04 +01:00
Peter De Schrijver	5930c1a1f7	ARM: choose highest resolution delay timer In case there are several possible delay timers, choose the one with the highest resolution. This code relies on the fact secondary CPUs have not yet been brought online when register_current_timer_delay() is called. This is ensured by implementing calibration_delay_done(), Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Stephen Warren <swarren@nvidia.com>	2014-06-16 12:48:07 -06:00
Victor Kamensky	d98b90ea22	ARM: 7990/1: asm: rename logical shift macros push pull into lspush lspull Renames logical shift macros, 'push' and 'pull', defined in arch/arm/include/asm/assembler.h, into 'lspush' and 'lspull'. That eliminates name conflict between 'push' logical shift macro and 'push' instruction mnemonic. That allows assembler.h to be included in .S files that use 'push' instruction. Suggested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-02-25 11:33:57 +00:00
Will Deacon	c32ffce0f6	ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics After a bunch of benchmarking on the interaction between dmb and pldw, it turns out that issuing the pldw after the dmb instruction can give modest performance gains (~3% atomic_add_return improvement on a dual A15). This patch adds prefetchw invocations to our barriered atomic operations including cmpxchg, test_and_xxx and futexes. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-02-25 11:30:20 +00:00
Kim Phillips	017f161a55	ARM: 7877/1: use built-in byte swap function Enable the compiler intrinsic for byte swapping on arch ARM. This allows the compiler to detect and be able to optimize out byte swappings, and has a very modest benefit on vmlinux size (Linaro gcc 4.8): text data bss dec hex filename 2840310 123932 61960 3026202 2e2d1a vmlinux-lart #orig 2840152 123932 61960 3026044 2e2c7c vmlinux-lart #builtin-bswap 6473120 314840 5616016 12403976 bd4508 vmlinux-mxs #orig 6472586 314848 5616016 12403450 bd42fa vmlinux-mxs #builtin-bswap 7419872 318372 379556 8117800 7bde28 vmlinux-imx_v6_v7 #orig 7419170 318364 379556 8117090 7bdb62 vmlinux-imx_v6_v7 #builtin-bswap Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-12-29 12:32:45 +00:00
Russell King	ef41b5c924	ARM: make kernel oops easier to read We don't need the offset for the first function name in each backtrace entry; this needlessly consumes screen space. This is virtually always the first or second instruction in the called function. Also, recognise stmfd instructions which include r10 as a valid stack saving instruction, and when dumping the registers, dump six registers per line rather than five, and fix the wrapping. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-12-29 12:32:30 +00:00
Fabio Estevam	11d4bb1bd0	ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation Currently mx53 (CortexA8) running at 1GHz reports: Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760) Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop) The original object code looks like this: 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 0000003c <__loop_delay>: 3c: e2500001 subs r0, r0, #1 40: 8afffffe bhi 3c <__loop_delay> 44: e1a0f00e mov pc, lr After adding the 'align 3' directive to __loop_delay (align to 8 bytes): 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 3c: e320f000 nop {0} 00000040 <__loop_delay>: 40: e2500001 subs r0, r0, #1 44: 8afffffe bhi 40 <__loop_delay> 48: e1a0f00e mov pc, lr 4c: e320f000 nop {0} , which now reports: Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736) Some more test results: On mx31 (ARM1136) running at 532 MHz, before the patch: Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184) On mx31 (ARM1136) running at 532 MHz after the patch: Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968) Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same BogoMIPS value before and after this patch. Reported-by: Tom Evans <tom_usenet@optusnet.com.au> Suggested-by: Tom Evans <tom_usenet@optusnet.com.au> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-11-30 22:21:03 +00:00
Will Deacon	b7ec699405	ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP Uwe reported a build failure when targetting a NOMMU platform with my recent prefetch changes: arch/arm/lib/changebit.S: Assembler messages: arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture This is due to use of the .arch_extension mp directive immediately prior to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to nothing if !CONFIG_SMP, gas will still choke on the directive. This patch fixes the issue by only emitting the sequence (including the directive) if CONFIG_SMP=y. Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-11-20 23:05:53 +00:00
Linus Torvalds	f47671e2d8	Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm Pull ARM updates from Russell King: "Included in this series are: 1. BE8 (modern big endian) changes for ARM from Ben Dooks 2. big.Little support from Nicolas Pitre and Dave Martin 3. support for LPAE systems with all system memory above 4GB 4. Perf updates from Will Deacon 5. Additional prefetching and other performance improvements from Will. 6. Neon-optimised AES implementation fro Ard. 7. A number of smaller fixes scattered around the place. There is a rather horrid merge conflict in tools/perf - I was never notified of the conflict because it originally occurred between Will's tree and other stuff. Consequently I have a resolution which Will forwarded me, which I'll forward on immediately after sending this mail. The other notable thing is I'm expecting some build breakage in the crypto stuff on ARM only with Ard's AES patches. These were merged into a stable git branch which others had already pulled, so there's little I can do about this. The problem is caused because these patches have a dependency on some code in the crypto git tree - I tried requesting a branch I can pull to resolve these, and all I got each time from the crypto people was "we'll revert our patches then" which would only make things worse since I still don't have the dependent patches. I've no idea what's going on there or how to resolve that, and since I can't split these patches from the rest of this pull request, I'm rather stuck with pushing this as-is or reverting Ard's patches. Since it should "come out in the wash" I've left them in - the only build problems they seem to cause at the moment are with randconfigs, and since it's a new feature anyway. However, if by -rc1 the dependencies aren't in, I think it'd be best to revert Ard's patches" I resolved the perf conflict roughly as per the patch sent by Russell, but there may be some differences. Any errors are likely mine. Let's see how the crypto issues work out.. * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (110 commits) ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h" ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg(). ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h ARM: 7871/1: amba: Extend number of IRQS ARM: 7887/1: Don't smp_cross_call() on UP devices in arch_irq_work_raise() ARM: 7872/1: Support arch_irq_work_raise() via self IPIs ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode ARM: 7878/1: nommu: Implement dummy early_paging_init() ARM: 7876/1: clear Thumb-2 IT state on exception handling ARM: 7874/2: bL_switcher: Remove cpu_hotplug_driver_{lock,unlock}() ARM: footbridge: fix build warnings for netwinder ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu ARM: fix misplaced arch_virt_to_idmap() ARM: 7848/1: mcpm: Implement cpu_kill() to synchronise on powerdown ARM: 7847/1: mcpm: Factor out logical-to-physical CPU translation ARM: 7869/1: remove unused XSCALE_PMU Kconfig param ARM: 7864/1: Handle 64-bit memory in case of 32-bit phys_addr_t ARM: 7863/1: Let arm_add_memory() always use 64-bit arguments ARM: 7862/1: pcpu: replace __get_cpu_var_uses ARM: 7861/1: cacheflush: consolidate single-CPU ARMv7 cache disabling code ...	2013-11-14 08:51:29 +09:00
Russell King	df762eccba	Merge branch 'devel-stable' into for-next Conflicts: arch/arm/include/asm/atomic.h arch/arm/include/asm/hardirq.h arch/arm/kernel/smp.c	2013-11-12 10:58:59 +00:00
Steven Capper	a3a9ea656d	ARM: 7858/1: mm: make UACCESS_WITH_MEMCPY huge page aware The memory pinning code in uaccess_with_memcpy.c does not check for HugeTLB or THP pmds, and will enter an infinite loop should a __copy_to_user or __clear_user occur against a huge page. This patch adds detection code for huge pages to pin_page_for_write. As this code can be executed in a fast path it refers to the actual pmds rather than the vma. If a HugeTLB or THP is found (they have the same pmd representation on ARM), the page table spinlock is taken to prevent modification whilst the page is pinned. On ARM, huge pages are only represented as pmds, thus no huge pud checks are performed. (For huge puds one would lock the page table in a similar manner as in the pmd case). Two helper functions are introduced; pmd_thp_or_huge will check whether or not a page is huge or transparent huge (which have the same pmd layout on ARM), and pmd_hugewillfault will detect whether or not a page fault will occur on write to the page. Running the following test (with the chunking from read_zero removed): $ dd if=/dev/zero of=/dev/null bs=10M count=1024 Gave: 2.3 GB/s backed by normal pages, 2.9 GB/s backed by huge pages, 5.1 GB/s backed by huge pages, with page mask=HPAGE_MASK. After some discussion, it was decided not to adopt the HPAGE_MASK, as this would have a significant detrimental effect on the overall system latency due to page_table_lock being held for too long. This could be revisited if split huge page locks are adopted. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-29 11:06:15 +00:00
Will Deacon	d779c07dd7	ARM: bitops: prefetch the destination word for write prior to strex The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch prefixes our atomic bitops implementation with prefetchw, to try and grab the line in exclusive state from the start. The testop macro is left alone, since the barrier semantics limit the usefulness of prefetching data. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2013-09-30 16:42:56 +01:00
Linus Walleij	136dfa5eda	ARM: delete mach-shark The Shark machine sub-architecture (also known as DNARD, the DIGITAL Network Appliance Reference Design) lacks a maintainer able to apply and test patches to modernize the architecture. It is suspected that the current kernel, while it compiles, does not even boot on this machine. The listed maintainer has expressed that he will not be able to spend any time on the maintenance for the coming year. So let's delete it from the kernel for now. It can always be resurrected with git revert if maintenance is resumed. As the VIA82c505 PCI adapter was only used by this architecture, that gets deleted too. Cc: arm@kernel.org Cc: Alexander Schulz <alex@shark-linux.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2013-09-17 12:34:36 +02:00
Ard Biesheuvel	9319206d71	ARM: 7835/2: fix modular build of xor_blocks() with NEON enabled Commit `0195659` introduced a NEON accelerated version of the xor_blocks() function, but it needs the changes in this patch to allow it to be built as a module rather than statically into the kernel. This patch creates a separate module xor-neon.ko which exports the NEON inner xor_blocks() functions depended upon by the regular xor.ko if it is built with CONFIG_KERNEL_MODE_NEON=y Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-09-09 15:24:47 +01:00
Russell King	b4f656eea6	Pull branch 'for-rmk' of git://git.linaro.org/people/ardbiesheuvel/linux-arm into devel-stable Comments from Ard Biesheuvel: I have included two use cases that I have been using, XOR and RAID-6 checksumming. The former gets a 60% performance boost on the NEON, the latter over 400%. ARM: add support for kernel mode NEON Adds kernel_neon_begin/end (renamed from kernel_vfp_begin/end in the previous version to de-emphasize the VFP part as VFP code that needs software assistance is not supported currently.) Introduces <asm/neon.h> and the Kconfig symbol KERNEL_MODE_NEON. This has been aligned with Catalin for arm64, so any NEON code that does not use assembly but intrinsics or the GCC vectorizer (such as my examples) can potentially be shared between arm and arm64 archs. ARM: move VFP init to an earlier boot stage This is needed so the NEON is enabled when the XOR and RAID-6 algo boot time benchmarks are run. ARM: be strict about FP exceptions in kernel mode This adds a check to vfp_support_entry() to flag unsupported uses of the NEON/VFP in kernel mode. FP exceptions (bounces) are flagged as a bug, this is because of their potentially intermittent nature. Exceptions caused by the fact that kernel_neon_begin has not been called are just routed through the undef handler. ARM: crypto: add NEON accelerated XOR implementation This is the xor_blocks() implementation built with -ftree-vectorize, 60% faster than optimized ARM code. It calls in_interrupt() to check whether the NEON flavor can be used: this should really not be necessary, but due to xor_blocks'squite generic nature, there is no telling how exactly people may be using it in the real world. lib/raid6: add ARM-NEON accelerated syndrome calculation This is a port of the RAID-6 checksumming code in altivec.uc ported to use NEON intrinsics. It is about 4x faster than the sequential code.	2013-07-22 17:46:40 +01:00
Paul Gortmaker	8bd26e3a7e	arm: delete __cpuinit/__CPUINIT usage from all ARM users The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit `5e427ec2d0` ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) and are flagged as __cpuinit -- so if we remove the __cpuinit from the arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit related content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the ARM uses of the __cpuinit macros from C code, and all __CPUINIT from assembly code. It also had two ".previous" section statements that were paired off against __CPUINIT (aka .section ".cpuinit.text") that also get removed here. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2013-07-14 19:36:52 -04:00
Ard Biesheuvel	01956597cb	ARM: crypto: add NEON accelerated XOR implementation Add a source file xor-neon.c (which is really just the reference C implementation passed through the GCC vectorizer) and hook it up to the XOR framework. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org>	2013-07-08 22:09:06 +01:00
Will Deacon	6f3d90e556	ARM: 7685/1: delay: use private ticks_per_jiffy field for timer-based delay ops Commit `70264367a2` ("ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock") fixed a problem with our timer-based delay loop, where loops_per_jiffy is scaled by cpufreq yet used directly by the timer delay ops. This patch fixes the problem in a more elegant way by keeping a private ticks_per_jiffy field in the delay ops, independent of loops_per_jiffy and therefore not subject to scaling. The loop-based delay continues to use loops_per_jiffy directly, as it should. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-04-03 16:45:50 +01:00
Nicolas Pitre	418df63ada	ARM: 7670/1: fix the memset fix Commit `455bd4c430` ("ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations") attempted to fix a compliance issue with the memset return value. However the memset itself became broken by that patch for misaligned pointers. This fixes the above by branching over the entry code from the misaligned fixup code to avoid reloading the original pointer. Also, because the function entry alignment is wrong in the Thumb mode compilation, that fixup code is moved to the end. While at it, the entry instructions are slightly reworked to help dual issue pipelines. Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-03-12 12:18:47 +00:00
Ivan Djelic	455bd4c430	ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on assumptions about the implementation of memset and similar functions. The current ARM optimized memset code does not return the value of its first argument, as is usually expected from standard implementations. For instance in the following function: void debug_mutex_lock_common(struct mutex lock, struct mutex_waiter waiter) { memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter)); waiter->magic = waiter; INIT_LIST_HEAD(&waiter->list); } compiled as: 800554d0 <debug_mutex_lock_common>: 800554d0: e92d4008 push {r3, lr} 800554d4: e1a00001 mov r0, r1 800554d8: e3a02010 mov r2, #16 ; 0x10 800554dc: e3a01011 mov r1, #17 ; 0x11 800554e0: eb04426e bl 80165ea0 <memset> 800554e4: e1a03000 mov r3, r0 800554e8: e583000c str r0, [r3, #12] 800554ec: e5830000 str r0, [r3] 800554f0: e5830004 str r0, [r3, #4] 800554f4: e8bd8008 pop {r3, pc} GCC assumes memset returns the value of pointer 'waiter' in register r0; causing register/memory corruptions. This patch fixes the return value of the assembly version of memset. It adds a 'mov' instruction and merges an additional load+store into existing load/store instructions. For ease of review, here is a breakdown of the patch into 4 simple steps: Step 1 ====== Perform the following substitutions: ip -> r8, then r0 -> ip, and insert 'mov ip, r0' as the first statement of the function. At this point, we have a memset() implementation returning the proper result, but corrupting r8 on some paths (the ones that were using ip). Step 2 ====== Make sure r8 is saved and restored when (! CALGN(1)+0) == 1: save r8: - str lr, [sp, #-4]! + stmfd sp!, {r8, lr} and restore r8 on both exit paths: - ldmeqfd sp!, {pc} @ Now <64 bytes to go. + ldmeqfd sp!, {r8, pc} @ Now <64 bytes to go. (...) tst r2, #16 stmneia ip!, {r1, r3, r8, lr} - ldr lr, [sp], #4 + ldmfd sp!, {r8, lr} Step 3 ====== Make sure r8 is saved and restored when (! CALGN(1)+0) == 0: save r8: - stmfd sp!, {r4-r7, lr} + stmfd sp!, {r4-r8, lr} and restore r8 on both exit paths: bgt 3b - ldmeqfd sp!, {r4-r7, pc} + ldmeqfd sp!, {r4-r8, pc} (...) tst r2, #16 stmneia ip!, {r4-r7} - ldmfd sp!, {r4-r7, lr} + ldmfd sp!, {r4-r8, lr} Step 4 ====== Rewrite register list "r4-r7, r8" as "r4-r8". Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-03-07 16:14:22 +00:00

1 2 3 4 5 ...

263 Commits