A number of irqchip implementations are (ab)using the irqdomain allocator
by passing a fwnode that is neither a FWNODE_OF or a FWNODE_IRQCHIP.
This is pretty bad, but it also feels pretty crap to force these drivers to
allocate their own irqchip_fwid when they already have a proper fwnode.
Instead, let's teach the irqdomain allocator about ACPI device nodes, and
add some lovely name generation code... Tested on an arm64 D05 system.
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ma Jun <majun258@huawei.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Link: http://lkml.kernel.org/r/20170707083959.10349-1-marc.zyngier@arm.com
debugfs_remove() can be called with a NULL pointer.
Fixes: 087cdfb662 ("genirq/debugfs: Add proper debugfs interface")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq timings infrastructure tracks when interrupts occur in order to
statistically predict te next interrupt event.
There is no point to track timer interrupts and try to predict them because
the next expiration time is already known. This can be avoided via the
IRQF_TIMER flag which is passed by timer drivers in request_irq(). It marks
the interrupt as timer based which alloes to ignore these interrupts in the
timings code.
Per CPU interrupts which are requested via request_percpu_+irq() have no
flag argument, so marking per cpu timer interrupts is not possible and they
get tracked pointlessly.
Add __request_percpu_irq() as a variant of request_percpu_irq() with a
flags argument and make request_percpu_irq() an inline wrapper passing
flags = 0.
The flag parameter is restricted to IRQF_TIMER as all other IRQF_ flags
make no sense for per cpu interrupts.
The next step is to convert all existing users of request_percpu_irq() and
then remove the wrapper and the underscores.
[ tglx: Massaged changelog ]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: nicolas.pitre@linaro.org
Cc: vincent.guittot@linaro.org
Cc: rafael@kernel.org
Link: http://lkml.kernel.org/r/1499344144-3964-1-git-send-email-daniel.lezcano@linaro.org
No point to do memory management from a interrupt disabled spin locked
region.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.196130646@linutronix.de
Aside of being conceptually wrong, there is also an actual (hard to
trigger and mostly theoretical) problem.
CPU0 CPU1
free_irq(X) interrupt X
spin_lock(desc->lock)
wake irq thread()
spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()
release_resources() thread_handler()
spin_unlock(desc->lock) access released resources.
synchronize_irq()
Move the release resources invocation after synchronize_irq() so it's
guaranteed that the threaded handler has finished.
Move the resource request call out of the desc->lock held region as well,
so the invocation context is the same for both request and release.
This solves the problems with those functions on RT as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.117028181@linutronix.de
The irq_request/release_resources() callbacks ar currently invoked under
desc->lock with interrupts disabled. This is a source of problems on RT and
conceptually not required.
Add a seperate mutex to struct irq_desc which allows to serialize
request/free_irq(), which can be used to move the resource functions out of
the desc->lock held region.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.039220922@linutronix.de
There is no point in having the irq_bus_lock() protection around all
callers to __setup_irq().
Move it into __setup_irq(). This is also a preparatory patch for addressing
the issues with the irq resource callbacks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214343.960949031@linutronix.de
If CONFIG_SMP=n, and gcc (e.g. 4.1.2) decides not to inline
__irq_startup_managed(), the build fails with:
kernel/built-in.o: In function `irq_startup':
(.text+0x38ed8): undefined reference to `irq_set_affinity_locked'
Fix this by forcing inlining of __irq_startup_managed().
Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1499162761-12398-1-git-send-email-geert@linux-m68k.org
Fix this build error:
kernel/irq/internals.h:440:20: error: inlining failed in call to always_inline
'irq_domain_debugfs_init': function body not available
kernel/irq/debugfs.c:202:2: note: called from here
irq_domain_debugfs_init(root_dir);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1707041124000.1712@schleppi
around. Highlights include:
- Conversion of a bunch of security documentation into RST
- The conversion of the remaining DocBook templates by The Amazing
Mauro Machine. We can now drop the entire DocBook build chain.
- The usual collection of fixes and minor updates.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZWkGAAAoJEI3ONVYwIuV6rf0P/0B3JTiVPKS/WUx53+jzbAi4
1BN7dmmuMxE1bWpgdEq+ac4aKxm07iAojuntuMj0qz/ZB1WARcmvEqqzI5i4wfq9
5MrLduLkyuWfr4MOPseKJ2VK83p8nkMOiO7jmnBsilu7fE4nF+5YY9j4cVaArfMy
cCQvAGjQzvej2eiWMGUSLHn4QFKh00aD7cwKyBVsJ08b27C9xL0J2LQyCDZ4yDgf
37/MH3puEd3HX/4qAwLonIxT3xrIrrbDturqLU7OSKcWTtGZNrYyTFbwR3RQtqWd
H8YZVg2Uyhzg9MYhkbQ2E5dEjUP4mkegcp6/JTINH++OOPpTbdTJgirTx7VTkSf1
+kL8t7+Ayxd0FH3+77GJ5RMj8LUK6rj5cZfU5nClFQKWXP9UL3IelQ3Nl+SpdM8v
ZAbR2KjKgH9KS6+cbIhgFYlvY+JgPkOVruwbIAc7wXVM3ibk1sWoBOFEujcbueWh
yDpQv3l1UX0CKr3jnevJoW26LtEbGFtC7gSKZ+3btyeSBpWFGlii42KNycEGwUW0
ezlwryDVHzyTUiKllNmkdK4v73mvPsZHEjgmme4afKAIiUilmcUF4XcqD86hISFT
t+UJLA/zEU+0sJe26o2nK6GNJzmo4oCtVyxfhRe26Ojs1n80xlYgnZRfuIYdd31Z
nwLBnwDCHAOyX91WXp9G
=cVjZ
-----END PGP SIGNATURE-----
Merge tag 'docs-4.13' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"There has been a fair amount of activity in the docs tree this time
around. Highlights include:
- Conversion of a bunch of security documentation into RST
- The conversion of the remaining DocBook templates by The Amazing
Mauro Machine. We can now drop the entire DocBook build chain.
- The usual collection of fixes and minor updates"
* tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
scripts/kernel-doc: handle DECLARE_HASHTABLE
Documentation: atomic_ops.txt is core-api/atomic_ops.rst
Docs: clean up some DocBook loose ends
Make the main documentation title less Geocities
Docs: Use kernel-figure in vidioc-g-selection.rst
Docs: fix table problems in ras.rst
Docs: Fix breakage with Sphinx 1.5 and upper
Docs: Include the Latex "ifthen" package
doc/kokr/howto: Only send regression fixes after -rc1
docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
doc: Document suitability of IBM Verse for kernel development
Doc: fix a markup error in coding-style.rst
docs: driver-api: i2c: remove some outdated information
Documentation: DMA API: fix a typo in a function name
Docs: Insert missing space to separate link from text
doc/ko_KR/memory-barriers: Update control-dependencies example
Documentation, kbuild: fix typo "minimun" -> "minimum"
docs: Fix some formatting issues in request-key.rst
doc: ReSTify keys-trusted-encrypted.txt
doc: ReSTify keys-request-key.txt
...
An interrupt behaves with a burst of activity with periodic interval of time
followed by one or two peaks of longer interval.
As the time intervals are periodic, statistically speaking they follow a normal
distribution and each interrupts can be tracked individually.
Add a mechanism to compute the statistics on all interrupts, except the
timers which are deterministic from a prediction point of view, as their
expiry time is known.
The goal is to extract the periodicity for each interrupt, with the last
timestamp and sum them, so the next event can be predicted to a certain
extent.
Taking the earliest prediction gives the expected wakeup on the system
(assuming a timer won't expire before).
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-2-git-send-email-daniel.lezcano@linaro.org
The interrupt framework gives a lot of information about each interrupt. It
does not keep track of when those interrupts occur though, which is a
prerequisite for estimating the next interrupt arrival for power management
purposes.
Add a mechanism to record the timestamp for each interrupt occurrences in a
per-CPU circular buffer to help with the prediction of the next occurrence
using a statistical model.
Each CPU can store up to IRQ_TIMINGS_SIZE events <irq, timestamp>, the
current value of IRQ_TIMINGS_SIZE is 32.
Each event is encoded into a single u64, where the high 48 bits are used
for the timestamp and the low 16 bits are for the irq number.
A static key is introduced so when the irq prediction is switched off at
runtime, the overhead is near to zero.
It results in most of the code in internals.h for inline reasons and a very
few in the new file timings.c. The latter will contain more in the next patch
which will provide the statistical model for the next event prediction.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-1-git-send-email-daniel.lezcano@linaro.org
debugfs_remove() has it's own NULL pointer check. Remove the conditional
and make irq_remove_debugfs_entry() an inline helper
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It did seem like a good idea at the time, but it never really
caught on, and auto-recursive domains remain unused 3 years after
having been introduced.
Oh well, time for a late spring cleanup.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We can have irq domains that are identified by the same fwnode
(because they are serviced by the same HW), and yet have different
functionnality (because they serve different busses, for example).
This is what we use the bus_token field.
Since we don't use this field when generating the domain name,
all the aliasing domains will get the same name, and the debugfs
file creation fails. Also, bus_token is updated by individual drivers,
and the core code is unaware of that update.
In order to sort this mess, let's introduce a helper that takes care
of updating bus_token, and regenerate the debugfs file.
A separate patch will update all the individual users.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently the irq vector spread algorithm is restricted to online CPUs,
which ties the IRQ mapping to the currently online devices and doesn't deal
nicely with the fact that CPUs could come and go rapidly due to e.g. power
management.
Instead assign vectors to all present CPUs to avoid this churn.
Build a map of all possible CPUs for a given node, as the architectures
only provide a map of all onlines CPUs. Do this dynamically on each call
for the vector assingments, which is a bit suboptimal and could be
optimized in the future by provinding a mapping from the arch code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-nvme@lists.infradead.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170603140403.27379-5-hch@lst.de
Avoid trying to add a newly online CPU to the effective affinity mask of an
started up interrupt. That interrupt will either stay on the already online
CPU or move around for no value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.431321047@linutronix.de
Many interrupt chips allow only a single CPU as interrupt target. The core
code has no knowledge about that. That's unfortunate as it could avoid
trying to readd a newly online CPU to the effective affinity mask.
Add the status flag and the necessary accessors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.352343969@linutronix.de
If a CPU goes offline, interrupts affine to the CPU are moved away. If the
outgoing CPU is the last CPU in the affinity mask the migration code breaks
the affinity and sets it it all online cpus.
This is a problem for affinity managed interrupts as CPU hotplug is often
used for power management purposes. If the affinity is broken, the
interrupt is not longer affine to the CPUs to which it was allocated.
The affinity spreading allows to lay out multi queue devices in a way that
they are assigned to a single CPU or a group of CPUs. If the last CPU goes
offline, then the queue is not longer used, so the interrupt can be
shutdown gracefully and parked until one of the assigned CPUs comes online
again.
Add a graceful shutdown mechanism into the irq affinity breaking code path,
mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.
In the online path, scan the active interrupts for managed interrupts and
if the interrupt is functional and the newly online CPU is part of the
affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN or if
the interrupts is started up, try to add the CPU back to the effective
affinity mask.
Originally-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently and set these interrupts into a managed shutdown
state when the last CPU of the assigned affinity mask goes offline. The
interrupt will be restarted when one of the CPUs in the assigned affinity
mask comes back online.
Add the necessary logic to irq_startup(). If an interrupt is requested and
started up, the code checks whether it is affinity managed and if so, it
checks whether a CPU in the interrupts affinity mask is online. If not, it
puts the interrupt into managed shutdown state.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.189851170@linutronix.de
In order to handle managed interrupts gracefully on irq_startup() so they
won't lose their assigned affinity, it's necessary to allow startups which
keep the interrupts in managed shutdown state, if none of the assigend CPUs
is online. This allows drivers to request interrupts w/o the CPUs being
online, which avoid online/offline churn in drivers.
Add a force argument which can override that decision and let only
request_irq() and enable_irq() allow the managed shutdown
handling. enable_irq() is required, because the interrupt might be
requested with IRQF_NOAUTOEN and enable_irq() invokes irq_startup() which
would then wreckage the assignment again. All other callers force startup
and potentially break the assigned affinity.
No functional change as this only adds the function argument.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.112094565@linutronix.de
Split out the inner workings of irq_startup() so it can be reused to handle
managed interrupts gracefully.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.033235144@linutronix.de
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently. This will set these interrupts into a managed
shutdown state when the last CPU of the assigned affinity mask goes
offline. The interrupt will be restarted when one of the CPUs in the
assigned affinity mask comes back online.
Introduce the necessary state flag and the accessor functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.954523476@linutronix.de
If the architecture supports the effective affinity mask, migrating
interrupts away which are not targeted by the effective mask is
pointless.
They can stay in the user or system supplied affinity mask, but won't be
targetted at any given point as the affinity setter functions need to
validate against the online cpu mask anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.328488490@linutronix.de
There is currently no way to evaluate the effective affinity mask of a
given interrupt. Many irq chips allow only a single target CPU or a subset
of CPUs in the affinity mask.
Updating the mask at the time of setting the affinity to the subset would
be counterproductive because information for cpu hotplug about assigned
interrupt affinities gets lost. On CPU hotplug it's also pointless to force
migrate an interrupt, which is not targeted at the CPU effectively. But
currently the information is not available.
Provide a seperate mask to be updated by the irq_chip->irq_set_affinity()
implementations. Implement the read only proc files so the user can see the
effective mask as well w/o trying to deduce it from /proc/interrupts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de
The proc file setup repeats the same ugly type cast for the irq number over
and over. Do it once and hand in the local void pointer.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.160866358@linutronix.de
All callers hand in GPF_KERNEL. No point to have an extra argument for
that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.082544752@linutronix.de
The third argument of the internal helper function is unused. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.004958600@linutronix.de
Now that x86 uses the generic code, the function declaration and inline
stub can move to the core internal header.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.928156166@linutronix.de
Set the force migration flag when migrating interrupts away from an
outgoing CPU.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.681874648@linutronix.de
Interrupts which cannot be migrated in process context, need to be masked
before the affinity is changed forcefully.
Add support for that. Will be compiled out for architectures which do not
have this x86 specific issue.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.604565591@linutronix.de
In order to move x86 to the generic hotplug migration code, add support for
cleaning up move in progress bits.
On architectures which have this x86 specific (mis)feature not enabled,
this is optimized out by the compiler.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.525817311@linutronix.de
Interrupts, which are shut down are tried to be migrated as well. That's
pointless because the interrupt cannot fire and the next startup will move
it to the proper place anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.447550992@linutronix.de
Move the checks for a valid irq chip and the irq_set_affinity() callback
right in front of the whole migration logic. No point in doing a gazillion
of other things when the interrupt cannot be migrated at all.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.354181630@linutronix.de
In case the affinity of an interrupt was broken, a printk is emitted.
But if the affinity cannot be set at all due to a missing
irq_set_affinity() callback or due to a failing callback, the message is
still printed preceeded by a warning/error.
That makes no sense whatsoever.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.274852976@linutronix.de
This is called from stop_machine() with interrupts disabled. No point in
disabling them some more.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.198042748@linutronix.de
So that the affinity code can reuse them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619235445.109426284@linutronix.de
The startup vs. setaffinity ordering of interrupts depends on the
IRQF_NOAUTOEN flag. Chained interrupts are not getting any affinity
assignment at all.
A regular interrupt is started up and then the affinity is set. A
IRQF_NOAUTOEN marked interrupt is not started up, but the affinity is set
nevertheless.
Move the affinity setup to startup_irq() so the ordering is always the same
and chained interrupts get the proper default affinity assigned as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.020534783@linutronix.de
Rename it with a proper irq_ prefix and make it available for other files
in the core code. Preparatory patch for moving the irq affinity setup
around.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.928501004@linutronix.de
No point to have this alloc/free dance of cpumasks. Provide a static mask
for setup_affinity() and protect it proper.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.851571573@linutronix.de
If an CPU goes offline, the interrupts are migrated away, but a eventually
pending interrupt move, which has not yet been made effective is kept
pending even if the outgoing CPU is the sole target of the pending affinity
mask. What's worse is, that the pending affinity mask is discarded even if
it would contain a valid subset of the online CPUs.
Implement a helper function which allows to avoid these issues.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.691345468@linutronix.de
Debugging (hierarchical) interupt domains is tedious as there is no
information about the hierarchy and no information about states of
interrupts in the various domain levels.
Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'.
The domains directory contains the domain files. The content is information
about the domain. If the domain is part of a hierarchy then the parent
domains are printed as well.
# ls /sys/kernel/debug/irq/domains/
default INTEL-IR-2 INTEL-IR-MSI-2 IO-APIC-IR-2 PCI-MSI
DMAR-MSI INTEL-IR-3 INTEL-IR-MSI-3 IO-APIC-IR-3 unknown-1
INTEL-IR-0 INTEL-IR-MSI-0 IO-APIC-IR-0 IO-APIC-IR-4 VECTOR
INTEL-IR-1 INTEL-IR-MSI-1 IO-APIC-IR-1 PCI-HT
# cat /sys/kernel/debug/irq/domains/VECTOR
name: VECTOR
size: 0
mapped: 216
flags: 0x00000041
# cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0
name: IO-APIC-IR-0
size: 24
mapped: 19
flags: 0x00000041
parent: INTEL-IR-3
name: INTEL-IR-3
size: 65536
mapped: 167
flags: 0x00000041
parent: VECTOR
name: VECTOR
size: 0
mapped: 216
flags: 0x00000041
Unfortunately there is no per cpu information about the VECTOR domain (yet).
The irqs directory contains detailed information about mapped interrupts.
# cat /sys/kernel/debug/irq/irqs/3
handler: handle_edge_irq
status: 0x00004000
istate: 0x00000000
ddepth: 1
wdepth: 0
dstate: 0x01018000
IRQD_IRQ_DISABLED
IRQD_SINGLE_TARGET
IRQD_MOVE_PCNTXT
node: 0
affinity: 0-143
effectiv: 0
pending:
domain: IO-APIC-IR-0
hwirq: 0x3
chip: IR-IO-APIC
flags: 0x10
IRQCHIP_SKIP_SET_WAKE
parent:
domain: INTEL-IR-3
hwirq: 0x20000
chip: INTEL-IR
flags: 0x0
parent:
domain: VECTOR
hwirq: 0x3
chip: APIC
flags: 0x0
This was developed to simplify the debugging of the managed affinity
changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.537566163@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a map counter instead of counting radix tree entries for
diagnosis. That also gives correct information for linear domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.459397746@linutronix.de
In order to provide proper debug interface it's required to have domain
names available when the domain is added. Non fwnode based architectures
like x86 have no way to do so.
It's not possible to use domain ops or host data for this as domain ops
might be the same for several instances, but the names have to be unique.
Extend the irqchip fwnode to allow transporting the domain name. If no node
is supplied, create a 'unknown-N' placeholder.
Warn if an invalid node is supplied and treat it like no node. This happens
e.g. with i2 devices on x86 which hand in an ACPI type node which has no
interface for retrieving the name.
[ Folded a fix from Marc to make DT name parsing work ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.588784933@linutronix.de
Prevent overwriting an already assigned domain name. Remove the extra check
for chip->name, because if domain->name is NULL overwriting it with NULL is
not a problem.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.510684976@linutronix.de
In case __irq_set_trigger() fails the resources requested via
irq_request_resources() are not released.
Add the missing release call into the error handling path.
Fixes: c1bacbae81 ("genirq: Provide irq_request/release_resources chip callbacks")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/655538f5-cb20-a892-ff15-fbd2dd1fa4ec@gmail.com
Shared interrupts do not go well with disabling auto enable:
1) The sharing interrupt might request it while it's still disabled and
then wait for interrupts forever.
2) The interrupt might have been requested by the driver sharing the line
before IRQ_NOAUTOEN has been set. So the driver which expects that
disabled state after calling request_irq() will not get what it wants.
Even worse, when it calls enable_irq() later, it will trigger the
unbalanced enable_irq() warning.
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dianders@chromium.org
Cc: jeffy <jeffy.chen@rock-chips.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: tfiga@chromium.org
Link: http://lkml.kernel.org/r/20170531100212.210682135@linutronix.de
If an interrupt is marked NOAUTOEN then request_irq() installs the action,
but does not enable the interrupt via startup_irq(). The interrupt is
enabled via enable_irq() later from the driver. enable_irq() calls
irq_enable().
That means that for interrupts which have a irq_startup() callback this
callback is never invoked. Neither is irq_domain_activate_irq() invoked for
such interrupts.
If an interrupt depends on irq_startup() or irq_domain_activate_irq() then
the enable via irq_enable() is not enough.
Add a status flag IRQD_IRQ_STARTED_UP and use this to select the proper
mechanism in enable_irq(). Use the flag also to avoid pointless calls into
the low level functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: dianders@chromium.org
Cc: jeffy <jeffy.chen@rock-chips.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: tfiga@chromium.org
Link: http://lkml.kernel.org/r/20170531100212.130986205@linutronix.de
The printk in early_irq_init() is cryptic and badly formatted:
NR_IRQS:33024 nr_irqs:968 16
The last number is the number of preallocated interrupts, so add a prefix
to it:
NR_IRQS: 33024, nr_irqs: 968, preallocated irqs: 16
Cleanup the formatting for better readability as well.
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1494318849-6733-1-git-send-email-vincent.legoll@gmail.com
In order to ease debug, let's populate the domain name upfront, before any
MSI gets requested. This allows the domain to appear in the
irq_domain_mapping, and the user to easily find the expected data.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-4-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If the system is using ACPI, there is no of_node to display. But ACPI can
use a struct irqchip_fwid as a domain identifier, and it can be used to
display the name contained in that structure.
The output on such a system will look like this:
pMSI 0 0 0 irqchip@00000000e1180000
MSI 37 0 0 irqchip@00000000e1180000
GICv2m 37 0 0 irqchip@00000000e1180000
GICv2 448 448 0 irqchip@ffff000008003000
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-3-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Hierarchical domains seem to be hard to grasp, and a number of
aspiring kernel hackers find them utterly discombobulating.
In order to ease their pain, let's make them appear in
/sys/kernel/debug/irq_domain_mapping, such as the following:
96 0x81808 MSI 0x (null) RADIX MSI
96+ 0x00063 GICv2m 0xffff8003ee116980 RADIX GICv2m
96+ 0x00063 GICv2 0xffff00000916bfd8 LINEAR GICv2
[output compressed to fit in a commit log]
This shows that IRQ96 is implemented by a stack of three domains,
the + sign indicating the stacking.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-2-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
min_vecs is the minimum amount of vectors needed to operate in MSI-X mode
which may just include the vectors that don't need affinity.
Disabling affinity settings causes the qla2xxx driver scsi_add_host() to fail
when blk_mq is enabled as the blk_mq_pci_map_queues() expects affinity masks
on each vector.
Fixes: dfef358bd1 ("PCI/MSI: Don't apply affinity if there aren't enough vectors left")
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # v4.10+
Pull irq fixes from Thomas Gleixner:
"A set of small fixes for the irq subsystem:
- Cure a data ordering problem with chained interrupts
- Three small fixlets for the mbigen irq chip"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Fix chained interrupt data ordering
irqchip/mbigen: Fix the clear register offset calculation
irqchip/mbigen: Fix potential NULL dereferencing
irqchip/mbigen: Fix memory mapping code
irq_set_chained_handler_and_data() sets up the chained interrupt and then
stores the handler data.
That's racy against an immediate interrupt which gets handled before the
store of the handler data happened. The handler will dereference a NULL
pointer and crash.
Cure it by storing handler data before installing the chained handler.
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
This book got converted from DocBook. Update its references to
point to the current location.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZEHmsAAoJEFmIoMA60/r88SgQAJbFddueb0+DfJ+USDud4b/Z
akfS+G1UAm+TgtMyh1wM49dHzFssp36uWJxtWI+bPqBzuy94PMCbz7JVUV28gX9G
tFhFuc5YH94I/3y85rbZnolb6uZN9MhLjzTFqDC9ilW6HFqmwK4t4wlHSCjQN1St
svLYvs2G6n6/VK3Fre7/wOvdZ1erG4Qod+kn5Tx3K5TQydmRlaSBfK+DRANuDBkM
KzGO7Bkc/Cx8hb9pHmaey/wxmNrrgmVjTtWrEnb2tEq833zP4h6GhUIJEKodMSi5
gXPNZgKlu3n5L592M0UCh4EoHejzkv9wrcsoDm+djmsc5Zg2Howq4kAdHP8k4hUG
0gt8n0ni9vhJN56jikrGi7cAdHCKSNnx2Ue/qTCbX0ncB3XUMuJxJwCsgW/6wa9f
oU7tRtTS03UltnKoFAcyYclS4TaSY4SA4ySaK6Hi+cRkdVFDdyHQYbHHNSU7MsA+
IS2tXvGoIdSYyrZMHSRcl2rRTfYQUkmPEvBF3LvqZr32M4mJMmUNAPLZaly373ZE
iwq0ZJlrLeM0cqdFIG3S60RtJyQk/HBN1NMqrYHArWOxvWIgNd5F8NCsTTxY3wU3
IxgBIuUFcbVwVkqEHGs8K5AvB3oghqdnA3eGOV79799eMtLn3LOvyIlpHMSw9WUq
ags00JtMLitfNPBH3eSl
=eE4D
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add framework for supporting PCIe devices in Endpoint mode (Kishon
Vijay Abraham I)
- use non-postable PCI config space mappings when possible (Lorenzo
Pieralisi)
- clean up and unify mmap of PCI BARs (David Woodhouse)
- export and unify Function Level Reset support (Christoph Hellwig)
- avoid FLR for Intel 82579 NICs (Sasha Neftin)
- add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)
- short-circuit config access failures for disconnected devices (Keith
Busch)
- remove D3 sleep delay when possible (Adrian Hunter)
- freeze PME scan before suspending devices (Lukas Wunner)
- stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)
- disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)
- add arch-specific alignment control to improve device passthrough by
avoiding multiple BARs in a page (Yongji Xie)
- add sysfs sriov_drivers_autoprobe to control VF driver binding
(Bodong Wang)
- allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)
- fix crashes when unbinding host controllers that don't support
removal (Brian Norris)
- add driver for MicroSemi Switchtec management interface (Logan
Gunthorpe)
- add driver for Faraday Technology FTPCI100 host bridge (Linus
Walleij)
- add i.MX7D support (Andrey Smirnov)
- use generic MSI support for Aardvark (Thomas Petazzoni)
- make Rockchip driver modular (Brian Norris)
- advertise 128-byte Read Completion Boundary support for Rockchip
(Shawn Lin)
- advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)
- convert atomic_t to refcount_t in HV driver (Elena Reshetova)
- add CPU IRQ affinity in HV driver (K. Y. Srinivasan)
- fix PCI bus removal in HV driver (Long Li)
- add support for ThunderX2 DMA alias topology (Jayachandran C)
- add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)
- add ITE 8893 bridge DMA alias quirk (Jarod Wilson)
- restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
(Manish Jaggi)
* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
PCI: Don't allow unbinding host controllers that aren't prepared
ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
MAINTAINERS: Add PCI Endpoint maintainer
Documentation: PCI: Add userguide for PCI endpoint test function
tools: PCI: Add sample test script to invoke pcitest
tools: PCI: Add a userspace tool to test PCI endpoint
Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
misc: Add host side PCI driver for PCI test function device
PCI: Add device IDs for DRA74x and DRA72x
dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
PCI: dwc: dra7xx: Workaround for errata id i870
dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
PCI: dwc: dra7xx: Add EP mode support
PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
dt-bindings: PCI: Add DT bindings for PCI designware EP mode
PCI: dwc: designware: Add EP mode support
Documentation: PCI: Add binding documentation for pci-test endpoint function
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: imx6: Fix spelling mistake: "contol" -> "control"
...
Pull irq updates from Thomas Gleixner:
"Nothing exciting from the irq side for this merge window:
- a new driver for a Mediatek SoC
- ACPI support for ARM GICV3
- support for shared nested interrupts
- the usual pile of fixes and updates all over te place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
irqchip/mbigen: Fix return value check in mbigen_device_probe()
irqchip/mips-gic: Replace static map with dynamic
irqchip/mips-gic: Remove device IRQ domain
irqchip/mips-gic: Separate IPI reservation & usage tracking
genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
genirq: Use cpumask_available() for check of cpumask variable
cpumask: Add helper cpumask_available()
irqchip/irq-imx-gpcv2: Clear OF_POPULATED flag
irqchip/atmel-aic5: Handle suspend to RAM
irqchip: Add Mediatek mtk-cirq driver
dt-bindings: mtk-cirq: Add binding document
irqchip/gic-v3-its: Add IORT hook for platform MSI support
irqchip/mbigen: Add ACPI support
irqchip/mbigen: Introduce mbigen_of_create_domain()
irqchip/mbigen: Drop module owner
platform-msi: Make platform_msi_create_device_domain() ACPI aware
irqchip/gicv3-its: platform-msi: Scan MADT to create platform msi domain
irqchip/gicv3-its: platform-msi: Refactor its_pmsi_init() to prepare for ACPI
irqchip/gicv3-its: platform-msi: Refactor its_pmsi_prepare()
irqchip/gic-v3-its: Keep the include header files in alphabetic order
...
The vectors_per_node is calculated from the remaining available vectors.
The current vector starts after pre_vectors, so we need to subtract that
from the current to properly account for the number of remaining vectors
to assign.
Fixes: 3412386b53 ("irq/affinity: Fix extra vecs calculation")
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Link: http://lkml.kernel.org/r/1492645870-13019-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This allows callers to get back at them instead of having to store it in
another variable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction
flags get filled with the trigger type from the irq_data:
if (!(new->flags & IRQF_TRIGGER_MASK))
new->flags |= irqd_get_trigger_type(&desc->irq_data);
On the first setup_irq() the trigger type in irq_data is NONE when the
above code executes, then the irq is started up for the first time and
then the actual trigger type gets established, but that's too late to fix
up new->flags.
When then a second user of the irq requests the irq with IRQF_TRIGGER_NONE
its irqaction's triggertype gets set to the actual trigger type and the
following check fails:
if (!((old->flags ^ new->flags) & IRQF_TRIGGER_MASK))
Resulting in the request_irq failing with -EBUSY even though both
users requested the irq with IRQF_SHARED | IRQF_TRIGGER_NONE
Fix this by comparing the new irqaction's trigger type to the trigger type
stored in the irq_data which correctly reflects the actual trigger type
being used for the irq.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170415100831.17073-1-hdegoede@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This fixes the following clang warning when CONFIG_CPUMASK_OFFSTACK=n:
kernel/irq/manage.c:839:28: error: address of array
'desc->irq_common_data.affinity' will always evaluate to 'true'
[-Werror,-Wpointer-bool-conversion]
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Grant Grundler <grundler@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: Michael Davidson <md@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170412182030.83657-2-mka@chromium.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This fixes a math error calculating the extra_vecs. The error assumed
only 1 cpu per vector, but the value needs to account for the actual
number of cpus per vector in order to get the correct remainder for
extra CPU assignment.
Fixes: 7bf8222b9b ("irq/affinity: Fix CPU spread for unbalanced nodes")
Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Link: http://lkml.kernel.org/r/1492104492-19943-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq_create_affinity_masks routine is responsible for assigning a
number of interrupt vectors to CPUs. The optimal assignemnet will spread
requested vectors to all CPUs, with the fewest CPUs sharing a vector.
The algorithm may fail to assign some vectors to any CPUs if a node's
CPU count is lower than the average number of vectors per node. These
vectors are unusable and create an un-optimal spread.
Recalculate the number of vectors to assign at each node iteration by using
the remaining number of vectors and nodes to be assigned, not exceeding the
number of CPUs in that node. This will guarantee that every CPU is assigned
at least one vector.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Link: http://lkml.kernel.org/r/1491247553-7603-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On a specific audio system an interrupt input of an audio CODEC is used as a
shared interrupt. That interrupt input is handled by a CODEC specific irq
chip driver and triggers a CPU interrupt via the CODEC irq output line.
The CODEC interrupt handler demultiplexes the CODEC interrupt inputs and
the interrupt handlers for these demultiplexed inputs run nested in the
context of the CODEC interrupt handler.
The demultiplexed interrupts use handle_nested_irq() as their interrupt
handler, which unfortunately has no support for shared interrupts. So the
above hardware cannot be supported.
Add shared interrupt support to handle_nested_irq() by iterating over the
interrupt action chain.
[ tglx: Massaged changelog ]
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Cc: patches@opensource.wolfsonmicro.com
Link: http://lkml.kernel.org/r/1488904098-5350-1-git-send-email-ckeepax@opensource.wolfsonmicro.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
But first update usage sites with the new header dependency.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to move scheduler ABI details to <uapi/linux/sched/types.h>,
which will be used from a number of .c files.
Create empty placeholder header that maps to <linux/types.h>.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix typos and add the following to the scripts/spelling.txt:
an user||a user
an userspace||a userspace
I also added "userspace" to the list since it is a common word in Linux.
I found some instances for "an userfaultfd", but I did not add it to the
list. I felt it is endless to find words that start with "user" such as
"userland" etc., so must draw a line somewhere.
Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The changes include:
* KVM PCIe/MSI passthrough support on ARM/ARM64
* Introduction of a core representation for individual hardware
iommus
* Support for IOMMU privileged mappings as supported by some
ARM IOMMUS
* 16-bit SID support for ARM-SMMUv2
* Stream table optimization for ARM-SMMUv3
* Various fixes and other small improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYqw3hAAoJECvwRC2XARrjPy0P/35ykfHIAESJuF+72ziaoAYA
ZvMrli8rGq7n+ntaIGPx9rV+hZTUSF8V2bfsYV7SAn5iYuViXZqvOtC3BAEp6GNC
cdMeQfqXoHiWVMdXdOihzk+6YCQvBxqPOvUtYFqVhOo3Yrz8Dc71KsKvrTndEUVY
f7bXHKssVONkWMga9lIVDgEefG5VyJPEQaxJXB9ymLHXbwWOcISe1lgtkrzFSxSH
H9YNI07Tfcxfn6rN8jGmcYFYM58xwBicpB4HBw5uytMBYAsxqTEsx4X5dGpOF6RH
cFW9nby+9ZlcTMyuWXKAck3o8df2ZC1xiSjnz+DHQdBPFiFNqIL3PVUcaz9PnF2e
e6Y+DA3s+jykeiCvi2K0Z9RwTg7t8S5spel+UCeNVSnIjE9pqZNLF8vsDjF17zuR
+zcFm7RVI397QVQGp0dbqhtxnwqt/3CX/wlzpvuNdEZa4vwujpcnM9tfl6gyFrF8
awK9Fj5ryAn4DEiM+8yiRHwLrU5ij1cfc8jQdqleEB2ca7Wv3g1uhhS0QTXOFY9u
A7ygOna25U1EcOwjC6ebjiEL115ZEOrXo+eChhzCHoUEHCVxL+L/NAMEsUcMqPIw
3XsHhru0HbXgd5O5wHX39s2je8G3+ElqQwy8Ja3DimV6tvon7yaKCXy9QU+2aa1u
3r53R/0mW1ijtOfK+I0b
=5b3I
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU UPDATES from Joerg Roedel:
- KVM PCIe/MSI passthrough support on ARM/ARM64
- introduction of a core representation for individual hardware iommus
- support for IOMMU privileged mappings as supported by some ARM IOMMUS
- 16-bit SID support for ARM-SMMUv2
- stream table optimization for ARM-SMMUv3
- various fixes and other small improvements
* tag 'iommu-updates-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (61 commits)
vfio/type1: Fix error return code in vfio_iommu_type1_attach_group()
iommu: Remove iommu_register_instance interface
iommu/exynos: Make use of iommu_device_register interface
iommu/mediatek: Make use of iommu_device_register interface
iommu/msm: Make use of iommu_device_register interface
iommu/arm-smmu: Make use of the iommu_register interface
iommu: Add iommu_device_set_fwnode() interface
iommu: Make iommu_device_link/unlink take a struct iommu_device
iommu: Add sysfs bindings for struct iommu_device
iommu: Introduce new 'struct iommu_device'
iommu: Rename struct iommu_device
iommu: Rename iommu_get_instance()
iommu: Fix static checker warning in iommu_insert_device_resv_regions
iommu: Avoid unnecessary assignment of dev->iommu_fwspec
iommu/mediatek: Remove bogus 'select' statements
iommu/dma: Remove bogus dma_supported() implementation
iommu/ipmmu-vmsa: Restrict IOMMU Domain Geometry to 32-bit address space
iommu/vt-d: Don't over-free page table directories
iommu/vt-d: Tylersburg isoch identity map check is done too late.
iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommu
...
Pull irq updates from Thomas Gleixner:
"This update provides:
- Yet another two irq controller chip drivers
- A few updates and fixes for GICV3
- A resource managed function for interrupt allocation
- Fixes, updates and enhancements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/qcom: Fix error handling
genirq: Clarify logic calculating bogus irqreturn_t values
genirq/msi: Add stubs for get_cached_msi_msg/pci_write_msi_msg
genirq/devres: Use dev_name(dev) as default for devname
genirq: Fix /proc/interrupts output alignment
irqdesc: Add a resource managed version of irq_alloc_descs()
irqchip/gic-v3-its: Zero command on allocation
irqchip/gic-v3-its: Fix command buffer allocation
irqchip/mips-gic: Fix local interrupts
irqchip: Add a driver for Cortina Gemini
irqchip: DT bindings for Cortina Gemini irqchip
irqchip/gic-v3: Remove duplicate definition of GICD_TYPER_LPIS
irqchip/gic-v3-its: Rename MAPVI to MAPTI
irqchip/gic-v3-its: Drop deprecated GITS_BASER_TYPE_CPU
irqchip/gic-v3-its: Refactor command encoding
irqchip/gic-v3-its: Enable cacheable attribute Read-allocate hints
irqchip/qcom: Add IRQ combiner driver
ACPI: Add support for ResourceSource/IRQ domain mapping
ACPI: Generic GSI: Do not attempt to map non-GSI IRQs during bus scan
irq/platform-msi: Fix comment about maximal MSIs
Although irqreturn_t is an enum, we treat it (and its enumeration
constants) as a bitmask.
However, bad_action_ret() uses a less-than operator to determine whether
an irqreturn_t falls within allowable bit values, which means we need to
know the signededness of an enum type to read the logic, which is
implementation-dependent.
This change explicitly uses an unsigned type for the comparison. We do
this instead of changing to a bitwise test, as the latter compiles to
increased instructions in this hot path.
It looks like we get the correct behaviour currently (bad_action_ret(-1)
returns 1), so this is purely a readability fix.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Link: http://lkml.kernel.org/r/1487219049-4061-1-git-send-email-jk@ozlabs.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Allow the devname parameter to be NULL and use dev_name(dev) in this case.
This should be an appropriate default for most use cases.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: http://lkml.kernel.org/r/05c63d67-30b4-7026-02d5-ce7fb7bc185f@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If the irq_desc being output does not have a domain associated the
information following the 'name' is not aligned correctly.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Link: http://lkml.kernel.org/r/20170210165416.5629-1-hsweeten@visionengravers.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since commit f3b0946d62 ("genirq/msi: Make sure PCI MSIs are
activated early"), we can end-up activating a PCI/MSI twice (once
at allocation time, and once at startup time).
This is normally of no consequences, except that there is some
HW out there that may misbehave if activate is used more than once
(the GICv3 ITS, for example, uses the activate callback
to issue the MAPVI command, and the architecture spec says that
"If there is an existing mapping for the EventID-DeviceID
combination, behavior is UNPREDICTABLE").
While this could be worked around in each individual driver, it may
make more sense to tackle the issue at the core level. In order to
avoid getting in that situation, let's have a per-interrupt flag
to remember if we have already activated that interrupt or not.
Fixes: f3b0946d62 ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-and-tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1484668848-24361-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This new function checks whether all MSI irq domains
implement IRQ remapping. This is useful to understand
whether VFIO passthrough is safe with respect to interrupts.
On ARM typically an MSI controller can sit downstream
to the IOMMU without preventing VFIO passthrough.
As such any assigned device can write into the MSI doorbell.
In case the MSI controller implements IRQ remapping, assigned
devices will not be able to trigger interrupts towards the
host. On the contrary, the assignment must be emphasized as
unsafe with respect to interrupts.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now we have a flag value indicating an IRQ domain implements MSI,
let's set it on msi_create_irq_domain().
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We introduce two new enum values for the irq domain flag:
- IRQ_DOMAIN_FLAG_MSI indicates the irq domain corresponds to
an MSI domain
- IRQ_DOMAIN_FLAG_MSI_REMAP indicates the irq domain has MSI
remapping capabilities.
Those values will be useful to check all MSI irq domains have
MSI remapping support when assessing the safety of IRQ assignment
to a guest.
irq_domain_hierarchical_is_msi_remap() allows to check if an
irq domain or any parent implements MSI remapping.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com>
Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 34c3d9819f ("genirq/affinity: Provide smarter irq spreading
infrastructure") introduced a better IRQ spreading mechanism, taking
account of the available NUMA nodes in the machine.
Problem is that the algorithm of retrieving the nodemask iterates
"linearly" based on the number of online nodes - some architectures
present non-linear node distribution among the nodemask, like PowerPC.
If this is the case, the algorithm lead to a wrong node count number
and therefore to a bad/incomplete IRQ affinity distribution.
For example, this problem were found in a machine with 128 CPUs and two
nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly
distributed). This led to a wrong affinity distribution which then led to
a bad mq allocation for nvme driver.
Finally, we take the opportunity to fix a comment regarding the affinity
distribution when we have _more_ nodes than vectors.
Fixes: 34c3d9819f ("genirq/affinity: Provide smarter irq spreading infrastructure")
Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: hch@lst.de
Link: http://lkml.kernel.org/r/1481738472-2671-1-git-send-email-gpiccoli@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull irq updates from Thomas Gleixner:
"The irq department provides:
- a major update to the auto affinity management code, which is used
by multi-queue devices
- move of the microblaze irq chip driver into the common driver code
so it can be shared between microblaze, powerpc and MIPS
- a series of updates to the ARM GICV3 interrupt controller
- the usual pile of fixes and small improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
powerpc/virtex: Use generic xilinx irqchip driver
irqchip/xilinx: Try to fall back if xlnx,kind-of-intr not provided
irqchip/xilinx: Add support for parent intc
irqchip/xilinx: Rename get_irq to xintc_get_irq
irqchip/xilinx: Restructure and use jump label api
irqchip/xilinx: Clean up print messages
microblaze/irqchip: Move intc driver to irqchip
ARM: virt: Select ARM_GIC_V3_ITS
ARM: gic-v3-its: Add 32bit support to GICv3 ITS
irqchip/gic-v3-its: Specialise readq and writeq accesses
irqchip/gic-v3-its: Specialise flush_dcache operation
irqchip/gic-v3-its: Narrow down Entry Size when used as a divider
irqchip/gic-v3-its: Change unsigned types for AArch32 compatibility
irqchip/gic-v3: Use nops macro for Cavium ThunderX erratum 23154
irqchip/gic-v3: Convert arm64 GIC accessors to {read,write}_sysreg_s
genirq/msi: Drop artificial PCI dependency
irqchip/bcm7038-l1: Implement irq_cpu_offline() callback
genirq/affinity: Use default affinity mask for reserved vectors
genirq/affinity: Take reserved vectors into account when spreading irqs
PCI: Remove the irq_affinity mask from struct pci_dev
...
The generic MSI layer doesn't have any PCI ties anymore, and the
build hack should have been removed some time ago.
Fixes: d9109698be ("genirq: Introduce msi_domain_alloc/free_irqs()")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1479806476-20801-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The reserved vectors at the beginning and the end of the vector space get
cpu_possible_mask assigned as their affinity mask.
All other non-auto affine interrupts get the default irq affinity mask
assigned. Using cpu_possible_mask breaks that rule.
Treat them like any other interrupt and use irq_default_affinity as target
mask.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
The recent addition of reserved vectors at the beginning or the end of the
vector space did not take the reserved vectors at the beginning into
account for the various loop exit conditions. As a consequence the last
vectors of the spread area are not included into the spread algorithm and
are treated like the reserved vectors at the end of the vector space and
get the default affinity mask assigned.
Sum up the affinity vectors and the reserved vectors at the beginning and
use the sum as exit condition.
[ tglx: Fixed all conditions instead of only one and massaged changelog ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/1479201178-29604-2-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Only calculate the affinity for the main I/O vectors, and skip the
pre or post vectors specified by struct irq_affinity.
Also remove the irq_affinity cpumask argument that has never been used.
If we ever need it in the future we can pass it through struct
irq_affinity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/1478654107-7384-4-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Only calculate the affinity for the main I/O vectors, and skip the pre or
post vectors specified by struct irq_affinity.
Also remove the irq_affinity cpumask argument that has never been used. If
we ever need it in the future we can pass it through struct irq_affinity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/1478654107-7384-3-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The type flags in the irq descriptor are there for historical reasons and
only updated via irq_modify_status() or irq_set_type(). Both functions also
update the type flags in irqdata. __setup_irq() is the only left over user
of the type flags in the irq descriptor.
If __setup_irq() is called with empty irq type flags, then the type flags
are retrieved from irqdata. If an interrupt is shared, then the type flags
are compared with the type flags stored in the irq descriptor.
On x86 the ioapic does not have a irq_set_type() callback because the type
is defined in the BIOS tables and cannot be changed. The type is stored in
irqdata at setup time without updating the type data in the irq
descriptor. As a result the comparison described above fails.
There is no point in updating the irq descriptor flags because the only
relevant storage is irqdata. Use the type flags from irqdata for both
retrieval and comparison in __setup_irq() instead.
Aside of that the print out in case of non matching type flags has the old
and new type flags arguments flipped. Fix that as well.
For correctness sake the flags stored in the irq descriptor should be
removed, but this is beyond the scope of this bugfix and will be done in a
later patch.
Fixes: 4b357daed6 ("genirq: Look-up trigger type if not specified by caller")
Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1611072020360.3501@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The TPS65217 driver grew interrupt support which uses
irq_set_parent(). While it's not yet clear why this is used in the first
place, building the driver as a module fails with:
ERROR: ".irq_set_parent" [drivers/mfd/tps65217.ko] undefined!
The correctness of the driver change is still investigated, but for now
it's less trouble to export irq_set_parent() than dealing with the build
wreckage.
[ tglx: Rewrote changelog and made the export GPL ]
Fixes: 6556bdacf6 ("mfd: tps65217: Add support for IRQs")
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Marcin Niestroj <m.niestroj@grinn-global.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Lee Jones <lee.jones@linaro.org>
Link: http://lkml.kernel.org/r/1475775403-27207-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fixes the following sparse warning:
kernel/irq/chip.c:786:1: warning:
symbol '__irq_do_set_handler' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: http://lkml.kernel.org/r/1474817799-18676-1-git-send-email-weiyj.lk@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is no point in trying to configure the trigger of a chained
interrupt if no trigger information has been configured. At best
this is ignored, and at the worse this confuses the underlying
irqchip (which is likely not to handle such a thing), and
unnecessarily alarms the user.
Only apply the configuration if type is not IRQ_TYPE_NONE.
Fixes: 1e12c4a939 ("genirq: Correctly configure the trigger on chained interrupts")
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/CAMuHMdVW1eTn20=EtYcJ8hkVwohaSuH_yQXrY2MGBEvZ8fpFOg@mail.gmail.com
Link: http://lkml.kernel.org/r/1474274967-15984-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current irq spreading infrastructure is just looking at a cpumask and
tries to spread the interrupts over the mask. Thats suboptimal as it does
not take numa nodes into account.
Change the logic so the interrupts are spread across numa nodes and inside
the nodes. If there are more cpus than vectors per node, then we set the
affinity to several cpus. If HT siblings are available we take that into
account and try to set all siblings to a single vector.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: axboe@fb.com
Cc: keith.busch@intel.com
Cc: agordeev@redhat.com
Cc: linux-block@vger.kernel.org
Link: http://lkml.kernel.org/r/1473862739-15032-3-git-send-email-hch@lst.de
For irq spreading want to store affinity masks in the msi_entry. Add the
infrastructure for it.
We allocate an array of cpumasks with an array size of the number of used
vectors in the entry, so we can hand in the information per linux interrupt
later.
As we hand in the number of used vectors, we assign them right
away. Convert all the call sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: axboe@fb.com
Cc: keith.busch@intel.com
Cc: agordeev@redhat.com
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/1473862739-15032-2-git-send-email-hch@lst.de
- ACPI IORT core code
- IORT support for the GICv3 ITS
- A few of GIC cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX2VxlAAoJECPQ0LrRPXpD7tMP/i696FakX2xYknlVNaBy5vww
eH1TLVpItkcfUCHbOyNzJcLcVVRnC4JuaiCphtgDQ0Zm4HJsl8LRFoKF9iBF/Wmt
CvL5YLfkl6ziMZyPa23a9ApH5gKTY2QzirDAl+noF+N6tSsmz5JCXArW0YFawZJs
o1tmh/XX0xb9cqB5f/jISxTNF7rPw/Dc1sDY3/p7DUch2TDjuTLOQljnJ2EFb8Mh
QltBs9EbklYCKaSBVIHXmhAOBCaW8Nwm2BNscgNEQAH1EtjBjGK/aqmFqGzxBWms
wXr8GHSNhnVsgpOzalG6yJzhtWcj4KNf1utZaNc0L8dT1bVc0yJEEUkxOEbi4pIO
sst+BWo2FAe/cDyWWxwiSBLaO7M5SCTvsBN25AYMfZw9AU6pw6SMCdbm6BCWSPmh
YUW7khrXObtNTl+0lroi/mlmIE3sq+UVpQWDzfwsvfWb1kgqIif2Fd6TqU+Jodl5
pK/UzMHP3YQnAZjcn6Yz4sEWkAGgK8RrIPje1Th0mxi20pyx8VbAucwfSCUi5hza
9mbaaidnLRvDVX38TNs/LzOejwfW3JKDJUPy9jLg3l07Fgron+jfLtCZOEm9XfJH
nj/MN3g+yil9OeosJ+6NfzZKonJrfccpkVKkZrNT+ReeM5YFQylgX0OHRQT9eUDo
z3P5/VhsoTcyqynxMcRp
=Xhg0
-----END PGP SIGNATURE-----
Merge tag 'irqchip-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Merge the first drop of irqchip updates for 4.9 from Marc Zyngier:
- ACPI IORT core code
- IORT support for the GICv3 ITS
- A few of GIC cleanups
Information about interrupts is exposed via /proc/interrupts, but the
format of that file has changed over kernel versions and differs across
architectures. It also has varying column numbers depending on hardware.
That all makes it hard for tools to parse.
To solve this, expose the information through sysfs so each irq attribute
is in a separate file in a consistent, machine parsable way.
This feature is only available when both CONFIG_SPARSE_IRQ and
CONFIG_SYSFS are enabled.
Examples:
/sys/kernel/irq/18/actions: i801_smbus,ehci_hcd:usb1,uhci_hcd:usb7
/sys/kernel/irq/18/chip_name: IR-IO-APIC
/sys/kernel/irq/18/hwirq: 18
/sys/kernel/irq/18/name: fasteoi
/sys/kernel/irq/18/per_cpu_count: 0,0
/sys/kernel/irq/18/type: level
/sys/kernel/irq/25/actions: ahci0
/sys/kernel/irq/25/chip_name: IR-PCI-MSI
/sys/kernel/irq/25/hwirq: 512000
/sys/kernel/irq/25/name: edge
/sys/kernel/irq/25/per_cpu_count: 29036,0
/sys/kernel/irq/25/type: edge
[ tglx: Moved kobject_del() under sparse_irq_lock, massaged code comments
and changelog ]
Signed-off-by: Craig Gallek <kraig@google.com>
Cc: David Decotigny <decot@google.com>
Link: http://lkml.kernel.org/r/1473783291-122873-1-git-send-email-kraigatgoog@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some callers of __irq_set_trigger() masks all flags except trigger mode
flags. This is unnecessary, ase __irq_set_trigger() already does this
before usage of flags.
[ tglx: Moved the flag mask and adjusted comment. Removed the hunk in
enable_percpu_irq() as it is required there ]
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: http://lkml.kernel.org/r/20160719095408.13778-1-kuleshovmail@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 1bf4ddc46c ("irqdomain: Introduce irq_domain_create_{linear,
tree}") introduced the use of fwnode_handle to identify the interrupt
controller when calling __irq_domain_add but missed updating the kernel
doc parameters for the function.
Update this comment. While we are touching this code, also consolidate
the declaration and assignment of of_node.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zygnier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1464699409-23113-1-git-send-email-punit.agrawal@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Most (if not all) code here implicitly assumes that the maximum number of
IRQs per chip will be 32, and thus uses 'u32' or 'unsigned long' for many
tasks (for example "struct irq_data" declares its 'mask' field as 'u32',
and "struct irq_chip_generic" declares its 'installed' field as 'unsigned
long')
However, there is no check to verify that irqs_per_chip is <= 32. Hence,
calling irq_alloc_domain_generic_chips() with a bigger value will result in
unexpected results.
Provide a wrapper with a MAYBE_BUILD_BUG_ON(nrirqs >= 32) to catch such
cases.
[ tglx: Reduced changelog to the essential information ]
Signed-off-by: Sebastian Frias <sf84@laposte.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mason <slash.tmp@free.fr>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/57B31D94.5040701@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
According to the xlate() callback definition, the 'out_type' parameter
needs to be the "linux irq type".
A mask for such bits exists, IRQ_TYPE_SENSE_MASK, which is correctly
applied in irq_domain_xlate_twocell()
So use it for irq_domain_xlate_onetwocell() as well.
Signed-off-by: Sebastian Frias <sf84@laposte.net>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mason <slash.tmp@free.fr>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/57A05F5D.103@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Without this patch irq_domain_disassociate() cannot properly release the
interrupt. In fact, irq_map_generic_chip() checks a bit on 'gc->installed'
but said bit is never cleared, only set.
Commit 088f40b7b0 ("genirq: Generic chip: Add linear irq domain support")
added irq_map_generic_chip() function and also stated "This lacks a removal
function for now".
This commit provides an implementation of an unmap function that can be
called by irq_domain_disassociate().
[ tglx: Made the function static and removed the export as we have neither
a prototype nor a modular user. ]
Fixes: 088f40b7b0 ("genirq: Generic chip: Add linear irq domain support")
Signed-off-by: Sebastian Frias <sf84@laposte.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mason <slash.tmp@free.fr>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/579F5C5A.2070507@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_map_generic_chip() contains about the same code as
irq_get_domain_generic_chip() except for the return values.
Split out the irq_get_domain_generic_chip() implementation so it can be
reused.
[ tglx: Removed the extra churn in irq_get_domain_generic_chip() callers
and massaged changelog ]
Signed-off-by: Sebastian Frias <sf84@laposte.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mason <slash.tmp@free.fr>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/579F5C69.8070006@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The percpu_devid handler is not robust against spurious interrupts. If a
spurious interrupt happens and no action is installed then the handler
crashes with a NULL pointer dereference.
Add a sanity check for this and log the wreckage once in dmesg.
Reported-by: Majun <majun258@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: guohanjun@huawei.com
Cc: dingtianhong@huawei.com
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1609021436160.5647@nanos
Without locking out CPU mask operations we might end up with an inconsistent
view of the cpumask in the function.
Fixes: 5e385a6ef3: "genirq: Add a helper to spread an affinity mask for MSI/MSI-X vectors"
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/1470924405-25728-1-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Obviously we should free action here if irq_chip_pm_get failed.
Fixes: be45beb2df: "genirq: Add runtime power management support for IRQ chips"
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1471854112-13006-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
moved the trigger configuration call from the irqdomain mapping to
the interrupt being actually requested.
This patch failed to handle the case where we configure a chained
interrupt, which doesn't get requested through the usual path.
In order to solve this, let's call __irq_set_trigger just before
starting the cascade interrupt. Special care must be taken to
make the flow handler stick, as the .irq_set_type method could
have reset it (it doesn't know we're dealing with a chained
interrupt).
Based on an initial patch by Jon Hunter.
Fixes: 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Bharat Kumar Gogada reported issues with the generic MSI code, where the
end-point ended up with garbage in its MSI configuration (both for the vector
and the message).
It turns out that the two MSI paths in the kernel are doing slightly different
things:
generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI
And it turns out that end-points are allowed to latch the content of the MSI
configuration registers as soon as MSIs are enabled. In Bharat's case, the
end-point ends up using whatever was there already, which is not what you
want.
In order to make things converge, we introduce a new MSI domain flag
(MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for PCI/MSI. When set,
this flag forces the programming of the end-point as soon as the MSIs are
allocated.
A consequence of this is that we have an extra activate in irq_startup, but
that should be without much consequence.
tglx:
- Several people reported a VMWare regression with PCI/MSI-X passthrough. It
turns out that the patch also cures that issue.
- We need to have a look at the MSI disable interrupt path, where we write
the msg to all zeros without disabling MSI in the PCI device. Is that
correct?
Fixes: 52f518a3a7 "x86/MSI: Use hierarchical irqdomains to manage MSI interrupts"
Reported-and-tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Reported-and-tested-by: Foster Snowhill <forst@forstwoof.ru>
Reported-by: Matthias Prager <linux@matthiasprager.de>
Reported-by: Jason Taylor <jason.taylor@simplivity.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1468426713-31431-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The new affinity hint argument of __irq_domain_alloc_irqs() is missing in
irq_reserve_ipi(). Add it.
This fixes the following compilation error:
kernel/irq/ipi.c: In function ‘irq_reserve_ipi’:
kernel/irq/ipi.c:85:9: error: too few arguments to function ‘__irq_domain_alloc_irqs’
virq = __irq_domain_alloc_irqs(domain, virq, nr_irqs, NUMA_NO_NODE,
^
Fixes: 06ee6d571f ("genirq: Add affinity hint to irq allocation")
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Cc: linux-pci@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If an irq_domain is auto-recursive and irq_domain_alloc_irqs_recursive()
for its parent has returned an error, then do return and avoid calling
irq_domain_free_irqs_recursive() uselessly, because:
- if domain->ops->alloc() had failed for an auto-recursive irq_domain,
then irq_domain_free_irqs_recursive() had already been called;
- if domain->ops->alloc() had failed for a not auto-recursive irq_domain,
then there is nothing to free at all.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1467505448-2850-1-git-send-email-alex.popov@linux.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
virq is not required to be the same for all msi descs. Use the base irq number
from the desc in the debug printk.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the affinity hint in the irqdesc allocator. The hint is used to determine
the node for the allocation and to set the affinity of the interrupt.
If multiple interrupts are allocated (multi-MSI) then the allocator iterates
over the cpumask and for each set cpu it allocates on their node and sets the
initial affinity to that cpu.
If a single interrupt is allocated (MSI-X) then the allocator uses the first
cpu in the mask to compute the allocation node and uses the mask for the
initial affinity setting.
Interrupts set up this way are marked with the AFFINITY_MANAGED flag to
prevent userspace from messing with their affinity settings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: axboe@fb.com
Cc: agordeev@redhat.com
Link: http://lkml.kernel.org/r/1467621574-8277-5-git-send-email-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The function irq_create_of_mapping() is used to create an interrupt
mapping. However, depending on whether the irqdomain, to which the
interrupt belongs, is part of a hierarchy, determines whether the
mapping is created via calling irq_domain_alloc_irqs() or
irq_create_mapping().
To dispose of the interrupt mapping, drivers call irq_dispose_mapping().
However, this function does not check to see if the irqdomain is part
of a hierarchy or not and simply assumes that it was mapped via calling
irq_create_mapping() so calls irq_domain_disassociate() to unmap the
interrupt.
Fix this by checking to see if the irqdomain is part of a hierarchy and
if so call irq_domain_free_irqs() to free/unmap the interrupt.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1466501002-16368-1-git-send-email-jonathanh@nvidia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds a software irq handler for controllers that multiplex
interrupts from multiple devices, but don't know which device generated
the interrupt. For these devices, the irq handler that demuxes must
check every action for every software irq using the same h/w irq in order
to find out which device generated the interrupt. This will inevitably
trigger spurious interrupt detection if we are noting the irq.
The new irq handler does not track the handling for spurious interrupt
detection. An irq that uses this also won't get stats tracked since it
didn't generate the interrupt, nor added to randomness since they are
not random.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: Jon Derrick <jonathan.derrick@intel.com>
Link: http://lkml.kernel.org/r/1466200821-29159-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Fix a few bugs in configuring the default trigger from the irqdomain layer
- Make the genirq layer PM aware
- Add PM capability to the ARM GIC driver
- Add support for 2-level translation tables to the GICv3 ITS driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXXqmFAAoJECPQ0LrRPXpDQa0QAJZGNjbgXHcIBx9TMTjOI3um
czWPUyzLxH+rXiyuFhn5iBgK/dzrBQhsV+0YHSBmGhRrnjKLankcKWjwuLbjYNik
4lRlIGhNcbJbIsU3CQ/1P9HlxrTel2GaSCCLffTkcq1XYEBxsCGyeXKWW71AE5fi
3H2nHtRKSKb9nn+8CcDro29WQhnoKPycuRpFO1a96CNF3HI7fTlRAcEY6cf8e97J
sak6O0s2Yo4Kf8xLJFLhN299LsWvhjtZPnzSoDrHjAYP5jF46Jp8Ku044jFKrV8P
hPMhAl3yC4DINZjfm4dVKBTwTnBOpLUkYcXAP+Yob9+mW2NnDtKDBfrIH7+1g3Ca
2keZ7waTN8P0ZlRPtrg56R+v7i37Lg9XifUcesAMswuWIVXsF8gE9rnAA20pgYI9
g9Vmq+fhaYTLl0VSc8FLyQMJi4BOIPeJcf2ZN+o9sfoXdaf030Afnpu3NTd1cKfk
EZLF9FAivjK1R1jjrnNW2gxLd8I4JTMmDmLNdmkoABqmxkhb2GSyq3VYdSCxGtBP
T6NHxILgKwssM16c2KObL33moZnmcFVczWmfX7cnXc44waHmRMXe4qelX4JV7nPN
65zIXWp3Nav+ZRFqIG6bD2/orJCs9fDDD/UZUJn3xBeBE5HXqn7MabtMMpYc4Bqo
oBx84pxTLoZmTYxNnmCN
=OnlS
-----END PGP SIGNATURE-----
Merge tag 'irqchip-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
First drop of irqchip updates for 4.8 from Marc Zyngier:
- Fix a few bugs in configuring the default trigger from the irqdomain layer
- Make the genirq layer PM aware
- Add PM capability to the ARM GIC driver
- Add support for 2-level translation tables to the GICv3 ITS driver
Some IRQ chips may be located in a power domain outside of the CPU
subsystem and hence will require device specific runtime power
management. In order to support such IRQ chips, add a pointer for a
device structure to the irq_chip structure, and if this pointer is
populated by the IRQ chip driver and CONFIG_PM is selected in the kernel
configuration, then the pm_runtime_get/put APIs for this chip will be
called when an IRQ is requested/freed, respectively.
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Some IRQ chips, such as GPIO controllers or secondary level interrupt
controllers, may require require additional runtime power management
control to ensure they are accessible. For such IRQ chips, it makes sense
to enable the IRQ chip when interrupts are requested and disabled them
again once all interrupts have been freed.
When mapping an IRQ, the IRQ type settings are read and then programmed.
The mapping of the IRQ happens before the IRQ is requested and so the
programming of the type settings occurs before the IRQ is requested. This
is a problem for IRQ chips that require additional power management
control because they may not be accessible yet. Therefore, when mapping
the IRQ, don't program the type settings, just save them and then program
these saved settings when the IRQ is requested (so long as if they are not
overridden via the call to request the IRQ).
Add a stub function for irq_domain_free_irqs() to avoid any compilation
errors when CONFIG_IRQ_DOMAIN_HIERARCHY is not selected.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we now do for non-percpu interrupt, perform a lookup of the
interrupt trigger if the user doesn't supply one. The difference
here is that we can only do it at enable time (trigger configuration
can be per-cpu as well).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
For some devices the IRQ trigger type for a device is read from
firmware, such as device-tree. The IRQ trigger type is typically read
when the mapping for IRQ is created, which is before the IRQ is
requested. Hence, the IRQ trigger type is programmed when mapping the
IRQ and not when requesting the IRQ.
Although this works for most cases, in order to support IRQ chips which
require runtime power management, which may not be accessible prior
to requesting the IRQ, it is desirable to look-up the IRQ trigger type
when it is requested. Therefore, if the IRQ trigger type is not
specified when __setup_irq() is called, look-up the saved IRQ trigger
type. This will allow us to defer the programming of the trigger type
from when the IRQ is mapped to when it is actually requested.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When mapping an IRQ, it is possible that a mapping for the IRQ already
exists. If mapping does exist then there are the following issues with
regard to the handling of the IRQ type settings ...
1. If the domain is part of a hierarchy, then:
a. We do not check that the type settings for the existing mapping
match those of the new mapping.
b. We do not check to see if the type settings have been programmed
yet (and they might not have been) and so we may never set the
type.
2. If the domain is NOT part of a hierarchy, we will overwrite the
current type settings programmed if they are different from the
previous mapping. Please note that irq_create_mapping()
calls irq_find_mapping() to check if a mapping already exists.
Although, it may be unlikely that the type settings for a shared
interrupt would not match, nonetheless we should check for this.
Therefore, to fix this check if a mapping exists (regardless of whether
the domain is part of a hierarchy or not) and if it does then:
1. Return the IRQ number if the type settings match or are not
specified.
2. Program the type settings and return the IRQ number if the type
settings have not been programmed yet.
3. Otherwise if the type setting do not match, then print a warning
and don't return the IRQ number.
Furthermore, add a warning if the type return by irq_domain_translate()
has bits outside the sense mask set and then clear these bits. If these
bits are not cleared then this will cause the comparision of the type
settings for an existing mapping to fail with that of the new mapping
even if the sense bit themselves match. The reason being is that the
existing type settings are read by calling irq_get_trigger_type() which
will clear any bits outside the sense mask. This will allow us to detect
irqchips that are not correctly clearing these bits and fix them.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
sprintf() and snprintf() implementation of kernel guarantees that
its result is terminated with null byte if size is larger than 0. So we
don't need to call memset() at all.
Signed-off-by: Weongyo Jeong <weongyo.linux@gmail.com>
Link: http://lkml.kernel.org/r/1459451703-5744-1-git-send-email-weongyo.linux@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- A number of embarassing buglets (GICv3, PIC32)
- A more substential errata workaround for Cavium's GICv3 ITS
(kept for post-rc1 due to its dependency on NUMA)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXUXcjAAoJECPQ0LrRPXpDGFcQAICkj5cIsQghW2dgR0eo2d7S
+ieyxr55tz3A0c1Cisw09ESHz6wCrQ+PmvXkKISIG1l2AHv9UOUZVrsB1cl2CvBg
C9ClvUyMafiZ+FlhxMO1QM1Vfa3EUV2EPIx3mh5klUp8ph/cT+aArJe+WmJApS25
nlYiobi2AE0+m2V5ekikMtVbM5xXWKHPRgzYqZUlNBV74k/FGgRlBk/bw1AWqnsd
TfUF+QZpEd/4GPglbQLvwJwjQg+qanl59CJqi403U00emLuvRqdqTeMoqPEfw5id
MgVPMtUF+N7fgtygo20oPFBriFBHFUTj+c5Oafd4ahgp6eU02HYX8A7w/jj84tTP
cPa9bcoyKyec5vpO1mbU2a/VzqXPDNL17Dg9tRaf0NpksMeLvBh14jXWp5v8vEqU
Qm4mFlmEYKivWTJhz6pGJmxFX/X5vMa2wrFY7xvOVYby5mSuEGD7+puuKuVNNBEa
THaElOYM4ogTrUBM39dzfzXxSEQN/bcLHXNd2IDuUUK49NvNFjnke3PvLxesiDuT
Pxk4mO912+/Ldk7K1LVVVWltVtOHFNG+I7a3R75gwftHXMYONuOpEiflA4QorxVk
9Rq9yUI4h/69V5fIthBN8BUYU5hxaTtLi0DI1fWSweugZb5PUXiagKnXKaIOLQ/4
A3pvoYEHdynDVO7nJ+Kd
=E+j+
-----END PGP SIGNATURE-----
Merge tag 'irqchip-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Merge irqchip updates from Marc Zyngier:
- A number of embarassing buglets (GICv3, PIC32)
- A more substential errata workaround for Cavium's GICv3 ITS
(kept for post-rc1 due to its dependency on NUMA)
Commit 7cec18a390 changed the return type of irq_destroy_ipi to int, but
missed adding a value to one return statement. Fix this to silence the
resulting compiler warning:
kernel/irq/ipi.c In function ‘irq_destroy_ipi’:
kernel/irq/ipi.c:128:3: warning: ‘return’ with no value, in function returning non-void [-Wreturn-type]
Fixes: 7cec18a390 "genirq: Add error code reporting to irq_{reserve,destroy}_ipi"
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/1464086550-24734-1-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit e614523653 ("radix_tree: add support for multi-order entries")
left the impression that the support for multiorder radix tree entries
was functional. As soon as Ross tried to use it, it became apparent
that my testing was completely inadequate, and it didn't even work a
little bit for orders that were not a multiple of shift.
This series of patches is the result of about 6 weeks of redesign,
reimplementation, testing, arguing and hair-pulling. The great news is
that the test-suite is now far better than it was. That's reflected in
the diffstat for the test-suite alone:
12 files changed, 436 insertions(+), 28 deletions(-)
The highlight for users of the tree is that the restriction on the order
of inserted entries being >= RADIX_TREE_MAP_SHIFT is now gone; the radix
tree now supports any order between 0 and 64.
For those who are interested in how the tree works, patch 9 is probably
the most interesting one as it introduces the new machinery for handling
sibling entries.
I've tried to be fair in attributing authorship to the person who
contributed the majority of the code in each patch; Ross has been an
invaluable partner in the development of this support and it's fair to
say that each of us has code in every commit.
I should also express my appreciation of the 0day testing. It prompted
me that I was bloating the tinyconfig in an unacceptable way, and it
bisected to a commit which contained a rather nasty memory-corruption
bug.
This patch (of 29):
The irqdomain code was checking for 0 or 1 entries, not 0 entries like
the comment said they were. Introduce a new helper that will actually
check for an empty tree.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Core infrastructural changes:
- Support for natively single-ended GPIO driver stages. This
means that if the hardware has registers to configure open
drain or open source configuration, we use that rather than
(as we did before) try to emulate it by switching the line
to an input to get high impedance. This is also documented
throughly in Documentation/gpio/driver.txt for those of you
who did not understand one word of what I just wrote.
- Start to do away with the unnecessarily complex and
unitelligible ARCH_REQUIRE_GPIOLIB and
ARCH_WANT_OPTIONAL_GPIOLIB, another evolutional artifact from
the time when the GPIO subsystem was unmaintained. Archs can
now just select GPIOLIB and be done with it, cleanups to
arches will trickle in for the next kernel. Some minor archs
ACKed the changes immediately so these are included in this
pull request.
- Advancing the use of the data pointer inside the GPIO device
for storing driver data by switching the PowerPC, Super-H
Unicore and a few other subarches or subsystem drivers in
ALSA SoC, Input, serial, SSB, staging etc to use it.
- The initialization now reads the input/output state of the
GPIO lines, so that each GPIO descriptor knows - if this
callback is implemented - whether the line is input or
output. This also reflects nicely in userspace "lsgpio".
- It is now possible to name GPIO producer names, line names,
from the device tree. (Platform data has been supported for
a while.) I bet we will get a similar mechanism for ACPI
one of those days. This makes is possible to get sensible
producer names for e.g. GPIO rails in "lsgpio" in userspace.
New drivers:
- New driver for the Loongson1.
- The XLP driver now supports Broadcom Vulcan ARM64.
- The IT87 driver now supports IT8620 and IT8628.
- The PCA953X driver now supports Galileo Gen2.
Driver improvements:
- MCP23S08 was switched to use the gpiolib irqchip helpers and
now also suppors level-triggered interrupts.
- 74x164 and RCAR now supports the .set_multiple() callback
- AMDPT was converted to use generic GPIO.
- TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
support the new single ended callback for open drain
and in some cases open source.
- Implement the .get_direction() callback for a few more drivers
like PL061, Xgene.
Cleanups:
- Paul Gortmaker combed through the drivers and de-modularized
those who are not really modules.
- Move the GPIO poweroff DT bindings to the power subdir where
they belong.
- Rename gpio-generic.c to gpio-mmio.c, which is much more to the
point. That's what it is handling, nothing more, nothing less.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXOuJ5AAoJEEEQszewGV1zNXsQAII5wtkP69WRJ3goYBKg1dZN
DkuLqZyVI4hCgRhptzUW10gDLHKKOCVubfetTJHSpyG/dWDJXPCyH6FHF+pW6lMX
y+em8kAvWctKpaosy4EM7O55/IohW0/fNCTOfzfrUNivjydFuA2XwPUiPqC7111O
DeKlC/t+W1JEvZTiKMi83pKq+9wqhiHmD0qxRHhV57S+MT8e7mdlSKOp7uUkKPkg
LPlerXosnmeFjL2emuSnKl/tq8pOyruU6uaIGG/uwpbo2W86Dok9GY2GWkQ4pANT
pDtprc4aJ/Clf6Q0CoKwQbmAozqTDeJo+Und9tRs2KuZRly2bWOcyVE0lyK+Y4s0
544LcKw2q6cB9ARZ6JExEVRJejPISGKMqo9TaHkyNSIJoiiatKYvNS4WVeFtTgbI
W+1WfM1svPymNRqVPO1PMLV+3m9dalDH2WjtaFF21uCAQ/G0AuPEHjEDbbx0HIpb
qrvWmYzZ97Rm/LdYROFRO53nEdCp2jh6c3n4/2kGYM8H0suvGxXZsB1g4i+Dm+B+
qKVTS282azlDuH9ohXeXizeb6atK6s8TC3Rmew97SmXDO00cUQzEQO/ZquRLHY9r
n83afQ4OL2Z9yruAxAk7pCshVSyheOsHuFPuZ7bwPW31VMdoWNRkhnaTUXMjGfYg
3y39IHrCKWNMCCVM1iNl
=z4d6
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for kernel cycle v4.7:
Core infrastructural changes:
- Support for natively single-ended GPIO driver stages.
This means that if the hardware has registers to configure open
drain or open source configuration, we use that rather than (as we
did before) try to emulate it by switching the line to an input to
get high impedance.
This is also documented throughly in Documentation/gpio/driver.txt
for those of you who did not understand one word of what I just
wrote.
- Start to do away with the unnecessarily complex and unitelligible
ARCH_REQUIRE_GPIOLIB and ARCH_WANT_OPTIONAL_GPIOLIB, another
evolutional artifact from the time when the GPIO subsystem was
unmaintained.
Archs can now just select GPIOLIB and be done with it, cleanups to
arches will trickle in for the next kernel. Some minor archs ACKed
the changes immediately so these are included in this pull request.
- Advancing the use of the data pointer inside the GPIO device for
storing driver data by switching the PowerPC, Super-H Unicore and
a few other subarches or subsystem drivers in ALSA SoC, Input,
serial, SSB, staging etc to use it.
- The initialization now reads the input/output state of the GPIO
lines, so that each GPIO descriptor knows - if this callback is
implemented - whether the line is input or output. This also
reflects nicely in userspace "lsgpio".
- It is now possible to name GPIO producer names, line names, from
the device tree. (Platform data has been supported for a while).
I bet we will get a similar mechanism for ACPI one of those days.
This makes is possible to get sensible producer names for e.g.
GPIO rails in "lsgpio" in userspace.
New drivers:
- New driver for the Loongson1.
- The XLP driver now supports Broadcom Vulcan ARM64.
- The IT87 driver now supports IT8620 and IT8628.
- The PCA953X driver now supports Galileo Gen2.
Driver improvements:
- MCP23S08 was switched to use the gpiolib irqchip helpers and now
also suppors level-triggered interrupts.
- 74x164 and RCAR now supports the .set_multiple() callback
- AMDPT was converted to use generic GPIO.
- TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
support the new single ended callback for open drain and in some
cases open source.
- Implement the .get_direction() callback for a few more drivers like
PL061, Xgene.
Cleanups:
- Paul Gortmaker combed through the drivers and de-modularized those
who are not really modules.
- Move the GPIO poweroff DT bindings to the power subdir where they
belong.
- Rename gpio-generic.c to gpio-mmio.c, which is much more to the
point. That's what it is handling, nothing more, nothing less"
* tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (126 commits)
MIPS: do away with ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
gpio: zevio: make it explicitly non-modular
gpio: timberdale: make it explicitly non-modular
gpio: stmpe: make it explicitly non-modular
gpio: sodaville: make it explicitly non-modular
pinctrl: sh-pfc: Let gpio_chip.to_irq() return zero on error
gpio: dwapb: Add ACPI device ID for DWAPB GPIO controller on X-Gene platforms
gpio: dt-bindings: add wd,mbl-gpio bindings
gpio: of: make it possible to name GPIO lines
gpio: make gpiod_to_irq() return negative for NO_IRQ
gpio: xgene: implement .get_direction()
gpio: xgene: Enable ACPI support for X-Gene GFC GPIO driver
gpio: tegra: Implement gpio_get_direction callback
gpio: set up initial state from .get_direction()
gpio: rename gpio-generic.c into gpio-mmio.c
gpio: generic: fix GPIO_GENERIC_PLATFORM is set to module case
gpio: dwapb: add gpio-signaled acpi event support
gpio: dwapb: convert device node to fwnode
gpio: dwapb: remove name from dwapb_port_property
gpio/qoriq: select IRQ_DOMAIN
...
In the function, setup_irq(), we don't check that the descriptor
returned from irq_to_desc() is valid before we start using it. For
example chip_bus_lock() called from setup_irq(), assumes that the
descriptor pointer is valid and doesn't check before dereferencing it.
In many other functions including setup/free_percpu_irq() we do check
that the descriptor returned is not NULL and therefore add the same test
to setup_irq() to ensure the descriptor returned is valid.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to prepare the genirq layer for the concept of partitionned
percpu interrupts, let's allow an affinity to be associated with
such an interrupt. We introduce:
- irq_set_percpu_devid_partition: flag an interrupt as a percpu-devid
interrupt, and associate it with an affinity
- irq_get_percpu_devid_partition: allow the affinity of that interrupt
to be retrieved.
This will allow a driver to discover which CPUs the per-cpu interrupt
can actually fire on.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When iterating over the irq domain list, we try to match a domain
either by calling a match() function or by comparing a number
of fields passed as parameters.
Both approaches are a bit restrictive:
- match() is DT specific and only takes a device node
- the fallback case only deals with the fwnode_handle
It would be useful if we had a per-domain function that would
actually perform the matching check on the whole of the
irq_fwspec structure. This would allow for a domain to triage
matching attempts that need to extend beyond the fwnode.
Let's introduce irq_find_matching_fwspec(), which takes a full
blown irq_fwspec structure, and call into a select() function
implemented by the irqdomain. irq_find_matching_fwnode() is
made a wrapper around irq_find_matching_fwspec in order to
preserve compatibility.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Make these functions return appropriate error codes when something goes
wrong.
Previously irq_destroy_ipi returned void making it impossible to notify
the caller if the request could not be fulfilled. Patch 1 in the series
added another condition in which this could fail in addition to the
existing ones. irq_reserve_ipi returned an unsigned int meaning it could
only return 0 on failure and give the caller no indication as to why the
request failed.
As time goes on there are likely to be further conditions added in which
these functions can fail. These APIs and the IPI IRQ domain are new in
4.6 and the number of existing call sites are low, changing the API now
has little impact on the code, while making it easier for these
functions to grow over time.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1461568464-31701-2-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Previously irq_destroy_ipi() would destroy IPIs to all CPUs that were
configured by irq_reserve_ipi(). This change makes it possible to
destroy just a subset of the IPIs. This may be useful to remove IPIs to
CPUs that have been hot removed so that the IRQ numbers allocated within
the IPI domain can be re-used.
The original behaviour is restored by passing the complete mask that the
IPI was created with.
There are currently no users of this function that would break from the
API change.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1461568464-31701-1-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The IPI domain re-purposes the IRQ affinity to signify the mask of CPUs
that this IPI will deliver to. This must not be modified before the IPI
is destroyed again, so set the IRQ_NO_BALANCING flag to prevent the
affinity being overwritten by setup_affinity().
Without this, if an IPI is reserved for a single target CPU, then
allocated using __setup_irq(), the affinity is overwritten with
cpu_online_mask. When ipi_destroy() is subsequently called on a
multi-cpu system, it will attempt to free cpumask_weight() IRQs
that were never allocated, and crash.
Fixes: d17bf24e69 ("genirq: Add a new generic IPI reservation code to irq core")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Link: http://lkml.kernel.org/r/1461229712-13057-1-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export irq_domain_free_irqs_common so it can be used by modules.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Use the more common logging method with the eventual goal of removing
pr_warning altogether.
Miscellanea:
- Realign arguments
- Coalesce formats
- Add missing space between a few coalesced formats
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [kernel/power/suspend.c]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Redesign of cpufreq governors and the intel_pstate driver to
make them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers
for that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it
more straightforward and fix some concurrency problems in it
(Rafael Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve
its handling of voltage regulators and device clocks and updates
of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization
and cleanup problems in it and correct its worker thread handling
with respect to CPU offline, new powernv_throttle tracepoint
(Shilpasri Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced
by previos changes in the ACPICA code (Bob Moore, Lv Zheng,
David Box, Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers)
and ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as
a valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reductioin of the number of syscalls made,
fixes related to Xeon x200 processors, compiler warning fixes) and
cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJW50NXAAoJEILEb/54YlRxvr8QAIktC9+ft0y5AmU46hDcBWcK
QutyWJL9X9BS6DWBJZA2qclDYFmhMfi5Fza1se0gQ9TnLB/KrBwHWLsiYoTsb1k+
nPKf214aPk+qAhkVuyB4leNWML9Qz9n9jwku/EYxWWpgtbSRf3+0ioIKZeWWc/8V
JvuaOu4O+g/tkmL7QTrnGWBwhIIssAAV85QPsHkx+g68MrCj4UMMzm7z9G21SPXX
bmP8yIHsczX/XnRsY0W2NSno7Vdk6ImHpDJ26IAZg28WRNPWICHgGYHvB0TTWMvb
tts+yqfF7/7QLRjT/M8k9CzDBDE/DnVqoZ0fNJ+aYr7hNKF32mtAN+jH9ZB9dl/P
fEFapJkPxnWyzAoVoB9Dz0rkcZkYMlbxlLWzUGpaPq0JflUUTzLk0ApSjmMn4HRO
UddwCDdyHTaYThp3gn6GbOb0pIP0SdOVbI1M2QV2x/4PLcT2Ft8Np1+1RFWOeinZ
Bdl9AE890big0808mqbBzw/buETwr9FjHtCdDPXpP0vJpkBLu3nIYRNb0LCt39es
mWMp6dFhGgvGj3D3ahTuV3GI8hdpDkh9SObexa11RCjkTKrXcwEmFxHxLeFXwKYq
alG278bo6cSChRMziS1lis+W/3tsJRN4TXUSv1PPzJHrFgptQVFRStU9ngBKP+pN
WB+itPc4Fw0YHOrAFsrx
=cfty
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the majority of changes go into cpufreq and they are
significant.
First off, the way CPU frequency updates are triggered is different
now. Instead of having to set up and manage a deferrable timer for
each CPU in the system to evaluate and possibly change its frequency
periodically, cpufreq governors set up callbacks to be invoked by the
scheduler on a regular basis (basically on utilization updates). The
"old" governors, "ondemand" and "conservative", still do all of their
work in process context (although that is triggered by the scheduler
now), but intel_pstate does it all in the callback invoked by the
scheduler with no need for any additional asynchronous processing.
Of course, this eliminates the overhead related to the management of
all those timers, but also it allows the cpufreq governor code to be
simplified quite a bit. On top of that, the common code and data
structures used by the "ondemand" and "conservative" governors are
cleaned up and made more straightforward and some long-standing and
quite annoying problems are addressed. In particular, the handling of
governor sysfs attributes is modified and the related locking becomes
more fine grained which allows some concurrency problems to be avoided
(particularly deadlocks with the core cpufreq code).
In principle, the new mechanism for triggering frequency updates
allows utilization information to be passed from the scheduler to
cpufreq. Although the current code doesn't make use of it, in the
works is a new cpufreq governor that will make decisions based on the
scheduler's utilization data. That should allow the scheduler and
cpufreq to work more closely together in the long run.
In addition to the core and governor changes, cpufreq drivers are
updated too. Fixes and optimizations go into intel_pstate, the
cpufreq-dt driver is updated on top of some modification in the
Operating Performance Points (OPP) framework and there are fixes and
other updates in the powernv cpufreq driver.
Apart from the cpufreq updates there is some new ACPICA material,
including a fix for a problem introduced by previous ACPICA updates,
and some less significant changes in the ACPI code, like CPPC code
optimizations, ACPI processor driver cleanups and support for loading
ACPI tables from initrd.
Also updated are the generic power domains framework, the Intel RAPL
power capping driver and the turbostat utility and we have a bunch of
traditional assorted fixes and cleanups.
Specifics:
- Redesign of cpufreq governors and the intel_pstate driver to make
them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers for
that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it more
straightforward and fix some concurrency problems in it (Rafael
Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve its
handling of voltage regulators and device clocks and updates of the
cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization and
cleanup problems in it and correct its worker thread handling with
respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced by
previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers) and
ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as a
valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reductioin of the number of syscalls
made, fixes related to Xeon x200 processors, compiler warning
fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"
* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
tools/power turbostat: bugfix: TDP MSRs print bits fixing
tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
tools/power turbostat: call __cpuid() instead of __get_cpuid()
tools/power turbostat: indicate SMX and SGX support
tools/power turbostat: detect and work around syscall jitter
tools/power turbostat: show GFX%rc6
tools/power turbostat: show GFXMHz
tools/power turbostat: show IRQs per CPU
tools/power turbostat: make fewer systems calls
tools/power turbostat: fix compiler warnings
tools/power turbostat: add --out option for saving output in a file
tools/power turbostat: re-name "%Busy" field to "Busy%"
tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
tools/power turbostat: allow sub-sec intervals
ACPI / APEI: ERST: Fixed leaked resources in erst_init
ACPI / APEI: Fix leaked resources
intel_pstate: Do not skip samples partially
intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
...
Pull irq updates from Thomas Gleixner:
"The 4.6 pile of irq updates contains:
- Support for IPI irqdomains to support proper integration of IPIs to
and from coprocessors. The first user of this new facility is
MIPS. The relevant MIPS patches come with the core to avoid merge
ordering issues and have been acked by Ralf.
- A new command line option to set the default interrupt affinity
mask at boot time.
- Support for some more new ARM and MIPS interrupt controllers:
tango, alpine-msix and bcm6345-l1
- Two small cleanups for x86/apic which we merged into irq/core to
avoid yet another branch in x86 with two tiny commits.
- The usual set of updates, cleanups in drivers/irqchip. Mostly in
the area of ARM-GIC, arada-37-xp and atmel chips. Nothing
outstanding here"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
irqchip/irq-alpine-msi: Release the correct domain on error
irqchip/mxs: Fix error check of of_io_request_and_map()
irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
genirq: Export IRQ functions for module use
irqchip/gic/realview: Support more RealView DCC variants
Documentation/bindings: Document the Alpine MSIX driver
irqchip: Add the Alpine MSIX interrupt controller
irqchip/gic-v3: Always return IRQ_SET_MASK_OK_DONE in gic_set_affinity
irqchip/gic-v3-its: Mark its_init() and its children as __init
irqchip/gic-v3: Remove gic_root_node variable from the ITS code
irqchip/gic-v3: ACPI: Add redistributor support via GICC structures
irqchip/gic-v3: Add ACPI support for GICv3/4 initialization
irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver
x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytes
x86/apic: Deinline __default_send_IPI_*, save ~200 bytes
dt-bindings: interrupt-controller: Add SoC-specific compatible string to Marvell ODMI
irqchip/mips-gic: Add new DT property to reserve IPIs
MIPS: Delete smp-gic.c
MIPS: Make smp CMP, CPS and MT use the new generic IPI functions
MIPS: Add generic SMP IPI support
...
Export irq_chip_*_parent(), irq_domain_create_hierarchy(),
irq_domain_set_hwirq_and_chip(), irq_domain_reset_irq_data(),
irq_domain_alloc/free_irqs_parent()
So gpio drivers can be built as modules. First user: gpio-xgene-sb
Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Phong Vo <pvo@apm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: patches@apm.com
Cc: Loc Ho <lho@apm.com>
Cc: Keyur Chudgar <kchudgar@apm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: https://lists.01.org/pipermail/kbuild-all/2016-February/017914.html
Link: http://lkml.kernel.org/r/1457017012-10628-1-git-send-email-qnguyen@apm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Per the x86-specific footnote to PCI spec r3.0, sec 6.2.4, the value 255 in
the Interrupt Line register means "unknown" or "no connection."
Previously, when we couldn't derive an IRQ from the _PRT, we fell back to
using the value from Interrupt Line as an IRQ. It's questionable whether
we should do that at all, but the spec clearly suggests we shouldn't do it
for the value 255 on x86.
Calling request_irq() with IRQ 255 may succeed, but the driver won't
receive any interrupts. Or, if IRQ 255 is shared with another device, it
may succeed, and the driver's ISR will be called at random times when the
*other* device interrupts. Or it may fail if another device is using IRQ
255 with incompatible flags. What we *want* is for request_irq() to fail
predictably so the driver can fall back to polling.
On x86, assume 255 in the Interrupt Line means the INTx line is not
connected. In that case, set dev->irq to IRQ_NOTCONNECTED so request_irq()
will fail gracefully with -ENOTCONN.
We found this problem on a system where Secure Boot firmware assigned
Interrupt Line 255 to an i801_smbus device and another device was already
using MSI-X IRQ 255. This was in v3.10, where i801_probe() fails if
request_irq() fails:
i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16
After aeb8a3d16a ("i2c: i801: Check if interrupts are disabled"),
i801_probe() will fall back to polling if request_irq() fails. But we
still need this patch because request_irq() may succeed or fail depending
on other devices in the system. If request_irq() fails, i801_smbus will
work by falling back to polling, but if it succeeds, i801_smbus won't work
because it expects interrupts that it may not receive.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add APIs to send IPIs from driver and arch code.
We have different functions because we allow architecture code to cache the
irq descriptor to avoid lookups. Driver code has to use the irq number and is
subject to more restrictive checks.
[ tglx: Polish the implementation ]
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-12-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When dealing with coprocessors we need to find out the actual hwirqs values to
pass on to the firmware so that it knows what it needs to use to receive IPIs
from and send IPIs to Linux cpus.
[ tglx: Fixed the single hwirq IPI case. The hardware irq number does not
change due to the cpu number ]
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-10-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a generic mechanism to dynamically allocate an IPI. Depending on the
underlying implementation this creates either a single Linux irq or a
consective range of Linux irqs. The Linux irq is used later to send IPIs to
other CPUs.
[ tglx: Massaged the code and removed the 'consecutive mask' restriction for
the single IRQ case ]
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-9-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We will need to use this function to implement irq_reserve_ipi() later. So
make it non static and move the prototype to irqdomain.h to allow using it
outside irqdomain.c
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: <jason@lakedaemon.net>
Cc: <marc.zyngier@arm.com>
Cc: <jiang.liu@linux.intel.com>
Cc: <ralf@linux-mips.org>
Cc: <linux-mips@linux-mips.org>
Cc: <lisa.parratt@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Link: http://lkml.kernel.org/r/1449580830-23652-8-git-send-email-qais.yousef@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_common_data::state_use_accessors is not designed for public use.
Therefore make it private so that people who write code accessing it
directly will get blamed by sparse. Also #undef the macro
__irqd_to_state after used in header files, so that the macro can't be
misused.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The irq code browses the list of actions differently to inspect the element
one by one. Even if it is not a problem, for the sake of consistent code,
provide a macro similar to for_each_irq_desc in order to have the same loop to
go through the actions list and use it in the code.
[ tglx: Renamed the macro ]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1452765253-31148-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If we isolate CPUs, then we don't want random device interrupts on them. Even
w/o the user space irq balancer enabled we can end up with irqs on non boot
cpus and chasing newly requested interrupts is a tedious task.
Allow to restrict the default irq affinity mask.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1602031948190.25254@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull IRQ fixes from Ingo Molnar:
"Mostly irqchip driver fixes, but also an irq core crash fix and a
build fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mxs: Add missing set_handle_irq()
irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
irqchip/gic-v3-its: Recompute the number of pages on page size change
base: Export platform_msi_domain_[alloc,free]_irqs
of: MSI: Simplify irqdomain lookup
irqdomain: Allow domain lookup with DOMAIN_BUS_WIRED token
irqchip: Fix dependencies for archs w/o HAS_IOMEM
irqchip/s3c24xx: Mark init_eint as __maybe_unused
genirq: Validate action before dereferencing it in handle_irq_event_percpu()
Let's take the (outlandish) example of an interrupt controller
capable of handling both wired interrupts and PCI MSIs.
With the current code, the PCI MSI domain is going to be tagged
with DOMAIN_BUS_PCI_MSI, and the wired domain with DOMAIN_BUS_ANY.
Things get hairy when we start looking up the domain for a wired
interrupt (typically when creating it based on some firmware
information - DT or ACPI).
In irq_create_fwspec_mapping(), we perform the lookup using
DOMAIN_BUS_ANY, which is actually used as a wildcard. This gives
us one chance out of two to end up with the wrong domain, and
we try to configure a wired interrupt with the MSI domain.
Everything grinds to a halt pretty quickly.
What we really need to do is to start looking for a domain that
would uniquely identify a wired interrupt domain, and only use
DOMAIN_BUS_ANY as a fallback.
In order to solve this, let's introduce a new DOMAIN_BUS_WIRED
token, which is going to be used exactly as described above.
Of course, this depends on the irqchip to setup the domain
bus_token, and nobody had to implement this so far.
Only so far.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1453816347-32720-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export irq_domain_set_info() for module use. It will be used by the Volume
Management Device driver.
[bhelgaas: changelog]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Previously msi_domain_alloc() assumed MSI irqdomains always had parent
irqdomains, but that's not true for the new Intel VMD devices. Relax
msi_domain_alloc() to support parentless MSI irqdomains.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
commit 71f64340fc changed the handling of irq_desc->action from
CPU 0 CPU 1
free_irq() lock(desc)
lock(desc) handle_edge_irq()
if (desc->action) {
handle_irq_event()
action = desc->action
unlock(desc)
desc->action = NULL handle_irq_event_percpu(desc, action)
action->xxx
to
CPU 0 CPU 1
free_irq() lock(desc)
lock(desc) handle_edge_irq()
if (desc->action) {
handle_irq_event()
unlock(desc)
desc->action = NULL handle_irq_event_percpu(desc, action)
action = desc->action
action->xxx
So if free_irq manages to set the action to NULL between the unlock and before
the readout, we happily dereference a null pointer.
We could simply revert 71f64340fc, but we want to preserve the better code
generation. A simple solution is to change the action loop from a do {} while
to a while {} loop.
This is safe because we either see a valid desc->action or NULL. If the action
is about to be removed it is still valid as free_irq() is blocked on
synchronize_irq().
CPU 0 CPU 1
free_irq() lock(desc)
lock(desc) handle_edge_irq()
handle_irq_event(desc)
set(INPROGRESS)
unlock(desc)
handle_irq_event_percpu(desc)
action = desc->action
desc->action = NULL while (action) {
action->xxx
...
action = action->next;
sychronize_irq()
while(INPROGRESS); lock(desc)
clr(INPROGRESS)
free(action)
That's basically the same mechanism as we have for shared
interrupts. action->next can become NULL while handle_irq_event_percpu()
runs. Either it sees the action or NULL. It does not matter, because action
itself cannot go away before the interrupt in progress flag has been cleared.
Fixes: commit 71f64340fc "genirq: Remove the second parameter from handle_irq_event_percpu()"
Reported-by: zyjzyj2000@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Huang Shijie <shijie.huang@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1601131224190.3575@nanos
Pull irq updates from Thomas Gleixner:
"The irq department provides:
- Support for MSI to wire bridges and a first user of it
- More ACPI support for ARM/GIC
- A new TS-4800 interrupt controller driver
- RCU based free of interrupt descriptors to support the upcoming
Intel VMD technology without introducing a locking nightmare
- The usual pile of fixes and updates to drivers and core code"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
irqchip/omap-intc: Add support for spurious irq handling
irqchip/zevio: Use irq_data_get_chip_type() helper
irqchip/omap-intc: Remove duplicate setup for IRQ chip type handler
irqchip/ts4800: Add TS-4800 interrupt controller
irqchip/ts4800: Add documentation for TS-4800 interrupt controller
irq/platform-MSI: Increase the maximum MSIs the MSI framework can support
irqchip/gicv2m: Miscellaneous fixes for v2m resources and SPI ranges
irqchip/bcm2836: Make code more readable
irqchip/bcm2836: Tolerate IRQs while no flag is set in ISR
irqchip/bcm2836: Add SMP support for the 2836
irqchip/bcm2836: Fix initialization of the LOCAL_IRQ_CNT timers
irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support
irqchip/gic-v2m: Refactor to prepare for ACPI support
irqdomain: Introduce is_fwnode_irqchip helper
acpi: pci: Setup MSI domain for ACPI based pci devices
genirq/msi: Export functions to allow MSI domains in modules
irqchip/mbigen: Implement the mbigen irq chip operation functions
irqchip/mbigen: Create irq domain for each mbigen device
irqchip/mgigen: Add platform device driver for mbigen device
dt-bindings: Documents the mbigen bindings
...
Since there will be several places checking if fwnode.type
is equal FWNODE_IRQCHIP, this patch adds a convenient function
for this purpose.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Pull the MSI wire bridge implementation from Marc Zyngier along with
the first user of it. This is infrastructure to support a wired
interrupt to MSI interrupt brigde. The first user is mbigen found in
Hisilicon ARM SoCs.
To be able to allocate interrupts from the MSI layer down,
add a new msi_domain_populate_irqs entry point.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The .prepare callbacks are so far only called from msi_domain_alloc_irqs.
In order to reuse that code, split that code and create a
msi_domain_prepare_irqs function that the existing code can call into.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We are soon going to need the MSI layer to call into the domain
allocators. Instead of open coding this, make the standard
irq_domain_alloc_irqs_recursive function available to the MSI
layer.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The new VMD device driver needs to iterate over a list of
"demultiplexing" interrupts. Protecting that list with a lock is not
possible because the list is also required in code pathes which hold
irq descriptor lock. Therefor the demultiplexing interrupt handler
would create a lock inversion scenario if it calls a demux handler
with the list protection lock held.
A solution for this is to free the irq descriptor via RCU, so the
list can be walked with rcu read lock held.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Keith Busch <keith.busch@intel.com>
If a interrupt chip utilizes chip->buslock then free_irq() can
deadlock in the following way:
CPU0 CPU1
interrupt(X) (Shared or spurious)
free_irq(X) interrupt_thread(X)
chip_bus_lock(X)
irq_finalize_oneshot(X)
chip_bus_lock(X)
synchronize_irq(X)
synchronize_irq() waits for the interrupt thread to complete,
i.e. forever.
Solution is simple: Drop chip_bus_lock() before calling
synchronize_irq() as we do with the irq_desc lock. There is nothing to
be protected after the point where irq_desc lock has been released.
This adds chip_bus_lock/unlock() to the remove_irq() code path, but
that's actually correct in the case where remove_irq() is called on
such an interrupt. The current users of remove_irq() are not affected
as none of those interrupts is on a chip which requires buslock.
Reported-by: Fredrik Markström <fredrik.markstrom@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Certain interrupt controller drivers have a register set that does not
make it easy to save/restore the mask of enabled/disabled interrupts
at suspend/resume time. At resume time, such drivers rely on the core
kernel irq subsystem to tell whether such or such interrupt is enabled
or not, in order to restore the proper state in the interrupt
controller register.
While the irqd_irq_disabled() provides the relevant information for
global interrupts, there is no similar function to query the
enabled/disabled state of a per-CPU interrupt.
Therefore, this commit complements the percpu_irq API with an
irq_percpu_is_enabled() function.
[ tglx: Simplified the implementation and added kerneldoc ]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Tawfik Bayouk <tawfik@marvell.com>
Cc: Nadav Haklai <nadavh@marvell.com>
Cc: Lior Amsalem <alior@marvell.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1445347435-2333-2-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In case of a wakeup interrupt, irq_pm_check_wakeup disables the interrupt
and marks it pending and suspended, disables it and notifies the pm core
about the wake event. The interrupt gets handled later once the system
is resumed.
However the irq stats is updated twice: once when it's disabled waiting
for the system to resume and later when it's handled, resulting in wrong
counting of the wakeup interrupt when waking up the system.
This patch updates the interrupt count so that it's updated only when
the interrupt gets handled. It's already handled correctly in
handle_edge_irq and handle_edge_eoi_irq.
Reported-by: Manoil Claudiu <claudiu.manoil@freescale.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1446661957-1019-1-git-send-email-sudeep.holla@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull irq and timer fixes from Thomas Gleixner:
- An irq regression fix to restore the wakeup behaviour of chained
interrupts.
- A timer fix for a long standing race versus timers scheduled on a
target cpu which got exposed by recent changes in the workqueue
implementation.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/PM: Restore system wake up from chained interrupts
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Use proper base migration in add_timer_on()
Commit e509bd7da1 ("genirq: Allow migration of chained interrupts
by installing default action") breaks PCS wake up IRQ behaviour on
TI OMAP based platforms (dra7-evm).
TI OMAP IRQ wake up configuration:
GIC-irqchip->PCM_IRQ
|- omap_prcm_register_chain_handler
|- PRCM-irqchip -> PRCM_IO_IRQ
|- pcs_irq_chain_handler
|- pinctrl-irqchip -> PCS_uart1_wakeup_irq
This happens because IRQ PM code (irq/pm.c) is expected to ignore
chained interrupts by default:
static bool suspend_device_irq(struct irq_desc *desc)
{
if (!desc->action || desc->no_suspend_depth)
return false;
- it's expected !desc->action = true for chained interrupts;
but, after above change, all chained interrupt descriptors will
have default action handler installed - chained_action.
As result, chained interrupts will be silently disabled during system
suspend.
Hence, fix it by introducing helper function irq_desc_is_chained() and
use it in suspend_device_irq() for chained interrupts identification
and skip them, once detected.
Fixes: e509bd7da1 ("genirq: Allow migration of chained interrupts..")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: <nsekhar@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tony Lindgren <tony@atomide.com>
Link: http://lkml.kernel.org/r/1447149492-20699-1-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng).
The most significant change is to allow the AML debugger to be
built into the kernel. On top of that there is an update related
to the NFIT table (the ACPI persistent memory interface)
and a few fixes and cleanups.
- ACPI CPPC2 (Collaborative Processor Performance Control v2)
support along with a cpufreq frontend (Ashwin Chaugule).
This can only be enabled on ARM64 at this point.
- New ACPI infrastructure for the early probing of IRQ chips and
clock sources (Marc Zyngier).
- Support for a new hierarchical properties extension of the ACPI
_DSD (Device Specific Data) device configuration object allowing
the kernel to handle hierarchical properties (provided by the
platform firmware this way) automatically and make them available
to device drivers via the generic device properties interface
(Rafael Wysocki).
- Generic device properties API extension to obtain an index of
certain string value in an array of strings, along the lines of
of_property_match_string(), but working for all of the supported
firmware node types, and support for the "dma-names" device
property based on it (Mika Westerberg).
- ACPI core fix to parse the MADT (Multiple APIC Description Table)
entries in the order expected by platform firmware (and mandated
by the specification) to avoid confusion on systems with more than
255 logical CPUs (Lukasz Anaczkowski).
- Consolidation of the ACPI-based handling of PCI host bridges
on x86 and ia64 (Jiang Liu).
- ACPI core fixes to ensure that the correct IRQ number is used to
represent the SCI (System Control Interrupt) in the cases when
it has been re-mapped (Chen Yu).
- New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede).
- ACPI EC driver fixes (Lv Zheng).
- Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri
Kosina, Rami Rosen, Rasmus Villemoes).
- New mechanism in the PM core allowing drivers to check if the
platform firmware is going to be involved in the upcoming system
suspend or if it has been involved in the suspend the system is
resuming from at the moment (Rafael Wysocki).
This should allow drivers to optimize their suspend/resume
handling in some cases and the changes include a couple of users
of it (the i8042 input driver, PCI PM).
- PCI PM fix to prevent runtime-suspended devices with PME enabled
from being resumed during system suspend even if they aren't
configured to wake up the system from sleep (Rafael Wysocki).
- New mechanism to report the number of a wakeup IRQ that woke up
the system from sleep last time (Alexandra Yates).
- Removal of unused interfaces from the generic power domains
framework and fixes related to latency measurements in that
code (Ulf Hansson, Daniel Lezcano).
- cpufreq core sysfs interface rework to make it handle CPUs that
share performance scaling settings (represented by a common
cpufreq policy object) more symmetrically (Viresh Kumar).
This should help to simplify the CPU offline/online handling among
other things.
- cpufreq core fixes and cleanups (Viresh Kumar).
- intel_pstate fixes related to the Turbo Activation Ratio (TAR)
mechanism on client platforms which causes the turbo P-states
range to vary depending on platform firmware settings (Srinivas
Pandruvada).
- intel_pstate sysfs interface fix (Prarit Bhargava).
- Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes
and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G
Bhat, Luis de Bethencourt).
- cpuidle mvebu driver cleanups (Russell King).
- OPP (Operating Performance Points) framework code reorganization
to make it more maintainable (Viresh Kumar).
- Intel Broxton support for the RAPL (Running Average Power Limits)
power capping driver (Amy Wiles).
- Assorted power management code fixes and cleanups (Dan Carpenter,
Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus
Villemoes).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWOC9oAAoJEILEb/54YlRx/c8P/joflwoFsISwJccG62YTQMuc
bMQKM4Kw0vl5La8+pkLpe5t6+mW7l81UFtYF6Dzd8LOKlD9sszD34z1lHmCeT/oR
wn0uZpHagRyLMUfoyiEtlU/VRU6WQNNtS3EgjwUi7xgFz9Q0pjcCZ9OQ6vKov1j5
+6j40ODif5sgo+2vl+rztJiV0SIMkYdkgNqgfN1FE9bdLA2Zkk+PxxJbtGQORuDu
O/K+XhQT2xWquVWi/1p+VtQxs5glBS1oKm0kogV5bElCvNTRNIVABUNcjogITQwo
QSAKgoCKIoaIl5jtDT6u5dc0y67q/dMtqOY9fOCcOz1Z7jbWQzR8D7mpFWIsJUPK
K2LClI3t85ynpN6Jref246A6+C9nwB8JMAiAR11oBw7WbBlkd6tbRgcT5B+iz8UE
FuCCif7pha/Fs+Jt1YRazscIqteQ2bAhhxikuIPMfw2M6M67MNfVNeKA1bAoSM34
dH7JsilblitvV7shrwJHwXPXCOF2jEPoK8I4/q2+TR5qUxEpRJjelQxXGSaQScMZ
iNnjeTgv8H8q+rY5Yjzsl4pxP0Fvf7IuqkptWOJbgepg4cQc9pS87wOpY3uEeQzr
H7ruaQJFCnLO4aXbPNClsiJARhrBk+qMlsh4vBEyCJ2T0ucb+nIUcN4BTi8t85yl
X97BfHHUiDoUrnIsNids
=1gaH
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"Quite a new features are included this time.
First off, the Collaborative Processor Performance Control interface
(version 2) defined by ACPI will now be supported on ARM64 along with
a cpufreq frontend for CPU performance scaling.
Second, ACPI gets a new infrastructure for the early probing of IRQ
chips and clock sources (along the lines of the existing similar
mechanism for DT).
Next, the ACPI core and the generic device properties API will now
support a recently introduced hierarchical properties extension of the
_DSD (Device Specific Data) ACPI device configuration object. If the
ACPI platform firmware uses that extension to organize device
properties in a hierarchical way, the kernel will automatically handle
it and make those properties available to device drivers via the
generic device properties API.
It also will be possible to build the ACPICA's AML interpreter
debugger into the kernel now and use that to diagnose AML-related
problems more efficiently. In the future, this should make it
possible to single-step AML execution and do similar things.
Interesting stuff, although somewhat experimental at this point.
Finally, the PM core gets a new mechanism that can be used by device
drivers to distinguish between suspend-to-RAM (based on platform
firmware support) and suspend-to-idle (or other variants of system
suspend the platform firmware is not involved in) and possibly
optimize their device suspend/resume handling accordingly.
In addition to that, some existing features are re-organized quite
substantially.
First, the ACPI-based handling of PCI host bridges on x86 and ia64 is
unified and the common code goes into the ACPI core (so as to reduce
code duplication and eliminate non-essential differences between the
two architectures in that area).
Second, the Operating Performance Points (OPP) framework is
reorganized to make the code easier to find and follow.
Next, the cpufreq core's sysfs interface is reorganized to get rid of
the "primary CPU" concept for configurations in which the same
performance scaling settings are shared between multiple CPUs.
Finally, some interfaces that aren't necessary any more are dropped
from the generic power domains framework.
On top of the above we have some minor extensions, cleanups and bug
fixes in multiple places, as usual.
Specifics:
- ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng).
The most significant change is to allow the AML debugger to be
built into the kernel. On top of that there is an update related
to the NFIT table (the ACPI persistent memory interface) and a few
fixes and cleanups.
- ACPI CPPC2 (Collaborative Processor Performance Control v2) support
along with a cpufreq frontend (Ashwin Chaugule).
This can only be enabled on ARM64 at this point.
- New ACPI infrastructure for the early probing of IRQ chips and
clock sources (Marc Zyngier).
- Support for a new hierarchical properties extension of the ACPI
_DSD (Device Specific Data) device configuration object allowing
the kernel to handle hierarchical properties (provided by the
platform firmware this way) automatically and make them available
to device drivers via the generic device properties interface
(Rafael Wysocki).
- Generic device properties API extension to obtain an index of
certain string value in an array of strings, along the lines of
of_property_match_string(), but working for all of the supported
firmware node types, and support for the "dma-names" device
property based on it (Mika Westerberg).
- ACPI core fix to parse the MADT (Multiple APIC Description Table)
entries in the order expected by platform firmware (and mandated by
the specification) to avoid confusion on systems with more than 255
logical CPUs (Lukasz Anaczkowski).
- Consolidation of the ACPI-based handling of PCI host bridges on x86
and ia64 (Jiang Liu).
- ACPI core fixes to ensure that the correct IRQ number is used to
represent the SCI (System Control Interrupt) in the cases when it
has been re-mapped (Chen Yu).
- New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede).
- ACPI EC driver fixes (Lv Zheng).
- Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri
Kosina, Rami Rosen, Rasmus Villemoes).
- New mechanism in the PM core allowing drivers to check if the
platform firmware is going to be involved in the upcoming system
suspend or if it has been involved in the suspend the system is
resuming from at the moment (Rafael Wysocki).
This should allow drivers to optimize their suspend/resume handling
in some cases and the changes include a couple of users of it (the
i8042 input driver, PCI PM).
- PCI PM fix to prevent runtime-suspended devices with PME enabled
from being resumed during system suspend even if they aren't
configured to wake up the system from sleep (Rafael Wysocki).
- New mechanism to report the number of a wakeup IRQ that woke up the
system from sleep last time (Alexandra Yates).
- Removal of unused interfaces from the generic power domains
framework and fixes related to latency measurements in that code
(Ulf Hansson, Daniel Lezcano).
- cpufreq core sysfs interface rework to make it handle CPUs that
share performance scaling settings (represented by a common cpufreq
policy object) more symmetrically (Viresh Kumar).
This should help to simplify the CPU offline/online handling among
other things.
- cpufreq core fixes and cleanups (Viresh Kumar).
- intel_pstate fixes related to the Turbo Activation Ratio (TAR)
mechanism on client platforms which causes the turbo P-states range
to vary depending on platform firmware settings (Srinivas
Pandruvada).
- intel_pstate sysfs interface fix (Prarit Bhargava).
- Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes
and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G
Bhat, Luis de Bethencourt).
- cpuidle mvebu driver cleanups (Russell King).
- OPP (Operating Performance Points) framework code reorganization to
make it more maintainable (Viresh Kumar).
- Intel Broxton support for the RAPL (Running Average Power Limits)
power capping driver (Amy Wiles).
- Assorted power management code fixes and cleanups (Dan Carpenter,
Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus
Villemoes)"
* tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (108 commits)
cpufreq: postfix policy directory with the first CPU in related_cpus
cpufreq: create cpu/cpufreq/policyX directories
cpufreq: remove cpufreq_sysfs_{create|remove}_file()
cpufreq: create cpu/cpufreq at boot time
cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask
cpufreq: ondemand: Drop unnecessary locks from update_sampling_rate()
PM / Domains: Merge measurements for PM QoS device latencies
PM / Domains: Don't measure ->start|stop() latency in system PM callbacks
PM / clk: Fix broken build due to non-matching code and header #ifdefs
ACPI / Documentation: add copy_dsdt to ACPI format options
ACPI / sysfs: correctly check failing memory allocation
ACPI / video: Add a quirk to force native backlight on Lenovo IdeaPad S405
ACPI / CPPC: Fix potential memory leak
ACPI / CPPC: signedness bug in register_pcc_channel()
ACPI / PAD: power_saving_thread() is not freezable
ACPI / PM: Fix incorrect wakeup IRQ setting during suspend-to-idle
ACPI: Using correct irq when waiting for events
ACPI: Use correct IRQ when uninstalling ACPI interrupt handler
cpuidle: mvebu: disable the bind/unbind attributes and use builtin_platform_driver
cpuidle: mvebu: clean up multiple platform drivers
...
- "genirq: Introduce generic irq migration for cpu hotunplugged" patch
merged from tip/irq/for-arm to allow the arm64-specific part to be
upstreamed via the arm64 tree
- CPU feature detection reworked to cope with heterogeneous systems
where CPUs may not have exactly the same features. The features
reported by the kernel via internal data structures or ELF_HWCAP are
delayed until all the CPUs are up (and before user space starts)
- Support for 16KB pages, with the additional bonus of a 36-bit VA
space, though the latter only depending on EXPERT
- Implement native {relaxed, acquire, release} atomics for arm64
- New ASID allocation algorithm which avoids IPI on roll-over, together
with TLB invalidation optimisations (using local vs global where
feasible)
- KASan support for arm64
- EFI_STUB clean-up and isolation for the kernel proper (required by
KASan)
- copy_{to,from,in}_user optimisations (sharing the memcpy template)
- perf: moving arm64 to the arm32/64 shared PMU framework
- L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware
- Support for the contiguous PTE hint on kernel mapping (16 consecutive
entries may be able to use a single TLB entry)
- Generic CONFIG_HZ now used on arm64
- defconfig updates
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWOkmIAAoJEGvWsS0AyF7x4GgQAINU3NePjFFvWZNCkqobeH9+
jFKwtXamIudhTSdnXNXyYWmtRL9Krg3qI4zDQf68dvDFAZAze2kVuOi1yPpCbpFZ
/j/afNyQc7+PoyqRAzmT+EMPZlcuOA84Prrl1r3QWZ58QaFeVk/6ZxrHunTHxN0x
mR9PIXfWx73MTo+UnG8FChkmEY6LmV4XpemgTaMR9FqFhdT51OZSxDDAYXOTm4JW
a5HdN9OWjjJ2rhLlFEaC7tszG9B5doHdy2tr5ge/YERVJzIPDogHkMe8ZhfAJc+x
SQU5tKN6Pg4MOi+dLhxlk0/mKCvHLiEQ5KVREJnt8GxupAR54Bat+DQ+rP9cSnpq
dRQTcARIOyy9LGgy+ROAsSo+NiyM5WuJ0/WJUYKmgWTJOfczRYoZv6TMKlwNOUYb
tGLCZHhKPM3yBHJlWbQykl3xmSuudxCMmjlZzg7B+MVfTP6uo0CRSPmYl+v67q+J
bBw/Z2RYXWYGnvlc6OfbMeImI6prXeE36+5ytyJFga0m+IqcTzRGzjcLxKEvdbiU
pr8n9i+hV9iSsT/UwukXZ8ay6zH7PrTLzILWQlieutfXlvha7MYeGxnkbLmdYcfe
GCj374io5cdImHcVKmfhnOMlFOLuOHphl9cmsd/O2LmCIqBj9BIeNH2Om8mHVK2F
YHczMdpESlJApE7kUc1e
=3six
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- "genirq: Introduce generic irq migration for cpu hotunplugged" patch
merged from tip/irq/for-arm to allow the arm64-specific part to be
upstreamed via the arm64 tree
- CPU feature detection reworked to cope with heterogeneous systems
where CPUs may not have exactly the same features. The features
reported by the kernel via internal data structures or ELF_HWCAP are
delayed until all the CPUs are up (and before user space starts)
- Support for 16KB pages, with the additional bonus of a 36-bit VA
space, though the latter only depending on EXPERT
- Implement native {relaxed, acquire, release} atomics for arm64
- New ASID allocation algorithm which avoids IPI on roll-over, together
with TLB invalidation optimisations (using local vs global where
feasible)
- KASan support for arm64
- EFI_STUB clean-up and isolation for the kernel proper (required by
KASan)
- copy_{to,from,in}_user optimisations (sharing the memcpy template)
- perf: moving arm64 to the arm32/64 shared PMU framework
- L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware
- Support for the contiguous PTE hint on kernel mapping (16 consecutive
entries may be able to use a single TLB entry)
- Generic CONFIG_HZ now used on arm64
- defconfig updates
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (91 commits)
arm64/efi: fix libstub build under CONFIG_MODVERSIONS
ARM64: Enable multi-core scheduler support by default
arm64/efi: move arm64 specific stub C code to libstub
arm64: page-align sections for DEBUG_RODATA
arm64: Fix build with CONFIG_ZONE_DMA=n
arm64: Fix compat register mappings
arm64: Increase the max granular size
arm64: remove bogus TASK_SIZE_64 check
arm64: make Timer Interrupt Frequency selectable
arm64/mm: use PAGE_ALIGNED instead of IS_ALIGNED
arm64: cachetype: fix definitions of ICACHEF_* flags
arm64: cpufeature: declare enable_cpu_capabilities as static
genirq: Make the cpuhotplug migration code less noisy
arm64: Constify hwcap name string arrays
arm64/kvm: Make use of the system wide safe values
arm64/debug: Make use of the system wide safe value
arm64: Move FP/ASIMD hwcap handling to common code
arm64/HWCAP: Use system wide safe values
arm64/capabilities: Make use of system wide safe value
arm64: Delay cpu feature capability checks
...
Pull networking updates from David Miller:
Changes of note:
1) Allow to schedule ICMP packets in IPVS, from Alex Gartrell.
2) Provide FIB table ID in ipv4 route dumps just as ipv6 does, from
David Ahern.
3) Allow the user to ask for the statistics to be filtered out of
ipv4/ipv6 address netlink dumps. From Sowmini Varadhan.
4) More work to pass the network namespace context around deep into
various packet path APIs, starting with the netfilter hooks. From
Eric W Biederman.
5) Add layer 2 TX/RX checksum offloading to qeth driver, from Thomas
Richter.
6) Use usec resolution for SYN/ACK RTTs in TCP, from Yuchung Cheng.
7) Support Very High Throughput in wireless MESH code, from Bob
Copeland.
8) Allow setting the ageing_time in switchdev/rocker. From Scott
Feldman.
9) Properly autoload L2TP type modules, from Stephen Hemminger.
10) Fix and enable offload features by default in 8139cp driver, from
David Woodhouse.
11) Support both ipv4 and ipv6 sockets in a single vxlan device, from
Jiri Benc.
12) Fix CWND limiting of thin streams in TCP, from Bendik Rønning
Opstad.
13) Fix IPSEC flowcache overflows on large systems, from Steffen
Klassert.
14) Convert bridging to track VLANs using rhashtable entries rather than
a bitmap. From Nikolay Aleksandrov.
15) Make TCP listener handling completely lockless, this is a major
accomplishment. Incoming request sockets now live in the
established hash table just like any other socket too.
From Eric Dumazet.
15) Provide more bridging attributes to netlink, from Nikolay
Aleksandrov.
16) Use hash based algorithm for ipv4 multipath routing, this was very
long overdue. From Peter Nørlund.
17) Several y2038 cures, mostly avoiding timespec. From Arnd Bergmann.
18) Allow non-root execution of EBPF programs, from Alexei Starovoitov.
19) Support SO_INCOMING_CPU as setsockopt, from Eric Dumazet. This
influences the port binding selection logic used by SO_REUSEPORT.
20) Add ipv6 support to VRF, from David Ahern.
21) Add support for Mellanox Spectrum switch ASIC, from Jiri Pirko.
22) Add rtl8xxxu Realtek wireless driver, from Jes Sorensen.
23) Implement RACK loss recovery in TCP, from Yuchung Cheng.
24) Support multipath routes in MPLS, from Roopa Prabhu.
25) Fix POLLOUT notification for listening sockets in AF_UNIX, from Eric
Dumazet.
26) Add new QED Qlogic river, from Yuval Mintz, Manish Chopra, and
Sudarsana Kalluru.
27) Don't fetch timestamps on AF_UNIX sockets, from Hannes Frederic
Sowa.
28) Support ipv6 geneve tunnels, from John W Linville.
29) Add flood control support to switchdev layer, from Ido Schimmel.
30) Fix CHECKSUM_PARTIAL handling of potentially fragmented frames, from
Hannes Frederic Sowa.
31) Support persistent maps and progs in bpf, from Daniel Borkmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1790 commits)
sh_eth: use DMA barriers
switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion
net: sched: kill dead code in sch_choke.c
irda: Delete an unnecessary check before the function call "irlmp_unregister_service"
net: dsa: mv88e6xxx: include DSA ports in VLANs
net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports
net/core: fix for_each_netdev_feature
vlan: Invoke driver vlan hooks only if device is present
arcnet/com20020: add LEDS_CLASS dependency
bpf, verifier: annotate verbose printer with __printf
dp83640: Only wait for timestamps for packets with timestamping enabled.
ptp: Change ptp_class to a proper bitmask
dp83640: Prune rx timestamp list before reading from it
dp83640: Delay scheduled work.
dp83640: Include hash in timestamp/packet matching
ipv6: fix tunnel error handling
net/mlx5e: Fix LSO vlan insertion
net/mlx5e: Re-eanble client vlan TX acceleration
net/mlx5e: Return error in case mlx5e_set_features() fails
net/mlx5e: Don't allow more than max supported channels
...
Pull irq updates from Thomas Gleixner:
"The irq departement delivers:
- Rework the irqdomain core infrastructure to accomodate ACPI based
systems. This is required to support ARM64 without creating
artificial device tree nodes.
- Sanitize the ACPI based ARM GIC initialization by making use of the
new firmware independent irqdomain core
- Further improvements to the generic MSI management
- Generalize the irq migration on CPU hotplug
- Improvements to the threaded interrupt infrastructure
- Allow the migration of "chained" low level interrupt handlers
- Allow optional force masking of interrupts in disable_irq[_nosysnc]
- Support for two new interrupt chips - Sigh!
- A larger set of errata fixes for ARM gicv3
- The usual pile of fixes, updates, improvements and cleanups all
over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
Document that IRQ_NONE should be returned when IRQ not actually handled
PCI/MSI: Allow the MSI domain to be device-specific
PCI: Add per-device MSI domain hook
of/irq: Use the msi-map property to provide device-specific MSI domain
of/irq: Split of_msi_map_rid to reuse msi-map lookup
irqchip/gic-v3-its: Parse new version of msi-parent property
PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
of/irq: Add support code for multi-parent version of "msi-parent"
irqchip/gic-v3-its: Add handling of PCI requester id.
PCI/MSI: Add helper function pci_msi_domain_get_msi_rid().
of/irq: Add new function of_msi_map_rid()
Docs: dt: Add PCI MSI map bindings
irqchip/gic-v2m: Add support for multiple MSI frames
irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec
irqchip/mxs: Add Alphascale ASM9260 support
irqchip/mxs: Prepare driver for hardware with different offsets
irqchip/mxs: Panic if ioremap or domain creation fails
irqdomain: Documentation updates
irqdomain/msi: Use fwnode instead of of_node
...
* pm-sleep:
PM / hibernate: fix a comment typo
input: i8042: Avoid resetting controller on system suspend/resume
PM / PCI / ACPI: Kick devices that might have been reset by firmware
PM / sleep: Add flags to indicate platform firmware involvement
PM / sleep: Drop pm_request_idle() from pm_generic_complete()
PCI / PM: Avoid resuming more devices during system suspend
PM / wakeup: wakeup_source_create: use kstrdup_const
PM / sleep: Report interrupt that caused system wakeup
This is an incremental fix for a patch previously pulled from tip
irq/for-arm.
* 'irq/for-arm' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make the cpuhotplug migration code less noisy
The original arm code has a pr_debug() statement for the case where
the irq chip has no set_affinity() callback. That's sufficient for
debugging and we really don't want to spam dmesg with useless warnings
for the normal case.
Fixes: f1e0bb0ad4: "genirq: Introduce generic irq migration for cpu hotunplug"
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Requested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Conflicts:
drivers/net/usb/asix_common.c
net/ipv4/inet_connection_sock.c
net/switchdev/switchdev.c
In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.
The other two conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
When we create a generic MSI domain, that MSI_FLAG_USE_DEF_CHIP_OPS
is set, and that any of .mask or .unmask are NULL in the irq_chip
structure, we set them to pci_msi_[un]mask_irq.
This is a bad idea for at least two reasons:
- PCI_MSI might not be selected, kernel fails to build (yes, this is
legitimate, at least on arm64!)
- This may not be a PCI/MSI domain at all (platform MSI, for example)
Either way, this looks wrong. Move the overriding of mask/unmask to
the PCI counterpart, and panic is any of these two methods is not
set in the core code (they really should be present).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1444760085-27857-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As we continue to push of_node towards the outskirts of irq domains,
let's start tackling the case of msi_create_irq_domain and its little
friends.
This has limited impact in both PCI/MSI, platform MSI, and a few
drivers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-17-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As we're about to start converting the various MSI layers to
use fwnode_handle instead of device_node, add irq_domain_create_hierarchy
as a directly equivalent of irq_domain_add_hierarchy (which still
exists as a compatibility interface).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-16-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In order to be able to reference an irqdomain from ACPI, we need
to be able to create an identifier, which is usually a struct
device_node.
This device node does't really fit the ACPI infrastructure, so
we cunningly allocate a new structure containing a fwnode_handle,
and return that.
This structure doesn't really point to a device (interrupt
controllers are not "real" devices in Linux), but as we cannot
really deny that they exist, we create them with a new fwnode_type
(FWNODE_IRQCHIP).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-9-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Just like we have irq_domain_add_{linear,tree} to create a irq domain
identified by an of_node, introduce irq_domain_create_{linear,tree}
that do the same thing, except that they take a struct fwnode_handle.
Existing functions get rewritten in terms of the new ones so that
everything keeps working as before (and __irq_domain_add is now
fwnode_handle based as well).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-8-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Just like we have irq_create_of_mapping, irq_create_fwspec_mapping
creates a IRQ domain mapping for an interrupt described in a
struct irq_fwspec.
irq_create_of_mapping gets rewritten in terms of the new function,
and the hack we introduced before gets removed (now that no stacked
irqchip uses of_phandle_args anymore).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-7-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
So far the closest thing to a generic IRQ specifier structure is
of_phandle_args, which happens to be pretty OF specific (the of_node
pointer in there is quite annoying).
Let's introduce 'struct irq_fwspec' that can be used in place of
of_phandle_args for OF, but also for other firmware implementations
(that'd be ACPI). This is used together with a new 'translate' method
that is the pendent of 'xlate'.
We convert irq_create_of_mapping to use this new structure (with a
small hack that will be removed later).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-5-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
So far, our irq domains are still looked up by device node.
Let's change this and allow a domain to be looked up using
a fwnode_handle pointer.
The existing interfaces are preserved with a couple of helpers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-4-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Now that we have everyone accessing the of_node field via the
irq_domain_get_of_node accessor, it is pretty easy to swap it
for a pointer to a fwnode_handle.
This translates into a few limited changes in __irq_domain_add,
and an updated irq_domain_get_of_node.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The struct irq_domain contains a "struct device_node *" field
(of_node) that is almost the only link between the irqdomain
and the device tree infrastructure.
In order to prepare for the removal of that field, convert all
users to use irq_domain_get_of_node() instead.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If an irq chip does not implement the irq_disable callback, then we
use a lazy approach for disabling the interrupt. That means that the
interrupt is marked disabled, but the interrupt line is not
immediately masked in the interrupt chip. It only becomes masked if
the interrupt is raised while it's marked disabled. We use this to avoid
possibly expensive mask/unmask operations for common case operations.
Unfortunately there are devices which do not allow the interrupt to be
disabled easily at the device level. They are forced to use
disable_irq_nosync(). This can result in taking each interrupt twice.
Instead of enforcing the non lazy mode on all interrupts of a irq
chip, provide a settings flag, which can be set by the driver for that
particular interrupt line.
Reported-and-tested-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1510092348370.6097@nanos
When a CPU is offlined all interrupts that have an action are migrated to
other still online CPUs. However, if the interrupt has chained handler
installed this is not done. Chained handlers are used by GPIO drivers which
support interrupts, for instance.
When the affinity is not corrected properly we end up in situation where
most interrupts are not arriving to the online CPUs anymore. For example on
Intel Braswell system which has SD-card card detection signal connected to
a GPIO the IO-APIC routing entries look like below after CPU1 is offlined:
pin30, enabled , level, low , V(52), IRR(0), S(0), logical , D(03), M(1)
pin31, enabled , level, low , V(42), IRR(0), S(0), logical , D(03), M(1)
pin32, enabled , level, low , V(62), IRR(0), S(0), logical , D(03), M(1)
pin5b, enabled , level, low , V(72), IRR(0), S(0), logical , D(03), M(1)
The problem here is that the destination mask still contains both CPUs even
if CPU1 is already offline. This means that the IO-APIC still routes
interrupts to the other CPU as well.
We solve the problem by providing a default action for chained interrupts.
This action allows the migration code to correct affinity (as it finds
desc->action != NULL).
Also make the default action handler to emit a warning if for some reason a
chained handler ends up calling it.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1444039935-30475-1-git-send-email-mika.westerberg@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
A recent cleanup removed the 'irq' parameter from many functions, but
left the documentation for this in place for at least one function.
This removes it.
Fixes: bd0b9ac405 ("genirq: Remove irq argument from irq flow handlers")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: kbuild-all@01.org
Cc: Austin Schuh <austin@peloton-tech.com>
Cc: Santosh Shilimkar <ssantosh@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/5400000.cD19rmgWjV@wuerfel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
A cleanup of the omap gpio driver introduced a use of the
handle_bad_irq() function in a device driver that can be
a loadable module.
This broke the ARM allmodconfig build:
ERROR: "handle_bad_irq" [drivers/gpio/gpio-omap.ko] undefined!
This patch exports the handle_bad_irq symbol in order to
allow the use in modules.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Santosh Shilimkar <ssantosh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Austin Schuh <austin@peloton-tech.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/5847725.4IBopItaOr@wuerfel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ARM and ARM64 have almost identical code for migrating interrupts on
cpu hotunplug. Provide a generic version which can be used by both.
The new code addresses a shortcoming in the ARM[64] variants which
fails to update the affinity change in some cases. The solution for
this is to use the core function irq_do_set_affinity() instead of open
coding it.
[ tglx: Added copyright notice and license boilerplate. Rewrote
subject and changelog. ]
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: <linux-arm-kernel@lists.infradead.org>
Link: http://lkml.kernel.org/r/1443087135-17044-2-git-send-email-yangyingliang@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Per-IRQ directories in procfs are created only when a handler is first
added to the irqdesc, not when the irqdesc is created. In the case of
a shared IRQ, multiple tasks can race to create a directory. This
race condition seems to have been present forever, but is easier to
hit with async probing.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Some drivers might use the per-cpu interrupts and still might be built as a
module. Export request_percpu_irq an free_percpu_irq to these user, which
also make it consistent with enable/disable_percpu_irq that were exported.
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation of request_percpu_irq is confusing and suggest that the
interrupt is not enabled at all, while it is actually enabled on the local
CPU.
Clarify that.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Force threading of interrupts does not really deal with interrupts
which are requested with a primary and a threaded handler. The current
policy is to leave them alone and let the primary handler run in
interrupt context, but we set the ONESHOT flag for those interrupts as
well.
Kohji Okuno debugged a problem with the SDHCI driver where the
interrupt thread waits for a hardware interrupt to trigger, which can't
work well because the hardware interrupt is masked due to the ONESHOT
flag being set. He proposed to set the ONESHOT flag only if the
interrupt does not provide a thread handler.
Though that does not work either because these interrupts can be
shared. So the other interrupt would rightfully get the ONESHOT flag
set and therefor the same situation would happen again.
To deal with this proper, we need to force thread the primary handler
of such interrupts as well. That means that the primary interrupt
handler is treated as any other primary interrupt handler which is not
marked IRQF_NO_THREAD. The threaded handler becomes a separate thread
so the SDHCI flow logic can be handled gracefully.
The same issue was reported against 4.1-rt.
Reported-and-tested-by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
Reported-By: Michal Smucr <msmucr@gmail.com>
Reported-and-tested-by: Nathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1509211058080.5606@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.
Remove the argument.
Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
MSI descriptors are per-irq instead of per irqchip, so move it into
struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-35-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Irq affinity mask is per-irq instead of per irqchip, so move it into
struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433303281-27688-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Handler data (handler_data) is per-irq instead of per irqchip, so move
it into struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-13-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
NUMA node information is per-irq instead of per-irqchip, so move it into
struct irq_common_data. Also use CONFIG_NUMA to guard irq_common_data.node.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433145945-789-8-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a sysfs attribute, /sys/power/pm_wakeup_irq, reporting the IRQ
number of the first wakeup interrupt (that is, the first interrupt
from an IRQ line armed for system wakeup) seen by the kernel during
the most recent system suspend/resume cycle.
This feature will be useful for system wakeup diagnostics of
spurious wakeup interrupts.
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
[ rjw: Fixed up pm_wakeup_irq definition ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq departement provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
This helper is required for irq chips which do not implement a
irq_set_type callback and need to call down the irq domain hierarchy
for the actual trigger type change.
This helper is required to fix further wreckage caused by the
conversion of TI OMAP to hierarchical irq domains and therefor tagged
for stable.
[ tglx: Massaged changelog ]
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-3-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_chip_retrigger_hierarchy() returns -ENOSYS if it was not able to
find at least one .irq_retrigger() callback implemented in the IRQ
domain hierarchy.
That's wrong, because check_irq_resend() expects a 0 return value from
the callback in case that the hardware assisted resend was not
possible. If the return value is non zero the core code assumes
hardware resend success and the software resend is not invoked.
This results in lost interrupts on platforms where none of the parent
irq chips in the hierarchy implements the retrigger callback.
This is observable on TI OMAP, where the hierarchy is:
ARM GIC <- OMAP wakeupgen <- TI Crossbar
Return 0 instead so the software resend mechanism gets invoked.
[ tglx: Massaged changelog ]
Fixes: 85f08c17de ('genirq: Introduce helper functions...')
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-2-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It is not uncommon (at least with the ARM stuff) to have a piece
of hardware that implements different flavours of "interrupts".
A typical example of this is the GICv3 ITS, which implements
standard PCI/MSI support, but also some form of "generic MSI".
So far, the PCI/MSI domain is registered using the ITS device_node,
so that irq_find_host can return it. On the contrary, the raw MSI
domain is not registered with an device_node, making it impossible
to be looked up by another subsystem (obviously, using the same
device_node twice would only result in confusion, as it is not
defined which one irq_find_host would return).
A solution to this is to "type" domains that may be aliasing, and
to be able to lookup an device_node that matches a given type.
For this, we introduce irq_find_matching_host() as a superset
of irq_find_host:
struct irq_domain *irq_find_matching_host(struct device_node *node,
enum irq_domain_bus_token bus_token);
where bus_token is the "type" we want to match the domain against
(so far, only DOMAIN_BUS_ANY is defined). This result in some
moderately invasive changes on the PPC side (which is the only
user of the .match method).
This has otherwise no functionnal change.
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Ma Jun <majun258@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Duc Dang <dhdang@apm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1438091186-10244-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The following warning is emitted for make xmldocs:
Warning(.//kernel/irq/chip.c:1009): No description found for parameter 'vcpu_info'
Warning(.//kernel/irq/chip.c:1009): Excess function parameter 'dest' description in 'irq_chip_set_vcpu_affinity_parent'
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Link: http://lkml.kernel.org/r/1438164576-5945-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some (admittedly odd) irqchips perform functions that are not directly
related to any of their child IRQ lines, and therefore need to perform
some tasks during suspend/resume regardless of whether there are
any "installed" interrupts for the irqchip. However, the current
generic-chip framework does not call the chip's irq_{suspend,resume}
when there are no interrupts installed (this makes sense, because there
are no irq_data objects for such a call to be made).
More specifically, irq-bcm7120-l2 configures both a forwarding mask
(which affects other top-level GIC IRQs) and a second-level interrupt
mask (for managing its own child interrupts). The former must be
saved/restored on suspend/resume, even when there's nothing to do for
the latter.
This patch adds a new set of suspend/resume hooks to irq_chip_generic,
to help represent *chip* suspend/resume, rather than IRQ suspend/resume.
These callbacks will always be called for an IRQ chip (regardless of the
installed interrupts) and are based on the per-chip irq_chip_generic
struct, rather than the per-IRQ irq_data struct.
The original problem report is described in extra detail here:
http://lkml.kernel.org/g/20150619224123.GL4917@ld-irv-0074
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: Gregory Fong <gregory.0xf0@gmail.com>
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: linux-mips@linux-mips.org
Cc: Kevin Cernekee <cernekee@chromium.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1437607300-40858-1-git-send-email-computersforpeace@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export these functions to be able to build the Qualcomm family A PMIC
gpio and mpp drivers as modules.
[ tglx: Made them GPL exports ]
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <kernel-build-reports@lists.linaro.org>
Cc: <linaro-kernel@lists.linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/1437594184-22966-1-git-send-email-bjorn.andersson@sonymobile.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move alloc_msi_entry() from PCI MSI code into generic MSI code, so it
can be reused by other generic MSI drivers. Also introduce
free_msi_entry() for completeness.
Suggested-by: Stuart Yoder <stuart.yoder@freescale.com>.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/1436428847-8886-13-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The resend mechanism happily calls the interrupt handler of interrupts
which are marked IRQ_NESTED_THREAD from softirq context. This can
result in crashes because the interrupt handler is not the proper way
to invoke the device handlers. They must be invoked via
handle_nested_irq.
Prevent the resend even if the interrupt has no valid parent irq
set. Its better to have a lost interrupt than a crashing machine.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Unused except for the alpha wrapper, which can retrieve if from the
irq descriptor.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1433391238-19471-21-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Provide a irq_desc based variant of irq_can_set_affinity() to avoid a
redundant lookup for the core code users.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Only required for the slow path. Retrieve it from irq descriptor if
necessary.
[ tglx: Split out from combo patch. Left [try_]misrouted_irq()
untouched as there is no win in the slow path ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433391238-19471-19-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Unused argument.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not really a hotpath, so __report_bad_irq() can retrieve the irq
number from the irq descriptor.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Unused argument in both functions.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Solely used for debug output. Can be retrieved from irq descriptor if
necessary.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's only required for debug output and can be retrieved from the irq
descriptor if necessary.
[ tglx: Split out from combo patch ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's only used in the software resend case and can be retrieved from
irq_desc if necessary.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1433391238-19471-18-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The first parameter 'irq' is never used by
kstat_incr_irqs_this_cpu(). Remove it.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1433391238-19471-16-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When a cpu goes up some architectures (e.g. x86) have to walk the irq
space to set up the vector space for the cpu. While this needs extra
protection at the architecture level we can avoid a few race
conditions by preventing the concurrent allocation/free of irq
descriptors and the associated data.
When a cpu goes down it moves the interrupts which are targeted to
this cpu away by reassigning the affinities. While this happens
interrupts can be allocated and freed, which opens a can of race
conditions in the code which reassignes the affinities because
interrupt descriptors might be freed underneath.
Example:
CPU1 CPU2
cpu_up/down
irq_desc = irq_to_desc(irq);
remove_from_radix_tree(desc);
raw_spin_lock(&desc->lock);
free(desc);
We could protect the irq descriptors with RCU, but that would require
a full tree change of all accesses to interrupt descriptors. But
fortunately these kind of race conditions are rather limited to a few
things like cpu hotplug. The normal setup/teardown is very well
serialized. So the simpler and obvious solution is:
Prevent allocation and freeing of interrupt descriptors accross cpu
hotplug.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
If an interrupt is marked with the no balancing flag, we still allow
setting the affinity for such an interrupt from the kernel itself, but
for interrupts which move the affinity from interrupt context via
irq_move_mask_irq() this runs into a check for the no balancing flag,
which in turn ends up with an endless storm of stack dumps because the
move pending flag is not reset.
Allow the move for interrupts which have the no balancing flag set and
clear the move pending bit before checking for interrupts with the per
cpu flag set.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1506201002570.4107@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Driver authors seem to get the ordering of irq_set_chained_handler()
and irq_set_handler_data() wrong - ordering the former before the
latter. This opens a race window where, if there is an interrupt
pending, the handler will be called between these two calls,
potentially resulting in an oops.
Provide a single interface to set both of these together, especially
as that's commonly what is required.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/E1Z4yzs-0002Rw-4B@rmk-PC.arm.linux.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Introduce helper function irq_data_get_node() and variants thereof to
hide struct irq_data implementation details.
Convert the core code to use them.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433145945-789-5-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With the introduction of hierarchy irqdomain, struct irq_data becomes
per-chip instead of per-irq and there may be multiple irq_datas
associated with the same irq. Some per-irq data stored in struct
irq_data now may get duplicated into multiple irq_datas, and causes
inconsistent view.
So introduce struct irq_common_data to host per-irq common data and to
achieve consistent view among irq_chips.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-4-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The functions irq_move_irq() and irq_move_masked_irq() expect that the
caller passes the top-level irq_data to them when hierarchical
irqdomains are enabled. But that's not true when called from
apic_ack_edge(), which results in a null pointer dereference by
idata->chip->irq_mask(idata).
Instead of fixing callers to passing top-level irq_data, we rather
change irq_move_irq()/irq_move_masked_irq() to accept any irq_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1433145945-789-3-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For irq associated with hierarchy irqdomains, there will be multiple
irq_datas for one irq_desc. So enhance irq_data_to_desc() to support
hierarchy irqdomain. Also export irq_data_to_desc() as an inline
function for later reuse.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-2-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If no_irq_chip is used for wake up (e.g. gpio-keys with a simple GPIO
controller), the following warning is printed on resume from s2ram:
WANING: CPU: 0 PID: 1046 at kernel/irq/manage.c:537 irq_set_irq_wake+0x9c/0xf8()
Unbalanced IRQ 113 wake disable
This happens because no_irq_chip does not implement
irq_chip.irq_set_wake(), causing set_irq_wake_real() to return -ENXIO,
and irq_set_irq_wake() to reset the wake_depth to zero.
Set IRQCHIP_SKIP_SET_WAKE to indicate that irq_chip.irq_set_wake() is
not implemented.
Cfr. commit 10a50f1ab5 ("genirq: Set IRQCHIP_SKIP_SET_WAKE flag
for dummy_irq_chip").
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Roger Quadros <rogerq@ti.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Link: http://lkml.kernel.org/r/1432281529-23325-1-git-send-email-geert%2Brenesas@glider.be
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With Posted-Interrupts support in Intel CPU and IOMMU, an external
interrupt from assigned-devices could be directly delivered to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.
By adopting this new interface and the hierarchy irqdomain, we could
easily support posted-interrupts on Intel platforms, and also provide
flexible enough interfaces for other platforms to support similar
features.
Here is the usage scenario for this interface:
Guest update MSI/MSI-X interrupt configuration
-->QEMU and KVM handle this
-->KVM call this interface (passing posted interrupts descriptor
and guest vector)
-->irq core will transfer the control to IOMMU
-->IOMMU will do the real work of updating IRTE (IRTE has new
format for VT-d Posted-Interrupts)
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Link: http://lkml.kernel.org/r/1432026437-16560-2-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Nested IRQs can only fire when the parent irq fires. So when the
parent is suspended, there is no need to suspend the child irq.
Suspending nested irqs can cause a problem is they are suspended or
resumed in the wrong order. If an interrupt fires while the parent is
active but the child is suspended, then the interrupt will not be
acknowledged properly and so an interrupt storm can result. This is
particularly likely if the parent is resumed before the child, and the
interrupt was raised during suspend.
Ensuring correct ordering would be possible, but it is simpler to just
never suspend nested interrupts.
Signed-off-by: NeilBrown <neil@brown.name>
Cc: GTA04 owners <gta04-owner@goldelico.com>
Cc: Kalle Jokiniemi <kalle.jokiniemi@jollamobile.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/20150517151934.2393e8f8@notabene.brown
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
request_any_context_irq() returns a negative value on failure.
It returns either IRQC_IS_HARDIRQ or IRQC_IS_NESTED on success.
So fix testing return value of request_any_context_irq().
Also fixup the return value of devm_request_any_context_irq() to make it
consistent with request_any_context_irq().
Fixes: 0668d30651 ("genirq: Add devm_request_any_context_irq()")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1431334978.17783.4.camel@ingics.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The return type of kstat_irqs_usr() is unsigned int and kstat_irqs() also
returns unsigned int so sum should be unsigned int here as well.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Link: http://lkml.kernel.org/r/1430642951-23964-1-git-send-email-hofrat@osadl.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
kstat_irqs is unsigned int and the return type of kstat_irqs() is also
unsigned int so sum should be unsigned int as well even if the result
is correct due to automatic type conversion.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Link: http://lkml.kernel.org/r/1430642930-23929-1-git-send-email-hofrat@osadl.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Without this system suspend is broken on systems that have
drivers calling enable/disable_irq_wake() for interrupts based off
the dummy irq hook. (e.g. drivers/gpio/gpio-pcf857x.c)
Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: <cw00.choi@samsung.com>
Cc: <balbi@ti.com>
Cc: <tony@atomide.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Link: http://lkml.kernel.org/r/552E1DD3.4040106@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Purge the gic_arch_extn hacks and abuse by using the new stacked domains
NOTE: Due to the nature of these changes, patches crossing subsystems have
been kept together in their own branches.
- tegra
- Handle the LIC properly
- omap
- Convert crossbar to stacked domains
- kill arm,routable-irqs in GIC binding
- exynos
- Convert PMU wakeup to stacked domains
- shmobile, ux500, zynq (irq_set_wake branch)
- Switch from abusing gic_arch_extn to using gic_set_irqchip_flags
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJVKFhRAAoJEP45WPkGe8ZnYFcP/iBznjkMYG+OUwrxo7G4rTyu
JYj0dmg/D76ewFsxWFv24II9V+KJaqrEtFTHH4MVbeEbbrDIx7Am0i/Ip6rDRgxS
7Q/jGic8etfPGV8gW6x38zbTHOl1rfqQtoHcqBH5FnLITuMAuHPa51jpwhMik4ri
AbMwb6Whep6tEsxiEjspPxXWphEZoXluOkRjPLokTwuifo4rEo7bqU8WMizzSW5g
xEjf8eUvBYIMTA40FBQWHQwxf1jRySSW2A9u5JgT1ccZHoajEyDgQr22KUHpCAWU
hlZ/8uTqCUeecDQKFPr4zXhq9mbEVZ7lld5Gl82cxY6aI3Xj/bUI3tSYubPWEgx6
0VhbmvjqKPiFfdCrLq5ZTY5UHmW8khdttdycIPNz9LmUDVgIzJpmpAW+oyG7BN/N
QgGF4lzaN49mHQmjtXGfwY3iJTadxyVaWoZTBinjw8LyxpzUO/MNQGLumsxEtkxN
Nbbsc2k+ERpSx40ospB1WOslAzMsNi6eLwqLRfjGGfSYK1P6Mm7FhansJm08p1/D
8h6ymqA4heZrYdI1vrfuy7QuEqQgnVUf0TDTHxX+aNGrHnBSsPTTfYHBOHXUh4Cr
Ox3yLECAhWle4VlgInu3XLRmuUiYGk4JV4nbZUjpZvIaOZV4gLArcsQU7C/KTDT8
CqrybDOIxFkIbxfU+EE0
=IPgJ
-----END PGP SIGNATURE-----
Merge tag 'irqchip-core-4.1-3' of git://git.infradead.org/users/jcooper/linux into irq/core
irqchip core change for v4.1 (round 3) from Jason Cooper
Purge the gic_arch_extn hacks and abuse by using the new stacked domains
NOTE: Due to the nature of these changes, patches crossing subsystems have
been kept together in their own branches.
- tegra
- Handle the LIC properly
- omap
- Convert crossbar to stacked domains
- kill arm,routable-irqs in GIC binding
- exynos
- Convert PMU wakeup to stacked domains
- shmobile, ux500, zynq (irq_set_wake branch)
- Switch from abusing gic_arch_extn to using gic_set_irqchip_flags
There is a number of cases where a kernel subsystem may want to
introspect the state of an interrupt at the irqchip level:
- When a peripheral is shared between virtual machines,
its interrupt state becomes part of the guest's state,
and must be switched accordingly. KVM on arm/arm64 requires
this for its guest-visible timer
- Some GPIO controllers seem to require peeking into the
interrupt controller they are connected to to report
their internal state
This seem to be a pattern that is common enough for the core code
to try and support this without too many horrible hacks. Introduce
a pair of accessors (irq_get_irqchip_state/irq_set_irqchip_state)
to retrieve the bits that can be of interest to another subsystem:
pending, active, and masked.
- irq_get_irqchip_state returns the state of the interrupt according
to a parameter set to IRQCHIP_STATE_PENDING, IRQCHIP_STATE_ACTIVE,
IRQCHIP_STATE_MASKED or IRQCHIP_STATE_LINE_LEVEL.
- irq_set_irqchip_state similarly sets the state of the interrupt.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Tested-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Phong Vo <pvo@apm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Tin Huynh <tnhuynh@apm.com>
Cc: Y Vo <yvo@apm.com>
Cc: Toan Le <toanle@apm.com>
Cc: Bjorn Andersson <bjorn@kryo.se>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1426676484-21812-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
While debugging an unrelated issue with the GICv3 ITS driver, the
following trace triggered:
WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:1121 irq_domain_free_irqs+0x160/0x17c()
NULL pointer, cannot free irq
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 3.19.0-rc6+ #3690
Hardware name: FVP Base (DT)
Call trace:
[<ffffffc000089398>] dump_backtrace+0x0/0x13c
[<ffffffc0000894e4>] show_stack+0x10/0x1c
[<ffffffc00066d134>] dump_stack+0x74/0x94
[<ffffffc0000a92f8>] warn_slowpath_common+0x9c/0xd4
[<ffffffc0000a938c>] warn_slowpath_fmt+0x5c/0x80
[<ffffffc0000ee04c>] irq_domain_free_irqs+0x15c/0x17c
[<ffffffc0000ef918>] msi_domain_free_irqs+0x58/0x74
[<ffffffc000386f58>] free_msi_irqs+0xb4/0x1c0
// The msi_prepare callback fails here
[<ffffffc0003872c0>] pci_enable_msix+0x25c/0x3d4
[<ffffffc00038746c>] pci_enable_msix_range+0x34/0x80
[<ffffffc0003924ac>] vp_try_to_find_vqs+0xec/0x528
[<ffffffc000392954>] vp_find_vqs+0x6c/0xa8
[<ffffffc0003ee2a8>] init_vq+0x120/0x248
[<ffffffc0003eefb0>] virtblk_probe+0xb0/0x6bc
[<ffffffc00038fc34>] virtio_dev_probe+0x17c/0x214
[<ffffffc0003d4a04>] driver_probe_device+0x7c/0x23c
[<ffffffc0003d4cb0>] __driver_attach+0x98/0xa0
[<ffffffc0003d2c60>] bus_for_each_dev+0x60/0xb4
[<ffffffc0003d455c>] driver_attach+0x1c/0x28
[<ffffffc0003d41b0>] bus_add_driver+0x150/0x208
[<ffffffc0003d54c0>] driver_register+0x64/0x130
[<ffffffc00038f9e8>] register_virtio_driver+0x24/0x68
[<ffffffc00091320c>] init+0x70/0xac
[<ffffffc0000828f0>] do_one_initcall+0x94/0x1d0
[<ffffffc0008e9b00>] kernel_init_freeable+0x144/0x1e4
[<ffffffc00066a434>] kernel_init+0xc/0xd8
---[ end trace f9ee562a77cc7bae ]---
The ITS msi_prepare callback having failed, we end-up trying to
free MSIs that have never been allocated. Oddly enough, the kernel
is pretty upset about it.
It turns out that this behaviour was expected before the MSI domain
was introduced (and dealt with in arch_teardown_msi_irqs).
The obvious fix is to detect this early enough and bail out.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1422299419-6051-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This proves to be useful with stacked domains, when the current
domain doesn't implement wake-up, but expect the parent to do so.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/1426088629-15377-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJU9enEAAoJEHm+PkMAQRiG/ewIAJ4MW4tcAhaVj6ndCF3+uL/b
RaVm1apUjsTloe5Fl0TT9J5CO3zdOetmMNToy2sf0W4MJDIyHf21o83l7eniV/6q
al/c3fQ6HVtNjiSUNghTtzVlL+gUD1F60b9BGYi1V5h2Mp8u0NG1alTGLQfCB8sE
ArB+v2aWEdSPn7mZDA0Yuc1In+8bkpht3oy+OLD/8JNkqqLnml9YOyPjM1cuRpBr
NxKCLcPzSHH9/nR3T6XtkxXYV5xD3+CDm9roJhfHukoFmfT/G3C65Zcp2KEed/Cw
QQpu+ox7fpUs10F/Fbfm8AE+tRB4o2sGh97sprXrO5oaFdx6FPIBo4WN8i/Vy68=
=qpY+
-----END PGP SIGNATURE-----
Merge tag 'v4.0-rc2' into irq/core, to refresh the tree before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It currently is required that all users of NO_SUSPEND interrupt
lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
WARN_ON_ONCE() in irq_pm_install_action() will trigger. That is
done to warn about situations in which unprepared interrupt handlers
may be run unnecessarily for suspended devices and may attempt to
access those devices by mistake. However, it may cause drivers
that have no technical reasons for using IRQF_NO_SUSPEND to set
that flag just because they happen to share the interrupt line
with something like a timer.
Moreover, the generic handling of wakeup interrupts introduced by
commit 9ce7a25849 (genirq: Simplify wakeup mechanism) only works
for IRQs without any NO_SUSPEND users, so the drivers of wakeup
devices needing to use shared NO_SUSPEND interrupt lines for
signaling system wakeup generally have to detect wakeup in their
interrupt handlers. Thus if they happen to share an interrupt line
with a NO_SUSPEND user, they also need to request that their
interrupt handlers be run after suspend_device_irqs().
In both cases the reason for using IRQF_NO_SUSPEND is not because
the driver in question has a genuine need to run its interrupt
handler after suspend_device_irqs(), but because it happens to
share the line with some other NO_SUSPEND user. Otherwise, the
driver would do without IRQF_NO_SUSPEND just fine.
To make it possible to specify that condition explicitly, introduce
a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
that, when set, will indicate to the IRQ core that the interrupt
user is generally fine with suspending the IRQ, but it also can
tolerate handler invocations after suspend_device_irqs() and, in
particular, it is capable of detecting system wakeup and triggering
it as appropriate from its interrupt handler.
That will allow us to work around a problem with a shared timer
interrupt line on at91 platforms.
Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
Link: http://marc.info/?t=142252775300011&r=1&w=2
Link: https://lkml.org/lkml/2014/12/15/552
Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
For things like netpoll there is a need to disable an interrupt from
atomic context. Currently netpoll uses disable_irq() which will
sleep-wait on threaded handlers and thus forced_irqthreads breaks
things.
Provide disable_hardirq(), which uses synchronize_hardirq() to only wait
for active hardirq handlers; also change synchronize_hardirq() to
return the status of threaded handlers.
This will allow one to try-disable an interrupt from atomic context, or
in case of request_threaded_irq() to only wait for the hardirq part.
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Miller <davem@davemloft.net>
Cc: Eyal Perry <eyalpe@mellanox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Quentin Lambert <lambert.quentin@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20150205130623.GH5029@twins.programming.kicks-ass.net
[ Fixed typos and such. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull irqchip updates from Ingo Molnar:
"Various irqchip driver updates, plus a genirq core update that allows
the initial spreading of irqs amonst CPUs without having to do it from
user-space"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Fix null pointer reference in irq_set_affinity_hint()
irqchip: gic: Allow interrupt level to be set for PPIs
irqchip: mips-gic: Handle pending interrupts once in __gic_irq_dispatch()
irqchip: Conexant CX92755 interrupts controller driver
irqchip: Devicetree: document Conexant Digicolor irq binding
irqchip: omap-intc: Remove unused legacy interface for omap2
irqchip: omap-intc: Fix support for dm814 and dm816
irqchip: mtk-sysirq: Get irq number from register resource size
irqchip: renesas-intc-irqpin: r8a7779 IRLM setup support
genirq: Set initial affinity in irq_set_affinity_hint()
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The recent set_affinity commit by me introduced some null
pointer dereferences on driver unload, because some drivers
call this function with a NULL argument. This fixes the issue
by just checking for null before setting the affinity mask.
Fixes: e2e64a9325 ("genirq: Set initial affinity in irq_set_affinity_hint()")
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20150128185739.9689.84588.stgit@jbrandeb-cp2.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Problem:
The default behavior of the kernel is somewhat undesirable as all
requested interrupts end up on CPU0 after registration. A user can
run irqbalance daemon, or can manually configure smp_affinity via the
proc filesystem, but the default affinity of the interrupts for all
devices is always CPU zero, this can cause performance problems or
very heavy cpu use of only one core if not noticed and fixed by the
user.
Solution:
Enable the setting of the initial affinity directly when the driver
sets a hint.
This enabling means that kernel drivers can include an initial
affinity setting for the interrupt, instead of all interrupts starting
out life on CPU0. Of course if irqbalance is still running then the
interrupts will get moved as before.
This function is currently called by drivers in block, crypto,
infiniband, ethernet and scsi trees, but only a handful, so these will
be the devices affected by this change.
Tested on i40e, and default interrupts were spread across the CPUs
according to the hint.
drivers/block/mtip32xx/mtip32xx.c:3
drivers/block/nvme-core.c:2
drivers/crypto/qat/qat_dh895xcc/adf_isr.c:3
drivers/infiniband/hw/qib/qib_iba7322.c:2
drivers/net/ethernet/intel/i40e/i40e_main.c:3
drivers/net/ethernet/intel/i40evf/i40evf_main.c:3
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3
drivers/net/ethernet/mellanox/mlx4/en_cq.c:2
drivers/scsi/hpsa.c:3
drivers/scsi/lpfc/lpfc_init.c:3
drivers/scsi/megaraid/megaraid_sas_base.c:8
drivers/soc/ti/knav_qmss_acc.c:1
drivers/soc/ti/knav_qmss_queue.c:2
drivers/virtio/virtio_pci_common.c:2
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20141219012206.4220.27491.stgit@jbrandeb-cp2.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.
CPU0 CPU1
show_interrupts()
desc = irq_to_desc(X);
free_desc(desc)
remove_from_radix_tree();
kfree(desc);
raw_spinlock_irq(&desc->lock);
/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremly hard to trigger, but possible.
The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.
For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will either see
the old correct value or the cleared out ones.
Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.
Provide kstat_irqs_usr() which is protecting the lookup and access
with sparse_irq_lock and switch /proc/stat to use it.
Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care about protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.
Fixes: 1f5a5b87f7 "genirq: Implement a sane sparse_irq allocator"
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Pull irq domain updates from Thomas Gleixner:
"The real interesting irq updates:
- Support for hierarchical irq domains:
For complex interrupt routing scenarios where more than one
interrupt related chip is involved we had no proper representation
in the generic interrupt infrastructure so far. That made people
implement rather ugly constructs in their nested irq chip
implementations. The main offenders are x86 and arm/gic.
To distangle that mess we have now hierarchical irqdomains which
seperate the various interrupt chips and connect them via the
hierarchical domains. That keeps the domain specific details
internal to the particular hierarchy level and removes the
criss/cross referencing of chip internals. The resulting hierarchy
for a complex x86 system will look like this:
vector mapped: 74
msi-0 mapped: 2
dmar-ir-1 mapped: 69
ioapic-1 mapped: 4
ioapic-0 mapped: 20
pci-msi-2 mapped: 45
dmar-ir-0 mapped: 3
ioapic-2 mapped: 1
pci-msi-1 mapped: 2
htirq mapped: 0
Neither ioapic nor pci-msi know about the dmar interrupt remapping
between themself and the vector domain. If interrupt remapping is
disabled ioapic and pci-msi become direct childs of the vector
domain.
In hindsight we should have done that years ago, but in hindsight
we always know better :)
- Support for generic MSI interrupt domain handling
We have more and more non PCI related MSI interrupts, so providing
a generic infrastructure for this is better than having all
affected architectures implementing their own private hacks.
- Support for PCI-MSI interrupt domain handling, based on the generic
MSI support.
This part carries the pci/msi branch from Bjorn Helgaas pci tree to
avoid a massive conflict. The PCI/MSI parts are acked by Bjorn.
I have two more branches on top of this. The full conversion of x86
to hierarchical domains and a partial conversion of arm/gic"
* 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
genirq: Move irq_chip_write_msi_msg() helper to core
PCI/MSI: Allow an msi_controller to be associated to an irq domain
PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain
PCI/MSI: Enhance core to support hierarchy irqdomain
PCI/MSI: Move cached entry functions to irq core
genirq: Provide default callbacks for msi_domain_ops
genirq: Introduce msi_domain_alloc/free_irqs()
asm-generic: Add msi.h
genirq: Add generic msi irq domain support
genirq: Introduce callback irq_chip.irq_write_msi_msg
genirq: Work around __irq_set_handler vs stacked domains ordering issues
irqdomain: Introduce helper function irq_domain_add_hierarchy()
irqdomain: Implement a method to automatically call parent domains alloc/free
genirq: Introduce helper irq_domain_set_info() to reduce duplicated code
genirq: Split out flow handler typedefs into seperate header file
genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip
genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip
genirq: Add more helper functions to support stacked irq_chip
genirq: Introduce helper functions to support stacked irq_chip
irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF
...
No point to expose this to the world. The only legitimate user is the
core code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Required to support non PCI based MSI.
[ tglx: Extracted from Jiangs patch series ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Implement the basic functions for MSI interrupt support with
hierarchical interrupt domains.
[ tglx: Extracted and combined from several patches ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With the introduction of stacked domains, we have the issue that,
depending on where in the stack this is called, __irq_set_handler
will succeed or fail: If this is called from the inner irqchip,
__irq_set_handler() will fail, as it will look at the outer domain
as the (desc->irq_data.chip == &no_irq_chip) test fails (we haven't
set the top level yet).
This patch implements the following: "If there is at least one
valid irqchip in the domain, it will probably sort itself out".
This is clearly not ideal, but it is far less confusing then
crashing because the top-level domain is not up yet.
[ tglx: Added comment and a protection against chained interrupts in
that context ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1416048553-29289-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Introduce helper function irq_domain_add_hierarchy(), which creates
a linear irqdomain if parameter 'size' is not zero, otherwise creates
a tree irqdomain.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Link: http://lkml.kernel.org/r/1416061447-9472-5-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a flags to irq_domain.flags to control whether the irqdomain core
should automatically call parent irqdomain's alloc/free callbacks. It
help to reduce hierarchy irqdomains users' code size.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Link: http://lkml.kernel.org/r/1416061447-9472-4-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add IRQ_SET_MASK_OK_DONE in addition to IRQ_SET_MASK_OK and
IRQ_SET_MASK_OK_NOCOPY to support stacked irqchip. IRQ_SET_MASK_OK_DONE
is the same as IRQ_SET_MASK_OK to irq core. To stacked irqchip, it means
that ascendant irqchips have done all the work and no more handling
needed in descendant irqchips.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add callback irq_compose_msi_msg to struct irq_chip, which will be used
to support stacked irqchip.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Now we already support hierarchy irq_data, so introduce several helpers
to support stacked irq_chips.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It is possible to call irq_create_of_mapping to create/translate the
same IRQ from DT for multiple times. Perform irq_find_mapping check
and set_type for hierarchy irqdomain in irq_create_of_mapping() to
avoid duplicate these functionality in all outer most irqdomain.
Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We plan to use hierarchy irqdomain to suppport CPU vector assignment,
interrupt remapping controller, IO-APIC controller, MSI interrupt
and hypertransport interrupt etc on x86 platforms. So extend irqdomain
interfaces to support hierarchy irqdomain.
There are already many clients of current irqdomain interfaces.
To minimize the changes, we choose to introduce new version 2 interfaces
to support hierarchy instead of extending existing irqdomain interfaces.
According to Thomas's suggestion, the most important design decision is
to build hierarchy struct irq_data to support hierarchy irqdomain, so
hierarchy irqdomain related data could be saved in struct irq_data.
With support of hierarchy irq_data, we could also support stacked
irq_chips. This is most useful in case of set_affinity().
The new hierarchy irqdomain introduces following interfaces:
1) irq_domain_alloc_irqs()/irq_domain_free_irqs(): allocate/release IRQ
and related resources.
2) __irq_domain_alloc_irqs(): a special version to support legacy IRQs.
3) irq_domain_activate_irq()/irq_domain_deactivate_irq(): program
interrupt controllers to activate/deactivate interrupt.
There are also several help functions to ease irqdomain implemenations:
1) irq_domain_get_irq_data(): get irq_data associated with a specific
irqdomain.
2) irq_domain_set_hwirq_and_chip(): save irqdomain specific data into
irq_data.
3) irq_domain_alloc_irqs_parent()/irq_domain_free_irqs_parent(): invoke
parent irqdomain's alloc/free callbacks.
We also changed irq_startup()/irq_shutdown() to invoke
irq_domain_activate_irq()/irq_domain_deactivate_irq() to program
interrupt controller when start/stop interrupts.
[ tglx: Folded parts of the later patch series in ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use io{read,write}32be if the caller specified IRQ_GC_BE_IO when creating
the irqchip.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/1415342669-30640-5-git-send-email-cernekee@gmail.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Pass in the irq_chip_generic struct so we can use different readl/writel
settings for each irqchip driver, when appropriate. Compute
(gc->reg_base + reg_offset) in the helper function because this is pretty
much what all callers want to do anyway.
Compile-tested using the following configurations:
at91_dt_defconfig (CONFIG_ATMEL_AIC_IRQ=y)
sama5_defconfig (CONFIG_ATMEL_AIC5_IRQ=y)
sunxi_defconfig (CONFIG_ARCH_SUNXI=y)
tb10x (ARC) is untested.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/1415342669-30640-3-git-send-email-cernekee@gmail.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
- Rework the handling of wakeup IRQs by the IRQ core such that
all of them will be switched over to "wakeup" mode in
suspend_device_irqs() and in that mode the first interrupt
will abort system suspend in progress or wake up the system
if already in suspend-to-idle (or equivalent) without executing
any interrupt handlers. Among other things that eliminates the
wakeup-related motivation to use the IRQF_NO_SUSPEND interrupt
flag with interrupts which don't really need it and should not
use it (Thomas Gleixner and Rafael J Wysocki).
- Switch over ACPI to handling wakeup interrupts with the help
of the new mechanism introduced by the above IRQ core rework
(Rafael J Wysocki).
- Rework the core generic PM domains code to eliminate code that's
not used, add DT support and add a generic mechanism by which
devices can be added to PM domains automatically during
enumeration (Ulf Hansson, Geert Uytterhoeven and Tomasz Figa).
- Add debugfs-based mechanics for debugging generic PM domains
(Maciej Matraszek).
- ACPICA update to upstream version 20140828. Included are updates
related to the SRAT and GTDT tables and the _PSx methods are in
the METHOD_NAME list now (Bob Moore and Hanjun Guo).
- Add _OSI("Darwin") support to the ACPI core (unfortunately, that
can't really be done in a straightforward way) to prevent
Thunderbolt from being turned off on Apple systems after boot
(or after resume from system suspend) and rework the ACPI Smart
Battery Subsystem (SBS) driver to work correctly with Apple
platforms (Matthew Garrett and Andreas Noever).
- ACPI LPSS (Low-Power Subsystem) driver update cleaning up the
code, adding support for 133MHz I2C source clock on Intel Baytrail
to it and making it avoid using UART RTS override with Auto Flow
Control (Heikki Krogerus).
- ACPI backlight updates removing the video_set_use_native_backlight
quirk which is not necessary any more, making the code check the
list of output devices returned by the _DOD method to avoid
creating acpi_video interfaces that won't work and adding a quirk
for Lenovo Ideapad Z570 (Hans de Goede, Aaron Lu and Stepan Bujnak).
- New Win8 ACPI OSI quirks for some Dell laptops (Edward Lin).
- Assorted ACPI code cleanups (Fabian Frederick, Rasmus Villemoes,
Sudip Mukherjee, Yijing Wang, and Zhang Rui).
- cpufreq core updates and cleanups (Viresh Kumar, Preeti U Murthy,
Rasmus Villemoes).
- cpufreq driver updates: cpufreq-cpu0/cpufreq-dt (driver name
change among other things), ppc-corenet, powernv (Viresh Kumar,
Preeti U Murthy, Shilpasri G Bhat, Lucas Stach).
- cpuidle support for DT-based idle states infrastructure, new
ARM64 cpuidle driver, cpuidle core cleanups (Lorenzo Pieralisi,
Rasmus Villemoes).
- ARM big.LITTLE cpuidle driver updates: support for DT-based
initialization and Exynos5800 compatible string (Lorenzo Pieralisi,
Kevin Hilman).
- Rework of the test_suspend kernel command line argument and
a new trace event for console resume (Srinivas Pandruvada,
Todd E Brandt).
- Second attempt to optimize swsusp_free() (hibernation core) to
make it avoid going through all PFNs which may be way too slow on
some systems (Joerg Roedel).
- devfreq updates (Paul Bolle, Punit Agrawal, Ãrjan Eide).
- rockchip-io Adaptive Voltage Scaling (AVS) driver and AVS
entry update in MAINTAINERS (Heiko Stübner, Kevin Hilman).
- PM core fix related to clock management (Geert Uytterhoeven).
- PM core's sysfs code cleanup (Johannes Berg).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJUNbJoAAoJEILEb/54YlRxRp8QAJyGIPdx+f03oBir+7vvEwhY
svxd+V9xXK0UgWNGkCvlMk/1RIVy0qqtXliUrDaE+9tcHACA9+iAxMmNmDsjLOiO
gpazuz5kgeznrmp1eNwQnYTt+OCReQIcyCsj4q4fNo9bbETTyr2bRz226LEuZekC
TAiKdphYoOszFBgTVg5gfu+lqjHyXjgXPnwMTlRYn1y4YL2adDIgxj9cFedykTTW
Eu593TY2dH6ovERJ6q3qxZbRuWuxtww95J07b3t2/2Eb3e/R/zlX0/XJ/C88f/m2
DkqngbOYqCdw+zJeN6k8631foyfUwAcTd0sJ1+5nsm5H4NE5NqObjbxOk5/yNht6
HgvgISGHWLerEw+A/Dk6o0oZOtR1G/TAQ5qQk5nUfKT/sSoU+9/USsXtWhXwZCia
XccnJgW6ZtPrJJP3zDnkrxe3gndmLic11QXArw2IhWTsq0sZlAyMgtauBXLdDiQa
H/AMiYrUNmIABef1cirBLTtgXN4Zbsai9vIrxMmV7OgBrclrh52NTjzr05P5Hnl2
fRK56mb6mP59LymI7n8fyXL8tHnbNwFvTaxuvrZmzcYbzL0l9DuPocJrrTHRSfhm
GFfzfvLj0R66ZM4PthRSwz4H2v1FnlRcCkj5k/QjtBPlyzxtOnJveqve5umbrnb9
T5mRmlAs4iYwLuKCVVNT
=sIv/
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"Features-wise, to me the most important this time is a rework of
wakeup interrupts handling in the core that makes them work
consistently across all of the available sleep states, including
suspend-to-idle. Many thanks to Thomas Gleixner for his help with
this work.
Second is an update of the generic PM domains code that has been in
need of some care for quite a while. Unused code is being removed, DT
support is being added and domains are now going to be attached to
devices in bus type code in analogy with the ACPI PM domain. The
majority of work here was done by Ulf Hansson who also has been the
most active developer this time.
Apart from this we have a traditional ACPICA update, this time to
upstream version 20140828 and a few ACPI wakeup interrupts handling
patches on top of the general rework mentioned above. There also are
several cpufreq commits including renaming the cpufreq-cpu0 driver to
cpufreq-dt, as this is what implements generic DT-based cpufreq
support, and a new DT-based idle states infrastructure for cpuidle.
In addition to that, the ACPI LPSS driver is updated, ACPI support for
Apple machines is improved, a few bugs are fixed and a few cleanups
are made all over.
Finally, the Adaptive Voltage Scaling (AVS) subsystem now has a tree
maintained by Kevin Hilman that will be merged through the PM tree.
Numbers-wise, the generic PM domains update takes the lead this time
with 32 non-merge commits, second is cpufreq (15 commits) and the 3rd
place goes to the wakeup interrupts handling rework (13 commits).
Specifics:
- Rework the handling of wakeup IRQs by the IRQ core such that all of
them will be switched over to "wakeup" mode in suspend_device_irqs()
and in that mode the first interrupt will abort system suspend in
progress or wake up the system if already in suspend-to-idle (or
equivalent) without executing any interrupt handlers. Among other
things that eliminates the wakeup-related motivation to use the
IRQF_NO_SUSPEND interrupt flag with interrupts which don't really
need it and should not use it (Thomas Gleixner and Rafael Wysocki)
- Switch over ACPI to handling wakeup interrupts with the help of the
new mechanism introduced by the above IRQ core rework (Rafael Wysocki)
- Rework the core generic PM domains code to eliminate code that's
not used, add DT support and add a generic mechanism by which
devices can be added to PM domains automatically during enumeration
(Ulf Hansson, Geert Uytterhoeven and Tomasz Figa).
- Add debugfs-based mechanics for debugging generic PM domains
(Maciej Matraszek).
- ACPICA update to upstream version 20140828. Included are updates
related to the SRAT and GTDT tables and the _PSx methods are in the
METHOD_NAME list now (Bob Moore and Hanjun Guo).
- Add _OSI("Darwin") support to the ACPI core (unfortunately, that
can't really be done in a straightforward way) to prevent
Thunderbolt from being turned off on Apple systems after boot (or
after resume from system suspend) and rework the ACPI Smart Battery
Subsystem (SBS) driver to work correctly with Apple platforms
(Matthew Garrett and Andreas Noever).
- ACPI LPSS (Low-Power Subsystem) driver update cleaning up the code,
adding support for 133MHz I2C source clock on Intel Baytrail to it
and making it avoid using UART RTS override with Auto Flow Control
(Heikki Krogerus).
- ACPI backlight updates removing the video_set_use_native_backlight
quirk which is not necessary any more, making the code check the
list of output devices returned by the _DOD method to avoid
creating acpi_video interfaces that won't work and adding a quirk
for Lenovo Ideapad Z570 (Hans de Goede, Aaron Lu and Stepan Bujnak)
- New Win8 ACPI OSI quirks for some Dell laptops (Edward Lin)
- Assorted ACPI code cleanups (Fabian Frederick, Rasmus Villemoes,
Sudip Mukherjee, Yijing Wang, and Zhang Rui)
- cpufreq core updates and cleanups (Viresh Kumar, Preeti U Murthy,
Rasmus Villemoes)
- cpufreq driver updates: cpufreq-cpu0/cpufreq-dt (driver name change
among other things), ppc-corenet, powernv (Viresh Kumar, Preeti U
Murthy, Shilpasri G Bhat, Lucas Stach)
- cpuidle support for DT-based idle states infrastructure, new ARM64
cpuidle driver, cpuidle core cleanups (Lorenzo Pieralisi, Rasmus
Villemoes)
- ARM big.LITTLE cpuidle driver updates: support for DT-based
initialization and Exynos5800 compatible string (Lorenzo Pieralisi,
Kevin Hilman)
- Rework of the test_suspend kernel command line argument and a new
trace event for console resume (Srinivas Pandruvada, Todd E Brandt)
- Second attempt to optimize swsusp_free() (hibernation core) to make
it avoid going through all PFNs which may be way too slow on some
systems (Joerg Roedel)
- devfreq updates (Paul Bolle, Punit Agrawal, Ãrjan Eide).
- rockchip-io Adaptive Voltage Scaling (AVS) driver and AVS entry
update in MAINTAINERS (Heiko Stübner, Kevin Hilman)
- PM core fix related to clock management (Geert Uytterhoeven)
- PM core's sysfs code cleanup (Johannes Berg)"
* tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (105 commits)
ACPI / fan: printk replacement
PM / clk: Fix crash in clocks management code if !CONFIG_PM_RUNTIME
PM / Domains: Rename cpu_data to cpuidle_data
cpufreq: cpufreq-dt: fix potential double put of cpu OF node
cpufreq: cpu0: rename driver and internals to 'cpufreq_dt'
PM / hibernate: Iterate over set bits instead of PFNs in swsusp_free()
cpufreq: ppc-corenet: remove duplicate update of cpu_data
ACPI / sleep: Rework the handling of ACPI GPE wakeup from suspend-to-idle
PM / sleep: Rename platform suspend/resume functions in suspend.c
PM / sleep: Export dpm_suspend_late/noirq() and dpm_resume_early/noirq()
ACPICA: Introduce acpi_enable_all_wakeup_gpes()
ACPICA: Clear all non-wakeup GPEs in acpi_hw_enable_wakeup_gpe_block()
ACPI / video: check _DOD list when creating backlight devices
PM / Domains: Move dev_pm_domain_attach|detach() to pm_domain.h
cpufreq: Replace strnicmp with strncasecmp
cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec
cpufreq: powernv: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum
cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
PM / devfreq: exynos: Enable building exynos PPMU as module
PM / devfreq: Export helper functions for drivers
...
Pull irq updates from Thomas Gleixner:
"The irq departement delivers:
- a cleanup series to get rid of mindlessly copied code.
- another bunch of new pointlessly different interrupt chip drivers.
Adding homebrewn irq chips (and timers) to SoCs must provide a
value add which is beyond the imagination of mere mortals.
- the usual SoC irq controller updates, IOW my second cat herding
project"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
irqchip: gic-v3: Implement CPU PM notifier
irqchip: gic-v3: Refactor gic_enable_redist to support both enabling and disabling
irqchip: renesas-intc-irqpin: Add minimal runtime PM support
irqchip: renesas-intc-irqpin: Add helper variable dev = &pdev->dev
irqchip: atmel-aic5: Add sama5d4 support
irqchip: atmel-aic5: The sama5d3 has 48 IRQs
Documentation: bcm7120-l2: Add Broadcom BCM7120-style L2 binding
irqchip: bcm7120-l2: Add Broadcom BCM7120-style Level 2 interrupt controller
irqchip: renesas-irqc: Add binding docs for new R-Car Gen2 SoCs
irqchip: renesas-irqc: Add DT binding documentation
irqchip: renesas-intc-irqpin: Document SoC-specific bindings
openrisc: Get rid of handle_IRQ
arm64: Get rid of handle_IRQ
ARM: omap2: irq: Convert to handle_domain_irq
ARM: imx: tzic: Convert to handle_domain_irq
ARM: imx: avic: Convert to handle_domain_irq
irqchip: or1k-pic: Convert to handle_domain_irq
irqchip: atmel-aic5: Convert to handle_domain_irq
irqchip: atmel-aic: Convert to handle_domain_irq
irqchip: gic-v3: Convert to handle_domain_irq
...
* pm-genirq:
PM / genirq: Document rules related to system suspend and interrupts
PCI / PM: Make PCIe PME interrupts wake up from suspend-to-idle
x86 / PM: Set IRQCHIP_SKIP_SET_WAKE for IOAPIC IRQ chip objects
genirq: Simplify wakeup mechanism
genirq: Mark wakeup sources as armed on suspend
genirq: Create helper for flow handler entry check
genirq: Distangle edge handler entry
genirq: Avoid double loop on suspend
genirq: Move MASK_ON_SUSPEND handling into suspend_device_irqs()
genirq: Make use of pm misfeature accounting
genirq: Add sanity checks for PM options on shared interrupt lines
genirq: Move suspend/resume logic into irq/pm code
PM / sleep: Mechanism for aborting system suspends unconditionally
Calling irq_find_mapping from outside a irq_{enter,exit} section is
unsafe and produces ugly messages if CONFIG_PROVE_RCU is enabled:
If coming from the idle state, the rcu_read_lock call in irq_find_mapping
will generate an unpleasant warning:
<quote>
===============================
[ INFO: suspicious RCU usage. ]
3.16.0-rc1+ #135 Not tainted
-------------------------------
include/linux/rcupdate.h:871 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: (rcu_read_lock){......}, at: [<ffffffc00010206c>]
irq_find_mapping+0x4c/0x198
</quote>
As this issue is fairly widespread and involves at least three
different architectures, a possible solution is to add a new
handle_domain_irq entry point into the generic IRQ code that
the interrupt controller code can call.
This new function takes an irq_domain, and calls into irq_find_domain
inside the irq_{enter,exit} block. An additional "lookup" parameter is
used to allow non-domain architecture code to be replaced by this as well.
Interrupt controllers can then be updated to use the new mechanism.
This code is sitting behind a new CONFIG_HANDLE_DOMAIN_IRQ, as not all
architectures implement set_irq_regs (yes, mn10300, I'm looking at you...).
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/1409047421-27649-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Currently we suspend wakeup interrupts by lazy disabling them and
check later whether the interrupt has fired, but that's not sufficient
for suspend to idle as there is no way to check that once we
transitioned into the CPU idle state.
So we change the mechanism in the following way:
1) Leave the wakeup interrupts enabled across suspend
2) Add a check to irq_may_run() which is called at the beginning of
each flow handler whether the interrupt is an armed wakeup source.
This check is basically free as it just extends the existing check
for IRQD_IRQ_INPROGRESS. So no new conditional in the hot path.
If the IRQD_WAKEUP_ARMED flag is set, then the interrupt is
disabled, marked as pending/suspended and the pm core is notified
about the wakeup event.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ rjw: syscore.c and put irq_pm_check_wakeup() into pm.c ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This allows us to utilize this information in the irq_may_run() check
without adding another conditional to the fast path.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All flow handlers - except the per cpu ones - check for an interrupt
in progress and an eventual concurrent polling on another cpu.
Create a helper function for the repeated code pattern.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the interrupt is disabled or has no action, then we should not call
the poll check. Separate the checks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We can synchronize the suspended interrupts right away. No need for an
extra loop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no reason why we should delay the masking of interrupts whose
interrupt chip requests MASK_ON_SUSPEND to the point where we check
the wakeup interrupts. We can do it right at the point where we mark
the interrupt as suspended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the accounting fields which got introduced for snity checking for
the various PM options.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Account the IRQF_NO_SUSPEND and IRQF_RESUME_EARLY actions on shared
interrupt lines and yell loudly if there is a mismatch.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
No functional change. Preparatory patch for cleaning up the suspend
abort functionality. Update the comments while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It should be request_threaded_irq, not request_irq
[jkosina@suse.cz: not that it would matter, as both have the same
set of arguments anyway, but for sake of consistency ...]
Signed-off-by: Emilio López <emilio@elopez.com.ar>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[ARM specific]
These are generally replaced with raw_cpu_ptr. However, in
gic_get_percpu_base() we immediately dereference the pointer. This is
equivalent to a raw_cpu_read. So use that operation there.
Cc: nicolas.pitre@linaro.org
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Export handle_fasteoi_irq to be able to use it in e.g. the Zynq gpio driver
since commit 6dd8595083 ("gpio: zynq: Fix IRQ handlers").
This fixes the following link issue:
ERROR: "handle_fasteoi_irq" [drivers/gpio/gpio-zynq.ko] undefined!
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Vincent Stehle <vincent.stehle@laposte.net>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/1408663880-29179-1-git-send-email-vincent.stehle@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull irq updates from Thomas Gleixner:
"Nothing spectacular from the irq department this time:
- overhaul of the crossbar chip driver
- overhaul of the spear shirq chip driver
- support for the atmel-aic chip
- code move from arch to drivers
- the usual tiny fixlets
- two reverts worth to mention which undo the too simple attempt of
supporting wakeup interrupts on shared interrupt lines"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
Revert "irq: Warn when shared interrupts do not match on NO_SUSPEND"
Revert "PM / sleep / irq: Do not suspend wakeup interrupts"
irq: Warn when shared interrupts do not match on NO_SUSPEND
irqchip: atmel-aic: Define irq fixups for atmel SoCs
irqchip: atmel-aic: Implement RTC irq fixup
irqchip: atmel-aic: Add irq fixup infrastructure
irqchip: atmel-aic: Add atmel AIC/AIC5 drivers
irqchip: atmel-aic: Move binding doc to interrupt-controller directory
genirq: generic chip: Export irq_map_generic_chip function
PM / sleep / irq: Do not suspend wakeup interrupts
irqchip: or1k-pic: Migrate from arch/openrisc/
irqchip: crossbar: Allow for quirky hardware with direct hardwiring of GIC
documentation: dt: omap: crossbar: Add description for interrupt consumer
irqchip: crossbar: Introduce centralized check for crossbar write
irqchip: crossbar: Introduce ti, max-crossbar-sources to identify valid crossbar mapping
irqchip: crossbar: Add kerneldoc for crossbar_domain_unmap callback
irqchip: crossbar: Set cb pointer to null in case of error
irqchip: crossbar: Change the goto naming
irqchip: crossbar: Return proper error value
irqchip: crossbar: Fix kerneldoc warning
...
This reverts commit 4fae4e7624.
Undo because it breaks working systems.
Requested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This reverts commit d709f7bcbb.
Undo, because it might break exisiting functionality.
Requested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When suspend_device_irqs() iterates all descriptors, its pointless if
one has NO_SUSPEND set while another has not.
Validate on request_irq() that NO_SUSPEND state maches for SHARED
interrupts.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/20140724133921.GY6758@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export the generic irq map function in order to provide irq_domain ops with
generic mapping and specific of xlate function (needed by the new atmel
AIC driver).
Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1405012462-766-2-git-send-email-boris.brezillon@free-electrons.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
If an IRQ has been configured for wakeup via enable_irq_wake(), the
driver who has done that must be prepared for receiving interrupts
after suspend_device_irqs() has returned, so there is no need to
"suspend" such IRQs. Moreover, if drivers using enable_irq_wake()
actually want to receive interrupts after suspend_device_irqs() has
returned, they need to add IRQF_NO_SUSPEND to the IRQ flags while
requesting the IRQs, which shouldn't be necessary (it also goes a bit
too far, as IRQF_NO_SUSPEND causes the IRQ to be ignored by
suspend_device_irqs() all the time regardless of whether or not it
has been configured for signaling wakeup).
For the above reasons, make __disable_irq() ignore IRQ descriptors
with IRQD_WAKEUP_STATE set when its suspend argument is true which
effectively causes them to behave like IRQs with IRQF_NO_SUSPEND
set.
This also allows IRQs configured for wakeup via enable_irq_wake()
to work as wakeup interrupts for the "freeze" (suspend-to-idle)
sleep mode automatically just like for any other sleep states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Li Aubrey <aubrey.li@linux.intel.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Link: http://lkml.kernel.org/r/4679574.kGUnqAuNl9@vostro.rjw.lan
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_free_hwirqs() always calls irq_free_descs() with a cnt == 0
which makes it a no-op since the interrupt count to free is
decremented in itself.
Fixes: 7b6ef12625
Signed-off-by: Keith Busch <keith.busch@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/1404167084-8070-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export irq_domain_disassociate() to architecture interrupt drivers,
so it could be used to handle legacy IRQ descriptors on x86.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1402302011-23642-37-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
No more users. Get rid of the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140507154341.012847637@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Create a new interface and confine it with a config switch which makes
clear that this is just legacy support and not to be used for new code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140507154340.574437049@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
No more users. And it's not going to come back. If you need
hotplugable irq chips, use irq domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-and-acked-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140507154340.302183048@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We want to get rid of the public interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140507154340.061990194@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not really the solution to the problem, but at least it confines the
mess in the core code and allows to get rid of the create/destroy_irq
variants from hell, i.e. 3 implementations with different semantics
plus the x86 specific variants __create_irqs and create_irq_nr
which have been invented in another circle of hell.
x86 : x86 should be converted to irq domains and I'm deliberately
making it impossible to do the multi-vector MSI support by
adding more crap to the current mess. It's not that hard to do
and I'm really tired of the trainwrecks which have been invented
by baindaid engineering so far. Any attempt to do multi-vector
MSI or ioapic hotplug without converting to irq domains is NAKed
hereby.
tile: Might use irq domains as well, but it has a very limited
interrupt space, so handling it via this functionality might be
the right thing to do even in the long run.
ia64: That's an hopeless case, as I doubt that anyone has the stomach
to rewrite the homebrewn dynamic allocation facilities. I stared
at it for a couple of hours and gave up. The create/destroy_irq
mess could be made private to itanic right away if there
wouldn't be the iommu/dmar driver being shared with x86. So to
do that I'm going to add a separate ia64 specific implementation
later in order not to deep-six itanic right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20140507154334.208629358@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Till reported that the spurious interrupt detection of threaded
interrupts is broken in two ways:
- note_interrupt() is called for each action thread of a shared
interrupt line. That's wrong as we are only interested whether none
of the device drivers felt responsible for the interrupt, but by
calling multiple times for a single interrupt line we account
IRQ_NONE even if one of the drivers felt responsible.
- note_interrupt() when called from the thread handler is not
serialized. That leaves the members of irq_desc which are used for
the spurious detection unprotected.
To solve this we need to defer the spurious detection of a threaded
interrupt to the next hardware interrupt context where we have
implicit serialization.
If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we
check whether the previous interrupt requested a deferred check. If
not, we request a deferred check for the next hardware interrupt and
return.
If set, we check whether one of the interrupt threads signaled
success. Depending on this information we feed the result into the
spurious detector.
If one primary handler of a shared interrupt returns IRQ_HANDLED we
disable the deferred check of irq threads on the same line, as we have
found at least one device driver who cared.
Reported-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Austin Schuh <austin@peloton-tech.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionos
On x86 the allocation of irq descriptors may allocate interrupts which
are in the range of the GSI interrupts. That's wrong as those
interrupts are hardwired and we don't have the irq domain translation
like PPC. So one of these interrupts can be hooked up later to one of
the devices which are hard wired to it and the io_apic init code for
that particular interrupt line happily reuses that descriptor with a
completely different configuration so hell breaks lose.
Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs,
except for a few usage sites which have not yet blown up in our face
for whatever reason. But for drivers which need an irq range, like the
GPIO drivers, we have no limit in place and we don't want to expose
such a detail to a driver.
To cure this introduce a function which an architecture can implement
to impose a lower bound on the dynamic interrupt allocations.
Implement it for x86 and set the lower bound to nr_gsi_irqs, which is
the end of the hardwired interrupt space, so all dynamic allocations
happen above.
That not only allows the GPIO driver to work sanely, it also protects
the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and
htirq code. They need to be cleaned up as well, but that's a separate
issue.
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Krogerus Heikki <heikki.krogerus@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current implementation of irq_set_affinity() refuses rightfully to
route an interrupt to an offline cpu.
But there is a special case, where this is actually desired. Some of
the ARM SoCs have per cpu timers which require setting the affinity
during cpu startup where the cpu is not yet in the online mask.
If we can't do that, then the local timer interrupt for the about to
become online cpu is routed to some random online cpu.
The developers of the affected machines tried to work around that
issue, but that results in a massive mess in that timer code.
We have a yet unused argument in the set_affinity callbacks of the irq
chips, which I added back then for a similar reason. It was never
required so it got not used. But I'm happy that I never removed it.
That allows us to implement a sane handling of the above scenario. So
the affected SoC drivers can add the required force handling to their
interrupt chip, switch the timer code to irq_force_affinity() and
things just work.
This does not affect any existing user of irq_set_affinity().
Tagged for stable to allow a simple fix of the affected SoC clock
event drivers.
Reported-and-tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Tomasz Figa <t.figa@samsung.com>,
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>,
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: linux-arm-kernel@lists.infradead.org,
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This will allow to use the dummy IRQ handler no_action() from drivers
compiled as module. Drivers which use ARM FIQ interrupts can use this
to request the interrupt via the normal request_irq() mechanism w/o
having to copy the dummy handler to their own code.
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Link: http://lkml.kernel.org/r/1395476431-16070-1-git-send-email-shc_work@mail.ru
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Includes:
- /proc/irq/default_smp_affinity
- /proc/irq/*/affinity_hint
- /proc/irq/*/smp_affinity
- /proc/irq/*/smp_affinity_list
Users can distill the same information by reading /proc/interrupts.
Signed-off-by: Chema Gonzalez <chema@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: http://lkml.kernel.org/r/1394765455-1217-1-git-send-email-chema@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The flag is necessary for interrupt chips which require an ACK/EOI
after the handler has run. In case of threaded handlers this needs to
happen after the threaded handler has completed before the unmask of
the interrupt.
The flag is only unseful in combination with the handle_fasteoi_irq
flow control handler.
It can be combined with the flag IRQCHIP_EOI_IF_HANDLED, so the EOI is
not issued when the interrupt is disabled or in progress.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-sunxi@googlegroups.com
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Link: http://lkml.kernel.org/r/1394733834-26839-2-git-send-email-hdegoede@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge the request/release callbacks which are in a separate branch for
consumption by the gpio folks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For certain irq types, e.g. gpios, it's necessary to request resources
before starting up the irq.
This might fail so we cannot use the irq_startup() callback because we
might call the irq_set_type() callback before that which does not make
sense when the resource is not available. Calling irq_startup() before
irq_set_type() can lead to spurious interrupts which is not desired
either.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean-Jacques Hiblot <jjhiblot@traphandler.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1403080857160.18573@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
No more users outside the core code. Put it into the poison
cabinet. That also gets rid of the linux/irq.h include in
kernel_stat.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140223212739.124207133@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is a common pattern all over the place:
kstat_incr_irqs_this_cpu(irq, irq_to_desc(irq));
This results in a call to core code anyway. So provide a function
which does the same thing in core.
While at it, replace the butt ugly macro with an inline.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140223212737.422068876@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Include appropriate header file include/linux/of_irq.h in
kernel/irq/irqdomain.c because it contains prototype definition of
function define in kernel/irq/irqdomain.c.
This eliminates the following warning in kernel/irq/irqdomain.c:
kernel/irq/irqdomain.c:468:14: warning: no previous prototype for ‘irq_create_of_mapping’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/eb89aebea7ff1a46122918ac389ebecf8248be9a.1393493276.git.rashika.kheria@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We hit one rare case below:
T1 calling disable_irq(), but hanging at synchronize_irq()
always;
The corresponding irq thread is in sleeping state;
And all CPUs are in idle state;
After analysis, we found there is one possible scenerio which
causes T1 is waiting there forever:
CPU0 CPU1
synchronize_irq()
wait_event()
spin_lock()
atomic_dec_and_test(&threads_active)
insert the __wait into queue
spin_unlock()
if(waitqueue_active)
atomic_read(&threads_active)
wake_up()
Here after inserted the __wait into queue on CPU0, and before
test if queue is empty on CPU1, there is no barrier, it maybe
cause it is not visible for CPU1 immediately, although CPU0 has
updated the queue list.
It is similar for CPU0 atomic_read() threads_active also.
So we'd need one smp_mb() before waitqueue_active.that, but removing
the waitqueue_active() check solves it as wel l and it makes
things simple and clear.
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Xiaoming Wang <xiaoming.wang@intel.com>
Link: http://lkml.kernel.org/r/1393212590-32543-1-git-send-email-chuansheng.liu@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In course of the sdhci/sdio discussion with Russell about killing the
sdio kthread hackery we discovered the need to be able to wake an
interrupt thread from software.
The rationale for this is, that sdio hardware can lack proper
interrupt support for certain features. So the driver needs to poll
the status registers, but at the same time it needs to be woken up by
an hardware interrupt.
To be able to get rid of the home brewn kthread construct of sdio we
need a way to wake an irq thread independent of an actual hardware
interrupt.
Provide an irq_wake_thread() function which wakes up the thread which
is associated to a given dev_id. This allows sdio to invoke the irq
thread from the hardware irq handler via the IRQ_WAKE_THREAD return
value and provides a possibility to wake it via a timer for the
polling scenarios. That allows to simplify the sdio logic
significantly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Chris Ball <chris@printf.net>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140215003823.772565780@linutronix.de
synchronize_irq() waits for hard irq and threaded handlers to complete
before returning. For some special cases we only need to make sure
that the hard interrupt part of the irq line is not in progress when
we disabled the - possibly shared - interrupt at the device level.
A proper use case for this was provided by Russell. The sdhci driver
requires some irq triggered functions to be run in thread context. The
current implementation of the thread context is a sdio private kthread
construct, which has quite some shortcomings. These can be avoided
when the thread is directly associated to the device interrupt via the
generic threaded irq infrastructure.
Though there is a corner case related to run time power management
where one side disables the device interrupts at the device level and
needs to make sure, that an already running hard interrupt handler has
completed before proceeding further. Though that hard interrupt
handler might wake the associated thread, which in turn can request
the runtime PM to reenable the device. Using synchronize_irq() leads
to an immediate deadlock of the irq thread waiting for the PM lock and
the synchronize_irq() waiting for the irq thread to complete.
Due to the fact that it is sufficient for this case to ensure that no
hard irq handler is executing a new function which avoids the check
for the thread is required.
Add a function, which just monitors the hard irq parts and ignores the
threaded handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Russell King <linux@arm.linux.org.uk>
Cc: Chris Ball <chris@printf.net>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140215003823.653236081@linutronix.de
Pull irq update from Thomas Gleixner:
"Fix from the urgent branch: a trivial oneliner adding the missing
Kconfig dependency curing build failures which have been discovered by
several build robots.
The update in the irq-core branch provides a new function in the
irq/devres code, which is a prerequisite for driver developers to get
rid of boilerplate code all over the place.
Not a bugfix, but it has zero impact on the current kernel due to the
lack of users. It's simpler to provide the infrastructure to
interested parties via your tree than fulfilling the wishlist of
driver maintainers on which particular commit or tag this should be
based on"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add devm_request_any_context_irq()
In allmodconfig builds for sparc and any other arch which does
not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:
CC [M] lib/cpu-notifier-error-inject.o
CC [M] lib/pm-notifier-error-inject.o
ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
make[2]: *** [__modpost] Error 1
This happens because commit 3911ff30f5 ("genirq: export
handle_edge_irq() and irq_to_desc()") added one export for it, but
there were actually two instances of it, in an if/else clause for
CONFIG_SPARSE_IRQ. Add the second one.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: stable@vger.kernel.org # 3.4+
Link: http://lkml.kernel.org/r/1392057610-11514-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The generic_chip.c uses interfaces from irq_domain.c which is
controlled by the IRQ_DOMAIN config option, but there is no Kconfig
dependency so the build can fail:
linux/kernel/irq/generic-chip.c:400:11: error:
'irq_domain_xlate_onetwocell' undeclared here (not in a function)
Select IRQ_DOMAIN when GENERIC_IRQ_CHIP is selected.
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Link: http://lkml.kernel.org/r/1391129410-54548-2-git-send-email-nitin.a.kamble@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # 3.11+
Pull irq fixes from Thomas Gleixner:
- Correction of fuzzy and fragile IRQ_RETVAL macro
- IRQ related resume fix affecting only XEN
- ARM/GIC fix for chained GIC controllers
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Gic: fix boot for chained gics
irq: Enable all irqs unconditionally in irq_resume
genirq: Correct fuzzy and fragile IRQ_RETVAL() definition
When the system enters suspend, it disables all interrupts in
suspend_device_irqs(), including the interrupts marked EARLY_RESUME.
On the resume side things are different. The EARLY_RESUME interrupts
are reenabled in sys_core_ops->resume and the non EARLY_RESUME
interrupts are reenabled in the normal system resume path.
When suspend_noirq() failed or suspend is aborted for any other
reason, we might omit the resume side call to sys_core_ops->resume()
and therefor the interrupts marked EARLY_RESUME are not reenabled and
stay disabled forever.
To solve this, enable all irqs unconditionally in irq_resume()
regardless whether interrupts marked EARLY_RESUMEhave been already
enabled or not.
This might try to reenable already enabled interrupts in the non
failure case, but the only affected platform is XEN and it has been
confirmed that it does not cause any side effects.
[ tglx: Massaged changelog. ]
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Acked-by-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Cc: <ian.campbell@citrix.com>
Cc: <rjw@rjwysocki.net>
Cc: <len.brown@intel.com>
Cc: <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1385388587-16442-1-git-send-email-ldewangan@nvidia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull irq cleanups from Ingo Molnar:
"This is a multi-arch cleanup series from Thomas Gleixner, which we
kept to near the end of the merge window, to not interfere with
architecture updates.
This series (motivated by the -rt kernel) unifies more aspects of IRQ
handling and generalizes PREEMPT_ACTIVE"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt: Make PREEMPT_ACTIVE generic
sparc: Use preempt_schedule_irq
ia64: Use preempt_schedule_irq
m32r: Use preempt_schedule_irq
hardirq: Make hardirq bits generic
m68k: Simplify low level interrupt handling code
genirq: Prevent spurious detection for unconditionally polled interrupts
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
On a 68k platform a couple of interrupts are demultiplexed and
"polled" from a top level interrupt. Unfortunately there is no way to
determine which of the sub interrupts raised the top level interrupt,
so all of the demultiplexed interrupt handlers need to be
invoked. Given a high enough frequency this can trigger the spurious
interrupt detection mechanism, if one of the demultiplex interrupts
returns IRQ_NONE continuously. But this is a false positive as the
polling causes this behaviour and not buggy hardware/software.
Introduce IRQ_POLLED which can be set at interrupt chip setup time via
irq_set_status_flags(). The flag excludes the interrupt from the
spurious detector and from all core polling activities.
Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311061149250.23353@ionos.tec.linutronix.de
usual for this cycle with lots of clean-up.
- Cross arch clean-up and consolidation of early DT scanning code.
- Clean-up and removal of arch prom.h headers. Makes arch specific
prom.h optional on all but Sparc.
- Addition of interrupts-extended property for devices connected to
multiple interrupt controllers.
- Refactoring of DT interrupt parsing code in preparation for deferred
probe of interrupts.
- ARM cpu and cpu topology bindings documentation.
- Various DT vendor binding documentation updates.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJSgPQ4AAoJEMhvYp4jgsXif28H/1WkrXq5+lCFQZF8nbYdE2h0
R8PsfiJJmAl6/wFgQTsRel+ScMk2hiP08uTyqf2RLnB1v87gCF7MKVaLOdONfUDi
huXbcQGWCmZv0tbBIklxJe3+X3FIJch4gnyUvPudD1m8a0R0LxWXH/NhdTSFyB20
PNjhN/IzoN40X1PSAhfB5ndWnoxXBoehV/IVHVDU42vkPVbVTyGAw5qJzHW8CLyN
2oGTOalOO4ffQ7dIkBEQfj0mrgGcODToPdDvUQyyGZjYK2FY2sGrjyquir6SDcNa
Q4gwatHTu0ygXpyphjtQf5tc3ZCejJ/F0s3olOAS1ahKGfe01fehtwPRROQnCK8=
=GCbY
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DeviceTree updates for 3.13. This is a bit larger pull request than
usual for this cycle with lots of clean-up.
- Cross arch clean-up and consolidation of early DT scanning code.
- Clean-up and removal of arch prom.h headers. Makes arch specific
prom.h optional on all but Sparc.
- Addition of interrupts-extended property for devices connected to
multiple interrupt controllers.
- Refactoring of DT interrupt parsing code in preparation for
deferred probe of interrupts.
- ARM cpu and cpu topology bindings documentation.
- Various DT vendor binding documentation updates"
* tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
powerpc: add missing explicit OF includes for ppc
dt/irq: add empty of_irq_count for !OF_IRQ
dt: disable self-tests for !OF_IRQ
of: irq: Fix interrupt-map entry matching
MIPS: Netlogic: replace early_init_devtree() call
of: Add Panasonic Corporation vendor prefix
of: Add Chunghwa Picture Tubes Ltd. vendor prefix
of: Add AU Optronics Corporation vendor prefix
of/irq: Fix potential buffer overflow
of/irq: Fix bug in interrupt parsing refactor.
of: set dma_mask to point to coherent_dma_mask
of: add vendor prefix for PHYTEC Messtechnik GmbH
DT: sort vendor-prefixes.txt
of: Add vendor prefix for Cadence
of: Add empty for_each_available_child_of_node() macro definition
arm/versatile: Fix versatile irq specifications.
of/irq: create interrupts-extended property
microblaze/pci: Drop PowerPC-ism from irq parsing
of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
of/irq: Use irq_of_parse_and_map()
...
In commit ee23871389 ("genirq: Set irq thread to RT priority on
creation") we moved the assigment of the thread's priority from the
thread's function into __setup_irq(). That function may run in user
context for instance if the user opens an UART node and then driver
calls requests in the ->open() callback. That user may not have
CAP_SYS_NICE and so the irq thread won't run with the SCHED_OTHER
policy.
This patch uses sched_setscheduler_nocheck() so we omit the CAP_SYS_NICE
check which is otherwise required for the SCHED_OTHER policy.
[bigeasy: Rewrite the changelog]
Signed-off-by: Thomas Pfaff <tpfaff@pcs.com>
Cc: Ivo Sieben <meltedpianoman@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1381489240-29626-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All the callers of irq_create_of_mapping() pass the contents of a struct
of_phandle_args structure to the function. Since all the callers already
have an of_phandle_args pointer, why not pass it directly to
irq_create_of_mapping()?
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull irq updates from Thomas Gleixner:
- core fix for missing round up in the generic irq chip implementation
- new irq chip for MOXA SoCs
- a few fixes and cleanups in the irqchip drivers
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Add support for MOXA ART SoCs
genirq: generic chip: Use DIV_ROUND_UP to calculate numchips
irqchip: nvic: Fix wrong num_ct argument for irq_alloc_domain_generic_chips()
irqchip: sun4i: Staticize sun4i_irq_ack()
irqchip: vt8500: Staticize local symbols
Pull MIPS updates from Ralf Baechle:
"MIPS updates:
- All the things that didn't make 3.10.
- Removes the Windriver PPMC platform. Nobody will miss it.
- Remove a workaround from kernel/irq/irqdomain.c which was there
exclusivly for MIPS. Patch by Grant Likely.
- More small improvments for the SEAD 3 platform
- Improvments on the BMIPS / SMP support for the BCM63xx series.
- Various cleanups of dead leftovers.
- Platform support for the Cavium Octeon-based EdgeRouter Lite.
Two large KVM patchsets didn't make it for this pull request because
their respective authors are vacationing"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (124 commits)
MIPS: Kconfig: Add missing MODULES dependency to VPE_LOADER
MIPS: BCM63xx: CLK: Add dummy clk_{set,round}_rate() functions
MIPS: SEAD3: Disable L2 cache on SEAD-3.
MIPS: BCM63xx: Enable second core SMP on BCM6328 if available
MIPS: BCM63xx: Add SMP support to prom.c
MIPS: define write{b,w,l,q}_relaxed
MIPS: Expose missing pci_io{map,unmap} declarations
MIPS: Malta: Update GCMP detection.
Revert "MIPS: make CAC_ADDR and UNCAC_ADDR account for PHYS_OFFSET"
MIPS: APSP: Remove <asm/kspd.h>
SSB: Kconfig: Amend SSB_EMBEDDED dependencies
MIPS: microMIPS: Fix improper definition of ISA exception bit.
MIPS: Don't try to decode microMIPS branch instructions where they cannot exist.
MIPS: Declare emulate_load_store_microMIPS as a static function.
MIPS: Fix typos and cleanup comment
MIPS: Cleanup indentation and whitespace
MIPS: BMIPS: support booting from physical CPU other than 0
MIPS: Only set cpu_has_mmips if SYS_SUPPORTS_MICROMIPS
MIPS: GIC: Fix gic_set_affinity infinite loop
MIPS: Don't save/restore OCTEON wide multiplier state on syscalls.
...
This is the long awaited simplification of irqdomain. It gets rid of the
different types of irq domains and instead both linear and tree mappings
can be supported in a single domain. Doing this removes a lot of special
case code and makes irq domains simpler to understand overall.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR1mnQAAoJEEFnBt12D9kBW7gP/jQpLqFU6PTdsCMX459Ib7fE
JGLQWYjAsqVbVT3H8LZYOZl+ItIbXSQVk+wf0mislnOzxtGxNCxZeJdniFK2RWyz
sGNm/eEZfp9bvm9jlrPE615QE7U7c9fQ5dqETNS4UMMFFmjgrsY4Z82zYH2/Y/jK
YbxFUT6+fep5QaF7lYM7phfrs7FrvnpCoxCio6LO+GgxnMxHSsGJ8L9JcojZ5Qoa
MwlSwsMaQrWJ6PNVbB2YrtlMxjHkNpQ7EV7bNF693PjACVpXf3ItpvQ1k6W/ns06
j6Nysj4nbjsTdWOVh5mY7tPfknZUDs38rLT5s+RbfHqySMIGQXp6CzieYaJ+lxV9
YsItyfZE7h3anebjwRv/9XqLNsDM4KmSIJGcnq+QxNSGxO3rXkItWtpCKaqYdWnN
wI359rSZ//x2ltFzvB14mZJm2j7r9Hh7gH8M5h8W8YTqCbe3oSES3iRhJqnyPxMX
of4PsQaTAicuLwberGdVkKafg9XIsHPoebBzDxssB4nxPrEKIILL4g54x1rV17TZ
kBXurd79FhP4gOQxJHf5uEpdxrQ3Cm/PQpD5CG2rZJrrcV7hfP4QMMFHqqbFSIOu
2h4M4fZX2gsDh3RLboGRGn0GckSPTHiO05yeeBriLzMCVGcZ1JMqAxLUSJpH/UYl
EzFy9RpwnwLoUPi8jUXE
=yJRw
-----END PGP SIGNATURE-----
Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux
Pull irqdomain refactoring from Grant Likely:
"This is the long awaited simplification of irqdomain. It gets rid of
the different types of irq domains and instead both linear and tree
mappings can be supported in a single domain. Doing this removes a
lot of special case code and makes irq domains simpler to understand
overall"
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux:
irq: fix checkpatch error
irqdomain: Include hwirq number in /proc/interrupts
irqdomain: make irq_linear_revmap() a fast path again
irqdomain: remove irq_domain_generate_simple()
irqdomain: Refactor irq_domain_associate_many()
irqdomain: Beef up debugfs output
irqdomain: Clean up aftermath of irq_domain refactoring
irqdomain: Eliminate revmap type
irqdomain: merge linear and tree reverse mappings.
irqdomain: Add a name field
irqdomain: Replace LEGACY mapping with LINEAR
irqdomain: Relax failure path on setting up mappings
The number of interrupts in a domain may be not divisible by the
number of interrupts each chip handles. Integer division may truncate
the result, thus use DIV_ROUND_UP to count numchips.
Seems all users of irq_alloc_domain_generic_chips() in current code do
not have this issue. I just found the issue while reading the code.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1373015592.18252.2.camel@phoenix
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull trivial tree updates from Jiri Kosina:
"The usual stuff from trivial tree"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
treewide: relase -> release
Documentation/cgroups/memory.txt: fix stat file documentation
sysctl/net.txt: delete reference to obsolete 2.4.x kernel
spinlock_api_smp.h: fix preprocessor comments
treewide: Fix typo in printk
doc: device tree: clarify stuff in usage-model.txt.
open firmware: "/aliasas" -> "/aliases"
md: bcache: Fixed a typo with the word 'arithmetic'
irq/generic-chip: fix a few kernel-doc entries
frv: Convert use of typedef ctl_table to struct ctl_table
sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
doc: clk: Fix incorrect wording
Documentation/arm/IXP4xx fix a typo
Documentation/networking/ieee802154 fix a typo
Documentation/DocBook/media/v4l fix a typo
Documentation/video4linux/si476x.txt fix a typo
Documentation/virtual/kvm/api.txt fix a typo
Documentation/early-userspace/README fix a typo
Documentation/video4linux/soc-camera.txt fix a typo
lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
...
Pull core irq changes from Ingo Molnar:
"The main changes:
- generic-irqchip driver additions, cleanups and fixes
- 3 new irqchip drivers: ARMv7-M NVIC, TB10x and Marvell Orion SoCs
- irq_get_trigger_type() simplification and cross-arch cleanup
- various cleanups, simplifications
- documentation updates"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
softirq: Use _RET_IP_
genirq: Add the generic chip to the genirq docbook
genirq: generic-chip: Export some irq_gc_ functions
genirq: Fix can_request_irq() for IRQs without an action
irqchip: exynos-combiner: Staticize combiner_init
irqchip: Add support for ARMv7-M NVIC
irqchip: Add TB10x interrupt controller driver
irqdomain: Use irq_get_trigger_type() to get IRQ flags
MIPS: octeon: Use irq_get_trigger_type() to get IRQ flags
arm: orion: Use irq_get_trigger_type() to get IRQ flags
mfd: stmpe: use irq_get_trigger_type() to get IRQ flags
mfd: twl4030-irq: Use irq_get_trigger_type() to get IRQ flags
gpio: mvebu: Use irq_get_trigger_type() to get IRQ flags
genirq: Add irq_get_trigger_type() to get IRQ flags
genirq: Irqchip: document gcflags arg of irq_alloc_domain_generic_chips
genirq: Set irq thread to RT priority on creation
irqchip: Add support for Marvell Orion SoCs
genirq: Add kerneldoc for irq_disable.
genirq: irqchip: Add mask to block out invalid irqs
genirq: Generic chip: Add linear irq domain support
...
When building imx_v6_v7_defconfig with imx-drm drivers selected as
modules, we get the following build errors:
ERROR: "irq_gc_mask_clr_bit" [drivers/staging/imx-drm/ipu-v3/imx-ipu-v3.ko] undefined!
ERROR: "irq_gc_mask_set_bit" [drivers/staging/imx-drm/ipu-v3/imx-ipu-v3.ko] undefined!
ERROR: "irq_gc_ack_set_bit" [drivers/staging/imx-drm/ipu-v3/imx-ipu-v3.ko] undefined!
Export the required functions to avoid this problem.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Cc: shawn.guo@linaro.org
Cc: kernel@pengutronix.de
Link: http://lkml.kernel.org/r/1372389789-7048-1-git-send-email-festevam@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 02725e7471 ('genirq: Use irq_get/put functions'),
inadvertently changed can_request_irq() to return 0 for IRQs that have
no action. This causes pcibios_lookup_irq() to select only IRQs that
already have an action with IRQF_SHARED set, or to fail if there are
none. Change can_request_irq() to return 1 for IRQs that have no
action (if the first two conditions are met).
Reported-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Tested-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is> (against 3.2)
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: 709647@bugs.debian.org
Cc: stable@vger.kernel.org # 2.6.39+
Link: http://bugs.debian.org/709647
Link: http://lkml.kernel.org/r/1372383630.23847.40.camel@deadeye.wl.decadent.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use irq_get_trigger_type() to get the IRQ trigger type flags
instead calling irqd_get_trigger_type(irq_desc_get_irq_data(virq))
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/1371228049-27080-8-git-send-email-javier.martinez@collabora.co.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 088f40b7b0 ("genirq: Generic chip:
Add linear irq domain support") missed kerneldoc for the gcflags
argument of irq_alloc_domain_generic_chips(). Add it now.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Link: http://lkml.kernel.org/r/1371564513-4327-1-git-send-email-james.hogan@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ERROR: space required before the open parenthesis '('
WARNING: Prefer pr_warn(... to pr_warning(...
Just fix above 2 issue.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Add the hardware interrupt number to the output of /proc/interrupts.
It is often important to have access to the hardware interrupt number because
it identifies exactly how an interrupt signal is wired up to the interrupt
controller. This is especially important when using irq_domains since irq
numbers get dynamically allocated in that case, and have no relation to the
actual hardware number.
Note: This output is currently conditional on whether or not the irq_domain
pointer is set; however hwirq could still be used without irq_domain. It
may be worthwhile to always output the hwirq number regardless of the
domain pointer.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Olof Johansson <olof@lixom.net>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Over the years, irq_linear_revmap() gained tests and checks to make sure
callers were using it safely, which while important, also make it less
of a fast path. After the irqdomain refactoring done recently, it is now
possible to make irq_linear_revmap() a fast path again. This patch moves
irq_linear_revmap() to the header file and makes it a static inline so
that interrupt controller drivers using a linear mapping can decode the
virq from a hwirq in just a couple of instructions.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Originally, irq_domain_associate_many() was designed to unwind the
mapped irqs on a failure of any individual association. However, that
proved to be a problem with certain IRQ controllers. Some of them only
support a subset of irqs, and will fail when attempting to map a
reserved IRQ. In those cases we want to map as many IRQs as possible, so
instead it is better for irq_domain_associate_many() to make a
best-effort attempt to map irqs, but not fail if any or all of them
don't succeed. If a caller really cares about how many irqs got
associated, then it should instead go back and check that all of the
irqs is cares about were mapped.
The original design open-coded the individual association code into the
body of irq_domain_associate_many(), but with no longer needing to
unwind associations, the code becomes simpler to split out
irq_domain_associate() to contain the bulk of the logic, and
irq_domain_associate_many() to be a simple loop wrapper.
This patch also adds a new error check to the associate path to make
sure it isn't called for an irq larger than the controller can handle,
and adds locking so that the irq_domain_mutex is held while setting up a
new association.
v3: Fixup missing change to irq_domain_add_tree()
v2: Fixup x86 warning. irq_domain_associate_many() no longer returns an
error code, but reports errors to the printk log directly. In the
majority of cases we don't actually want to fail if there is a
problem, but rather log it and still try to boot the system.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
irqdomain: Fix flubbed irq_domain_associate_many refactoring
commit d39046ec72, "irqdomain: Refactor irq_domain_associate_many()" was
missing the following hunk which causes a boot failure on anything using
irq_domain_add_tree() to allocate an irq domain.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Cc: Thomas Gleixner <tglx@linutronix.de>,
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
When a threaded irq handler is installed the irq thread is initially
created on normal scheduling priority. Only after the irq thread is
woken up it sets its priority to RT_FIFO MAX_USER_RT_PRIO/2 itself.
This means that interrupts that occur directly after the irq handler
is installed will be handled on a normal scheduling priority instead
of the realtime priority that one would expect.
Fix this by setting the RT priority on creation of the irq_thread.
Signed-off-by: Ivo Sieben <meltedpianoman@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1370254322-17240-1-git-send-email-meltedpianoman@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch increases the amount of output produced by the
irq_domain_mapping debugfs file by first listing all of the registered
irq domains at the beginning of the output, and then by including all
mapped IRQs in the output, not just the active ones. It is very useful
when debugging irqdomain issues to be able to see the entire list of
mapped irqs, not just the ones that happen to be connected to devices.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
After refactoring the irqdomain code, there are a number of API
functions that are merely empty wrappers around core code. Drop those
wrappers out of the C file and replace them with static inlines in the
header.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
The NOMAP irq_domain type is only used by a handful of interrupt
controllers and it unnecessarily complicates the code by adding special
cases on how to look up mappings and different revmap functions are used
for each type which need to validate the correct type is passed to it
before performing the reverse map. Eliminating the revmap_type and
making a single reverse mapping function simplifies the code. It also
shouldn't be any slower than having separate revmap functions because
the type of the revmap needed to be checked anyway.
The linear and tree revmap types were already merged in a previous
patch. This patch rolls the NOMAP or direct mapping behaviour into the
same domain code making is possible for an irq domain to do any mapping
type; linear, tree or direct; and that the mapping will be transparent
to the interrupt controller driver.
With this change, direct mappings will get stored in the linear or tree
mapping for consistency. Reverse mapping from the hwirq to virq will go
through the normal lookup process. However, any controller using a
direct mapping can take advantage of knowing that hwirq==virq for any
mapped interrupts skip doing a revmap lookup when handling IRQs.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Keeping them separate makes irq_domain more complex and adds a lot of
code (as proven by the diffstat). Merging them simplifies the whole
scheme. This change makes it so both the tree and linear methods can be
used by the same irq_domain instance. If the hwirq is less than the
->linear_size, then the linear map is used to reverse map the hwirq.
Otherwise the radix tree is used. The test for which map to use is no
more expensive that the existing code, so the performance of fast path
is preserved.
It also means that complex interrupt controllers can use both the
linear map and a tree in the same domain. This may be useful for an
interrupt controller with a base set of core irqs and a large number
of GPIOs which might be used as irqs. The linear map could cover the
core irqs, and the tree used for thas irqs. The linear map could
cover the core irqs, and the tree used for the gpios.
v2: Drop reorganization of revmap data
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
This patch adds a name field to the irq_domain structure to help mere
mortals understand the mappings between irq domains and virqs. It also
converts a number of places that have open-coded some kind of fudging
an irqdomain name to use the new field. This means a more consistent
display of names in irq domain log messages and debugfs output.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
The LEGACY mapping unnecessarily complicates the irqdomain code and
can easily be implemented with a linear mapping. By ripping it out
and replacing it with the LINEAR mapping the object size of
irqdomain.c shrinks by about 330 bytes (ARMv7) which offsets the
additional allocation required by the linear map. It also makes it
possible for current LEGACY map users to pre-allocate irq_descs for a
subset of the hwirqs and dynamically allocate the rest as needed.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Commit 98aa468e, "irqdomain: Support for static IRQ mapping and
association" introduced an API for directly associating blocks of hwirqs
to linux irqs. However, if any irq in that block failed to map (say if
the mapping functions returns an error because the irq is already
mapped) then the whole thing will fail and roll back. This is probably
too aggressive since there are valid reasons why a mapping may fail.
ie. Firmware may have a particular IRQ marked as unusable.
This patch drops the error path out of irq_domain_associate(). If a
mapping fails, then it is simply skipped. There is no reason to fail the
entire allocation.
v2: Still output an information message on failed mappings and make sure
attempted mapping gets cleared out of the irq_data structure.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
The first_irq needs to be zero to get a linear domain and that
comes with special semantics. We want to simplify this going
forward but some documentation never hurts.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Since irq_data may be NULL, if so, we WARN_ON(), and continue, 'hwirq'
which related with 'irq_data' has to initialize later, or it will cause
issue.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
All other irq_domain_add_* functions are exported already, and apparently
this one got left out by mistake, which causes build errors for ARM
allmodconfig kernels:
ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-rcar.ko] undefined!
ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-em.ko] undefined!
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Some controllers have irqs that aren't wired up and must never be used.
For the generic chip attached to an irq_domain this provides a mask that
can be used to block out particular irqs so that they never get mapped.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Link: http://lkml.kernel.org/r/1369793454-19197-2-git-send-email-grant.likely@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Provide infrastructure for irq chip implementations which work on
linear irq domains.
- Interface to allocate multiple generic chips which are associated to
the irq domain.
- Interface to get the generic chip pointer for a particular hardware
interrupt in the domain.
- irq domain mapping function to install the chip for a particular
interrupt.
Note: This lacks a removal function for now.
[ Sebastian Hesselbarth: Mask cache and pointer math fixups ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Rob Landley <rob@landley.net>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: http://lkml.kernel.org/r/20130506142539.450634298@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Preparatory patch for linear interrupt domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Rob Landley <rob@landley.net>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: http://lkml.kernel.org/r/20130506142539.377017672@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some chips have weird bit mask access patterns instead of the linear
you expect. Allow them to calculate the cached mask themself.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Rob Landley <rob@landley.net>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: http://lkml.kernel.org/r/20130506142539.302898834@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cache the per irq bit mask instead of recalculating it over and over.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Rob Landley <rob@landley.net>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: http://lkml.kernel.org/r/20130506142539.227119865@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There are cases where all irq_chip_type instances have separate mask
registers, making a shared mask register cache unsuitable for the
purpose.
Introduce a new flag IRQ_GC_MASK_CACHE_PER_TYPE. If set, point the per
chip mask pointer to the per chip private mask cache instead.
[ tglx: Simplified code, renamed flag and massaged changelog ]
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Joey Oravec <joravec@drewtech.com>
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Holger Brunck <Holger.Brunck@keymile.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Simon Guinot <simon@sequanux.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Rob Landley <rob@landley.net>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Link: http://lkml.kernel.org/r/20130506142539.152569748@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Today the same interrupt mask cache (stored within struct irq_chip_generic)
is shared between all the irq_chip_type instances. As there are instances
where each irq_chip_type uses a distinct mask register (as it is the case
for Orion SoCs), sharing a single mask cache may be incorrect.
So add a distinct pointer for each irq_chip_type, which for now
points to the original mask register within irq_chip_generic.
So no functional changes here.
[ tglx: Minor cosmetic tweaks ]
Reported-by: Joey Oravec <joravec@drewtech.com>
Signed-off-by: Simon Guinot <sguinot@lacie.com>
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Holger Brunck <Holger.Brunck@keymile.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Simon Guinot <simon@sequanux.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Rob Landley <rob@landley.net>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Link: http://lkml.kernel.org/r/20130506142539.082226607@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since we already have an irq_data_get_chip_type() function which returns
a pointer to irq_chip_type, use that instead of cur_regs().
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Joey Oravec <joravec@drewtech.com>
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Holger Brunck <Holger.Brunck@keymile.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Simon Guinot <simon@sequanux.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jean-Francois Moine <moinejf@free.fr>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Rob Landley <rob@landley.net>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Link: http://lkml.kernel.org/r/20130506142539.010164766@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some interrupt controllers refuse to map interrupts marked as
"protected" by firwmare. Since we try to map everyting in the
device-tree on some platforms, we end up with a lot of nasty
WARN's in the boot log for what is a normal situation on those
machines.
This defines a specific return code (-EPERM) from the host map()
callback which cause irqdomain to fail silently.
MPIC is updated to return this when hitting a protected source
printing only a single line message for diagnostic purposes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The only part of proc_dir_entry the code outside of fs/proc
really cares about is PDE(inode)->data. Provide a helper
for that; static inline for now, eventually will be moved
to fs/proc, along with the knowledge of struct proc_dir_entry
layout.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
Pull x86/apic changes from Ingo Molnar:
"Main changes:
- Multiple MSI support added to the APIC, PCI and AHCI code - acked
by all relevant maintainers, by Alexander Gordeev.
The advantage is that multiple AHCI ports can have multiple MSI
irqs assigned, and can thus spread to multiple CPUs.
[ Drivers can make use of this new facility via the
pci_enable_msi_block_auto() method ]
- x86 IOAPIC code from interrupt remapping cleanups from Joerg
Roedel:
These patches move all interrupt remapping specific checks out of
the x86 core code and replaces the respective call-sites with
function pointers. As a result the interrupt remapping code is
better abstraced from x86 core interrupt handling code.
- Various smaller improvements, fixes and cleanups."
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess
x86, kvm: Fix intialization warnings in kvm.c
x86, irq: Move irq_remapped out of x86 core code
x86, io_apic: Introduce eoi_ioapic_pin call-back
x86, msi: Introduce x86_msi.compose_msi_msg call-back
x86, irq: Introduce setup_remapped_irq()
x86, irq: Move irq_remapped() check into free_remapped_irq
x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq()
x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core
x86, irq: Add data structure to keep AMD specific irq remapping information
x86, irq: Move irq_remapping_enabled declaration to iommu code
x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin
x86, io_apic: Move irq_remapping_enabled checks out of check_timer()
x86, io_apic: Convert setup_ioapic_entry to function pointer
x86, io_apic: Introduce set_affinity function pointer
x86, msi: Use IRQ remapping specific setup_msi_irqs routine
x86, hpet: Introduce x86_msi_ops.setup_hpet_msi
x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging
x86, io_apic: Introduce x86_io_apic_ops.disable()
x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume
...
Pull scheduler changes from Ingo Molnar:
"Main changes:
- scheduler side full-dynticks (user-space execution is undisturbed
and receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready, from Frederic
Weisbecker.
- Initial sched.h split-up changes, by Clark Williams
- select_idle_sibling() performance improvement by Mike Galbraith:
" 1 tbench pair (worst case) in a 10 core + SMT package:
pre 15.22 MB/sec 1 procs
post 252.01 MB/sec 1 procs "
- sched_rr_get_interval() ABI fix/change. We think this detail is not
used by apps (so it's not an ABI in practice), but lets keep it
under observation.
- misc RT scheduling cleanups, optimizations"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
cputime: Remove irqsave from seqlock readers
sched, powerpc: Fix sched.h split-up build failure
cputime: Restore CPU_ACCOUNTING config defaults for PPC64
sched/rt: Move rt specific bits into new header file
sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
sched: Move sched.h sysctl bits into separate header
sched: Fix signedness bug in yield_to()
sched: Fix select_idle_sibling() bouncing cow syndrome
sched/rt: Further simplify pick_rt_task()
sched/rt: Do not account zero delta_exec in update_curr_rt()
cputime: Safely read cputime of full dynticks CPUs
kvm: Prepare to add generic guest entry/exit callbacks
cputime: Use accessors to read task cputime stats
cputime: Allow dynamic switch between tick/virtual based cputime accounting
cputime: Generic on-demand virtual cputime accounting
cputime: Move default nsecs_to_cputime() to jiffies based cputime file
cputime: Librarize per nsecs resolution cputime definitions
cputime: Avoid multiplication overflow on utime scaling
context_tracking: Export context state for generic vtime
...
Fix up conflict in kernel/context_tracking.c due to comment additions.
Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The MSI specification has several constraints in comparison with
MSI-X, most notable of them is the inability to configure MSIs
independently. As a result, it is impossible to dispatch
interrupts from different queues to different CPUs. This is
largely devalues the support of multiple MSIs in SMP systems.
Also, a necessity to allocate a contiguous block of vector
numbers for devices capable of multiple MSIs might cause a
considerable pressure on x86 interrupt vector allocator and
could lead to fragmentation of the interrupt vectors space.
This patch overcomes both drawbacks in presense of IRQ remapping
and lets devices take advantage of multiple queues and per-IRQ
affinity assignments.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c8bd86ff56b5fc118257436768aaa04489ac0a4c.1353324359.git.agordeev@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJQ+MDOAAoJEHm+PkMAQRiGFL8H/39bXl3xs/ZkBct3ciICtaAt
UeRcdjvBdUrObQmmw8FEFJA62Az74g1UYyNyc0yVJoB6DwL2oawXhGXQXqDXMBMs
xJa/dpx41gmcrvsLOJ1v2Tmokk32hgdpLNWwzcn36xZMsrhYUxjxPhVNK4aDbwrW
3uzBj83dK1rvi2PEz3hgHizRhmdD1cNgLXKrczNbT6y6rLp6Jkng2q5UhNfaZUvE
A7Ub5/nW+X6gb1PctAjulz2CoySq0FbH4viUc+zLj3NN0L9ImeJvpREauE2QPVgJ
a8Sg3E6/cL4wIXBEV1E1KapVTd1XIEGa8yB5X6q8HWNKI8nf7Kwuj65VgvwJIMc=
=+VR8
-----END PGP SIGNATURE-----
Merge tag 'v3.8-rc4' into irq/core
Merge Linux 3.8-rc4 before pulling in new commits - we were on an old v3.7 base.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The array check is useless so remove it.
[akpm@linux-foundation.org: remove comment, per David]
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull irq fixes from Ingo Molnar:
"Affinity fixes and a nested threaded IRQ handling fix."
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Always force thread affinity
irq: Set CPU affinity right on thread creation
genirq: Provide means to retrigger parent
commit 52553ddf(genirq: fix regression in irqfixup, irqpoll)
introduced a potential deadlock by calling the action handler with the
irq descriptor lock held.
Remove the call and let the handling code run even for an interrupt
where only a single action is registered. That matches the goal of
the above commit and avoids the deadlock.
Document the confusing action = desc->action reload in the handling
loop while at it.
Reported-and-tested-by: "Wang, Warner" <warner.wang@hp.com>
Tested-by: Edward Donovan <edward.donovan@numble.net>
Cc: "Wang, Song-Bo (Stoney)" <song-bo.wang@hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In the simple irqdomain: don't shout warnings to the user,
there is no point. An informational print is sufficient.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Sankara reported that the genirq core code fails to adjust the
affinity of an interrupt thread in several cases:
1) On request/setup_irq() the call to setup_affinity() happens before
the new action is registered, so the new thread is not notified.
2) For secondary shared interrupts nothing notifies the new thread to
change its affinity.
3) Interrupts which have the IRQ_NO_BALANCE flag set are not moving
the thread either.
Fix this by setting the thread affinity flag right on thread creation
time. This ensures that under all circumstances the thread moves to
the right place. Requires a check in irq_thread_check_affinity for an
existing affinity mask (CONFIG_CPU_MASK_OFFSTACK=y)
Reported-and-tested-by: Sankara Muthukrishnan <sankara.m@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1209041738200.2754@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As irq_thread_check_affinity is called ONLY inside the while loop in
the irq thread, the core affinity is set only when an interrupt
occurs. This patch sets the core affinity right after the irq thread
is created and before it waits for interrupts. In real-tiime targets
that do not typically change the core affinity of irqs during
run-time, this patch will save additional latency of an irq thread in
setting the core affinity during the first interrupt occurrence for
that irq.
Signed-off-by: Sankara S Muthukrishnan <sankara.m@ni.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/CAFQPvXeVZ858WFYimEU5uvLNxLDd6bJMmqWihFmbCf3ntokz0A@mail.gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Attempts to retrigger nested threaded IRQs currently fail because they
have no primary handler. In order to support retrigger of nested
IRQs, the parent IRQ needs to be retriggered.
To fix, when an IRQ needs to be resent, if the interrupt has a parent
IRQ and runs in the context of the parent IRQ, then resend the parent.
Also, handle_nested_irq() needs to clear the replay flag like the
other handlers, otherwise check_irq_resend() will set it and it will
never be cleared. Without clearing, it results in the first resend
working fine, but check_irq_resend() returning early on subsequent
resends because the replay flag is still set.
Problem discovered on ARM/OMAP platforms where a nested IRQ that's
also a wakeup IRQ happens late in suspend and needed to be retriggered
during the resume process.
[khilman@ti.com: changelog edits, clear IRQS_REPLAY in handle_nested_irq()]
Reported-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1350425269-11489-1-git-send-email-khilman@deeprootsystems.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently we rely on all IRQ chip instances to dynamically
allocate their IRQ descriptors unless they use the linear
IRQ domain. So for irqdomain_add_legacy() and
irqdomain_add_simple() the caller need to make sure that
descriptors are allocated.
Let's slightly augment the yet unused irqdomain_add_simple()
to also allocate descriptors as a means to simplify usage
and avoid code duplication throughout the kernel.
We warn if descriptors cannot be allocated, e.g. if a
platform has the bad habit of hogging descriptors at boot
time.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Export dummy_irq_chip to modules to allow them to do things such as
irq_set_chip_and_handler(virq,
&dummy_irq_chip,
handle_level_irq);
This fixes
ERROR: "dummy_irq_chip" [drivers/gpio/gpio-pcf857x.ko] undefined!
when gpio-pcf857x.c is being built as a module.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Greg KH <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/871ujstrp6.wl%25kuninori.morimoto.gx@renesas.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export irq_set_chip_and_handler_name() to modules to allow them to
do things such as
irq_set_chip_and_handler(....);
This fixes
ERROR: "irq_set_chip_and_handler_name" \
[drivers/gpio/gpio-pcf857x.ko] undefined!
when gpio-pcf857x.c is being built as a module.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Greg KH <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/873948trpk.wl%25kuninori.morimoto.gx@renesas.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull irq fix from Ingo Molnar.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Allow irq chips to mark themself oneshot safe
Round of refactoring and enhancements to irq_domain infrastructure. This
series starts the process of simplifying irqdomain. The ultimate goal is
to merge LEGACY, LINEAR and TREE mappings into a single system, but had
to back off from that after some last minute bugs. Instead it mainly
reorganizes the code and ensures that the reverse map gets populated
when the irq is mapped instead of the first time it is looked up.
Merging of the irq_domain types is deferred to v3.7
In other news, this series adds helpers for creating static mappings on
a linear or tree mapping.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQGKKkAAoJEEFnBt12D9kB59gQAJnTjrihej1tr0OEkffIthGK
RyVI/DMo0jMgLs4K/rIo3Y+PdTSsNYd8x4R7ln8O7rNRQn8W6jE6NQgoMh51EvNc
FAltmTsBldq6hUNuz2FEnbmojBP4QklTzL8bAiXtX5EufWQsgMsP4guOuHXLCjEV
CkWYVk/slXEWJ8yYJc6GKVRvL+CNeiXVCTcOsYA0CI3ofN7O0rd+YAL314CRllIc
e5uARbWM+s9FJ/eXwCZP4+3jCmdI/CHJb284WldMc/mBD8Rbiqpb4kH6AZI+TH2O
CyiNEPWs6FG5eJPTID7HrOarXGzwYq/pvv8iG7Mh8NiKSae1C1HdkHelCjbLQ+pU
POya0fWF1Gvzlmw0gHik86dqaKjwb29btjj7SFg8KnQExWn2ifhsY70mM9wCTo3s
cwcQlssDIsARE83nttTFCoV/iAWh9AvTxafrXu/+9OKTjpsYlC8kgzdVjq5aAxON
JaAUK1OduTWRsd1TabKlh6naRXr9nRcLKikwKri2oYVKkj97wahBuib4ffzAcNqz
VklRBxTH6M+dz/t5NpcVyLXJpqzTN++QNdTAmeQG6LOnHJL4tpFTsx5sMa7ghmzX
LNpmp/AkVfP0MT7Drf0FUUx6iFA7sjANYzcepUVDrPGKHx0E3LyqbG5JKcC5LgM6
+UIoKAktF3vY7pdZJL9z
=ZUF/
-----END PGP SIGNATURE-----
Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull irqdomain changes from Grant Likely:
"Round of refactoring and enhancements to irq_domain infrastructure.
This series starts the process of simplifying irqdomain. The ultimate
goal is to merge LEGACY, LINEAR and TREE mappings into a single
system, but had to back off from that after some last minute bugs.
Instead it mainly reorganizes the code and ensures that the reverse
map gets populated when the irq is mapped instead of the first time it
is looked up.
Merging of the irq_domain types is deferred to v3.7
In other news, this series adds helpers for creating static mappings
on a linear or tree mapping."
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
irqdomain: Improve diagnostics when a domain mapping fails
irqdomain: eliminate slow-path revmap lookups
irqdomain: Fix irq_create_direct_mapping() to test irq_domain type.
irqdomain: Eliminate dedicated radix lookup functions
irqdomain: Support for static IRQ mapping and association.
irqdomain: Always update revmap when setting up a virq
irqdomain: Split disassociating code into separate function
irq_domain: correct a minor wrong comment for linear revmap
irq_domain: Standardise legacy/linear domain selection
irqdomain: Make ops->map hook optional
irqdomain: Remove unnecessary test for IRQ_DOMAIN_MAP_LEGACY
irqdomain: Simple NUMA awareness.
devicetree: add helper inline for retrieving a node's full name
from interrupts for /dev/random and /dev/urandom. The goal is to
addresses weaknesses discussed in the paper "Mining your Ps and Qs:
Detection of Widespread Weak Keys in Network Devices", by Nadia
Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman, which will
be published in the Proceedings of the 21st Usenix Security Symposium,
August 2012. (See https://factorable.net for more information and an
extended version of the paper.)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJQF/0DAAoJENNvdpvBGATwIowQAOep9QKtLrBvb2lwIRVmeiy8
lRf7V/tYZnz4FePbR0W92JQfKYkCV8yyOO0bmeRzWL3v4m+lRwDTSyA1DDyQMoH+
LOMzvDKSLJMSXTXdSOIr1WYACphViCR/9CrbMBCKSkYfZLJ1MdaEDxT3rcpTGD0T
6iknUweiSkHHhkerU5yQL7FKzD5kYUe0hsF47w7QVlHRHJsW2fsZqkFoh+RpnhNw
03u+djxNGBo9qV81vZ9D1b0vA9uRlEjoWOOEG2XE4M2iq6TUySueA72dQnCwunfi
3kG/u1Swv2dgq6aRrP3H7zdwhYSourGxziu3jNhEKwKEohrxYY7xjNX3RVeTqP67
AzlKsOTWpRLIDrzjSLlb8VxRQiZewu8Unex3e1G+eo20sbcIObHGrxNp7K00zZvd
QZiMHhOwItwFTe4lBO+XbqH2JKbL9/uJmwh5EipMpQTraKO9E6N3CJiUHjzBLo2K
iGDZxRMKf4gVJRwDxbbP6D70JPVu8ZJ09XVIpsXQ3Z1xNqaMF0QdCmP3ty56q1o0
NvkSXxPKrijZs8Sk0rVDqnJ3ll8PuDnXMv5eDtL42VT818I5WxESn9djjwEanGv0
TYxbFub/NRxmPEE5B2Js5FBpqsLf5f282OSMeS/5WLBbnHJR1OoPoAhGVpHvxntC
bi5FC1OolqhvzVIdsqgt
=u7KM
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random subsystem patches from Ted Ts'o:
"This patch series contains a major revamp of how we collect entropy
from interrupts for /dev/random and /dev/urandom.
The goal is to addresses weaknesses discussed in the paper "Mining
your Ps and Qs: Detection of Widespread Weak Keys in Network Devices",
by Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman,
which will be published in the Proceedings of the 21st Usenix Security
Symposium, August 2012. (See https://factorable.net for more
information and an extended version of the paper.)"
Fix up trivial conflicts due to nearby changes in
drivers/{mfd/ab3100-core.c, usb/gadget/omap_udc.c}
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: (33 commits)
random: mix in architectural randomness in extract_buf()
dmi: Feed DMI table to /dev/random driver
random: Add comment to random_initialize()
random: final removal of IRQF_SAMPLE_RANDOM
um: remove IRQF_SAMPLE_RANDOM which is now a no-op
sparc/ldc: remove IRQF_SAMPLE_RANDOM which is now a no-op
[ARM] pxa: remove IRQF_SAMPLE_RANDOM which is now a no-op
board-palmz71: remove IRQF_SAMPLE_RANDOM which is now a no-op
isp1301_omap: remove IRQF_SAMPLE_RANDOM which is now a no-op
pxa25x_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op
omap_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op
goku_udc: remove IRQF_SAMPLE_RANDOM which was commented out
uartlite: remove IRQF_SAMPLE_RANDOM which is now a no-op
drivers: hv: remove IRQF_SAMPLE_RANDOM which is now a no-op
xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op
n2_crypto: remove IRQF_SAMPLE_RANDOM which is now a no-op
pda_power: remove IRQF_SAMPLE_RANDOM which is now a no-op
i2c-pmcmsp: remove IRQF_SAMPLE_RANDOM which is now a no-op
input/serio/hp_sdc.c: remove IRQF_SAMPLE_RANDOM which is now a no-op
mfd: remove IRQF_SAMPLE_RANDOM which is now a no-op
...
Some interrupt chips like MSI are oneshot safe by implementation. For
those interrupts we can avoid the mask/unmask sequence for threaded
interrupt handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1207132056540.32033@ionos
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Jan Kiszka <jan.kiszka@web.de>
When the map operation fails log the error code we get and add a WARN_ON()
so we get a backtrace (which should help work out which interrupt is the
source of the issue).
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
With the current state of irq_domain, the reverse map is always updated
when new IRQs get mapped. This means that the irq_find_mapping() function
can be simplified to execute the revmap lookup functions unconditionally
This patch adds lookup functions for the revmaps that don't yet have one
and removes the slow path lookup code path.
v8: Broke out unrelated changes into separate patches. Rebased on Paul's irq
association patches.
v7: Rebased to irqdomain/next for v3.4 and applied before the removal of 'hint'
v6: Remove the slow path entirely. The only place where the slow path
could get called is for a linear mapping if the hwirq number is larger
than the linear revmap size. There shouldn't be any interrupt
controllers that do that.
v5: rewrite to not use a ->revmap() callback. It is simpler, smaller,
safer and faster to open code each of the revmap lookups directly into
irq_find_mapping() via a switch statement.
v4: Fix build failure on incorrect variable reference.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Rob Herring <rob.herring@calxeda.com>
A small set of changes for devicetree:
- Couple of Documentation fixes
- Addition of new helper function of_node_full_name
- Improve of_parse_phandle_with_args return values
- Some NULL related sparse fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJQDwsgAAoJEMhvYp4jgsXiuwUH/Ri6ZSnqHcz4Wa/X4FxvNc3I
3Xelo/Vt3WLYue3s/+OYiM5FK9+KH8T6x+U79Q4p7vePcfUh6GJII0AUbMeRghkS
m3FjNd5syzYNJlnDnqdngQYRDpaz8U/SyftjXyMPjJ1VWiyLx/EJQUkj1EEwDLe/
ZVabppnco3Y6OJpFuETONNvXx5mE7xq86isW5+aYmviMkWSMMwJPf8qofLJ78Dh5
OAhWuCPRDooz548+Wkabt90qHjF6FU43w5fU7zZW26NT39ptppcbZ2bAXcTYqIIq
sATp5YSitvwFqO2c1mA/drZ9nrgxDPCaw3qCDyiMdcbWgXqDirz2x7q1iauVHF4=
=5TZ/
-----END PGP SIGNATURE-----
Merge tag 'dt-for-3.6' of git://sources.calxeda.com/kernel/linux
Pull devicetree updates from Rob Herring:
"A small set of changes for devicetree:
- Couple of Documentation fixes
- Addition of new helper function of_node_full_name
- Improve of_parse_phandle_with_args return values
- Some NULL related sparse fixes"
Grant's busy packing.
* tag 'dt-for-3.6' of git://sources.calxeda.com/kernel/linux:
of: mtd: nuke useless const qualifier
devicetree: add helper inline for retrieving a node's full name
of: return -ENOENT when no property
usage-model.txt: fix typo machine_init->init_machine
of: Fix null pointer related warnings in base.c file
LED: Fix missing semicolon in OF documentation
of: fix a few typos in the binding documentation
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
With the new interrupt sampling system, we are no longer using the
timer_rand_state structure in the irq descriptor, so we can stop
initializing it now.
[ Merged in fixes from Sedat to find some last missing references to
rand_initialize_irq() ]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
We've been moving away from add_interrupt_randomness() for various
reasons: it's too expensive to do on every interrupt, and flooding the
CPU with interrupts could theoretically cause bogus floods of entropy
from a somewhat externally controllable source.
This solves both problems by limiting the actual randomness addition
to just once a second or after 64 interrupts, whicever comes first.
During that time, the interrupt cycle data is buffered up in a per-cpu
pool. Also, we make sure the the nonblocking pool used by urandom is
initialized before we start feeding the normal input pool. This
assures that /dev/urandom is returning unpredictable data as soon as
possible.
(Based on an original patch by Linus, but significantly modified by
tytso.)
Tested-by: Eric Wustrow <ewust@umich.edu>
Reported-by: Eric Wustrow <ewust@umich.edu>
Reported-by: Nadia Heninger <nadiah@cs.ucsd.edu>
Reported-by: Zakir Durumeric <zakir@umich.edu>
Reported-by: J. Alex Halderman <jhalderm@umich.edu>.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
irq_create_direct_mapping can only be used with the NOMAP type. Make
the function test to ensure it is passed the correct type of
irq_domain.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
In preparation to remove the slow revmap path, eliminate the public
radix revmap lookup functions. This simplifies the code and makes the
slowpath removal patch a lot simpler.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
This adds a new strict mapping API for supporting creation of linux IRQs
at existing positions within the domain. The new routines are as follows:
For dynamic allocation and insertion to specified ranges:
- irq_create_identity_mapping()
- irq_create_strict_mappings()
These will allocate and associate a range of linux IRQs at the specified
location. This can be used by controllers that have their own static linux IRQ
definitions to map a hwirq range to, as well as for platforms that wish to
establish 1:1 identity mapping between linux and hwirq space.
For insertion to specified ranges by platforms that do their own irq_desc
management:
- irq_domain_associate()
- irq_domain_associate_many()
These in turn call back in to the domain's ->map() routine, for further
processing by the platform. Disassociation of IRQs get handled through
irq_dispose_mapping() as normal.
With these in place it should be possible to begin migration of legacy IRQ
domains to linear ones, without requiring special handling for static vs
dynamic IRQ definitions in DT vs non-DT paths. This also makes it possible
for domains with static mappings to adopt whichever tree model best fits
their needs, rather than simply restricting them to linear revmaps.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
[grant.likely: Reorganized irq_domain_associate{,_many} to have all logic in one place]
[grant.likely: Add error checking for unallocated irq_descs at associate time]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
At irq_setup_virq() time all of the data needed to update the reverse
map is available, but the current code ignores it and relies upon the
slow path to insert revmap records. This patch adds revmap updating
to the setup path so the slow path will no longer be necessary.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
This patch moves the irq disassociation code out into a separate
function in preparation to extend irq_setup_virq to handle multiple
irqs and rename it for use by interrupt controller drivers. The new
function will be used by irq_setup_virq() in its error path.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
The revmap type should be linear for irq_domain_add_linear function.
Signed-off-by: Dong Aisheng <dong.aisheng@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
A large proportion of interrupt controllers that support legacy mappings
do so because non-DT systems need to use fixed IRQ numbers when registering
devices via buses but can otherwise use a linear mapping. The interrupt
controller itself typically is not affected by the mapping used and best
practice is to use a linear mapping where possible so drivers frequently
select at runtime depending on if a legacy range has been allocated to
them.
Standardise this behaviour by providing irq_domain_register_simple() which
will allocate a linear mapping unless a positive first_irq is provided in
which case it will fall back to a legacy mapping. This helps make best
practice for irq_domain adoption clearer.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The pattern (np ? np->full_name : "<none>") is rather common in the
kernel, but can also make for quite long lines. This patch adds a new
inline function, of_node_full_name() so that the test for a valid node
pointer doesn't need to be open coded at all call sites.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
There isn't a really compelling reason to force ->map to be populated,
so allow it to be left unset.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Where irq_domain_associate() is called in irq_create_mapping, there is
no need to test for IRQ_DOMAIN_MAP_LEGACY because it is already tested
for earlier in the routine.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
While common irqdesc allocation is node aware, the irqdomain code is not.
Presently we observe a number of regressions/inconsistencies on
NUMA-capable platforms:
- Platforms using irqdomains with legacy mappings, where the
irq_descs are allocated node-local and the irqdomain data
structure is not.
- Drivers implementing irqdomains will lose node locality
regardless of the underlying struct device's node id.
This plugs in NUMA node id proliferation across the various allocation
callsites by way of_node_to_nid() node lookup. While of_node_to_nid()
does the right thing for OF-capable platforms it doesn't presently handle
the non-DT case. This is trivially dealt with by simply wraping in to
numa_node_id() unconditionally.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The pattern (np ? np->full_name : "<none>") is rather common in the
kernel, but can also make for quite long lines. This patch adds a new
inline function, of_node_full_name() so that the test for a valid node
pointer doesn't need to be open coded at all call sites.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Pull irq and smpboot updates from Thomas Gleixner:
"Just cleanup patches with no functional change and a fix for suspend
issues."
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Introduce irq_do_set_affinity() to reduce duplicated code
genirq: Add IRQS_PENDING for nested and simple irq
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smpboot, idle: Fix comment mismatch over idle_threads_init()
smpboot, idle: Optimize calls to smp_processor_id() in idle_threads_init()
Pull second pile of signal handling patches from Al Viro:
"This one is just task_work_add() series + remaining prereqs for it.
There probably will be another pull request from that tree this
cycle - at least for helpers, to get them out of the way for per-arch
fixes remaining in the tree."
Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
keys: kill task_struct->replacement_session_keyring
keys: kill the dummy key_replace_session_keyring()
keys: change keyctl_session_to_parent() to use task_work_add()
genirq: reimplement exit_irq_thread() hook via task_work_add()
task_work_add: generic process-context callbacks
avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
parisc: need to check NOTIFY_RESUME when exiting from syscall
move key_repace_session_keyring() into tracehook_notify_resume()
TIF_NOTIFY_RESUME is defined on all targets now
Use the module-wide pr_fmt() mechanism rather than open-coding "genirq: "
everywhere.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minor changes and fixups for irqdomain infrastructure. Most
important change adds ability to remove registered irqdomain.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPvpKWAAoJEEFnBt12D9kB6OIP/RvT1nU223w+hQMr8RKqhjZQ
GbhuZ1o2lPLAxoGYuCQqapeH542NuuIQvviM2MKLn6u9ev3fKeFZFdF4YTpE8fwo
ljIXJnjJ9reu2yCrAsIVZtIHJ+hXs07h1QjRwqFWGN/y57BRhxXI6xm1+SEAhay9
gtRnfQ7eCpi665zYoNBIoNxKeESrdiwgHKUsbkNELmbTwvx+Sc9AWsYMqtO0qRAG
JrOFCIOu3bqEcshDhM4MLZGVEmlzVZR4zUbQrY0chj5Y2c383YUyg+l+tN0NjPsF
L3MfgIu8WFim/edQM294dLTrZjqicg4xF5uRxjx5hY2EESoLKdf1pUady673M7ux
6cBcczMKJVQI7P2do7i8F0VwATkokytcP289hqYzJrxMHeXa48ccEZfiQt6xuLwc
JwWAZu3BxeBMzZNxQRNX39ImSsP5wnsfZdzUBTAFAcV1ZEYgSrGJYAw+pOz18UXD
YwwKcnNKwQHgNIkSLjgputT9VSuJsS09xErGeZAqkj7f6oxGlql9ElhUvgrBT8qg
eiTjgIkArRY6RG8+c2mMeKE10fN822jWK9kWQdttIPa++cSBHo/Yxt8PlClvEvH8
qjyD4nIG2dhwG8RtMc74IPDyHSHRW5JGXHPg37IoTPzurcsnzuNMSzlXVw2hb39d
pxhCVNxe1r4GH6NQFOMg
=K3Pg
-----END PGP SIGNATURE-----
Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull irqdomain changes from Grant Likely:
"Minor changes and fixups for irqdomain infrastructure. The most
important change adds the ability to remove a registered irqdomain."
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
irqdomain: Document size parameter of irq_domain_add_linear()
irqdomain: trivial pr_fmt conversion.
irqdomain: Kill off duplicate definitions.
irqdomain: Make irq_domain_simple_map() static.
irqdomain: Export remaining public API symbols.
irqdomain: Support removal of IRQ domains.
All invocations of chip->irq_set_affinity() are doing the same return
value checks. Let them all use a common function.
[ tglx: removed the silly likely while at it ]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Keping Chen <chenkeping@huawei.com>
Link: http://lkml.kernel.org/r/1333120296-13563-3-git-send-email-jiang.liu@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Every interrupt which is an active wakeup source needs the ability to
abort suspend if there is a pending irq. Right now only edge and level
irqs can do that.
|
+---------+
| INTC |
+---------+
| GPIO_IRQ
+------------+
| gpio-exp |
+------------+
| |
GPIO0_IRQ GPIO1_IRQ
In the above diagram, gpio expander has irq number GPIO_IRQ, it is
connected with two sub GPIO pins, GPIO0 and GPIO1.
During suspend, we set IRQF_NO_SUSPEND for GPIO_IRQ so that gpio
expander driver can handle the sub irq GPIO0_IRQ and GPIO1_IRQ, and
these two irqs themselves can further be handled by simple or nested
irq in some drivers(typically gpio and mfd driver). If they are used
as wakeup sources during suspend, we want them to be able to abort
suspend too.
Setting IRQS_PENDING flag in handle_nested_irq() and handle_simple_irq()
when the irq is disabled allows check_wakeup_irqs() to identify such
irqs as source for aborting suspend.
Signed-off-by: Ning Jiang <ning.n.jiang@gmail.com>
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/CAH3Oq6T905%2B3fkF43NAMMFvJvq7dsk_so6T2vQ8ZJrA5xiU3YA@mail.gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
exit_irq_thread() and task->irq_thread are needed to handle the unexpected
(and unlikely) exit of irq-thread.
We can use task_work instead and make this all private to
kernel/irq/manage.c, cleanup plus micro-optimization.
1. rename exit_irq_thread() to irq_thread_dtor(), make it
static, and move it up before irq_thread().
2. change irq_thread() to do task_work_add(irq_thread_dtor)
at the start and task_work_cancel() before return.
tracehook_notify_resume() can never play with kthreads,
only do_exit()->exit_task_work() can call the callback
and this is what we want.
3. remove task_struct->irq_thread and the special hook
in do_exit().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Smith <dsmith@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull UML updates from Richard Weinberger:
"Most changes are bug fixes and cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: missing checks of __put_user()/__get_user() return values
um: stub_rt_sigsuspend isn't needed these days anymore
um/x86: merge (and trim) 32- and 64-bit variants of ptrace.h
irq: Remove irq_chip->release()
um: Remove CONFIG_IRQ_RELEASE_METHOD
um: Remove usage of irq_chip->release()
um: Implement um_free_irq()
um: Fix __swp_type()
um: Implement a custom pte_same() function
um: Add BUG() to do_ops()'s error path
um: Remove unused variables
um: bury unused _TIF_RESTORE_SIGMASK
um: wrong sigmask saved in case of multiple sigframes
um: add TIF_NOTIFY_RESUME
um: ->restart_block.fn needs to be reset on sigreturn
Pull core irq changes from Ingo Molnar:
"A collection of small fixes."
By Thomas Gleixner
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hexagon: Remove select of not longer existing Kconfig switches
arm: Select core options instead of redefining them
genirq: Do not consider disabled wakeup irqs
genirq: Allow check_wakeup_irqs to notice level-triggered interrupts
genirq: Be more informative on irq type mismatch
genirq: Reject bogus threaded irq requests
genirq: Streamline irq_action
As it's only user (UML) does no longer need it we can get
rid of it.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Convert to pr_fmt before things start to get out of hand and some
janitors start getting overly excited.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Presently irq_domain_simple_map() isn't labelled as static, but there's
no definition for it in the public irqdomain header either. At present
all in-tree ->map users have meaningful work to do, and all others are
using irq_domain_simple_ops directly. Make it static for now, as it can
always be exported and added to the public API later.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
modules making use of irq domains at the very least need access to the
add/remove/lookup routines, though there's nothing preventing them from
using the remainder of the public API, either.
The current set of exports seem primarily geared at DT-enabled platforms
using DT-backed IRQ domains, where many of the API accesses are hidden
away in OF code. The non-DT cases need to do most of this on their own.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Now that IRQ domains are being used by modules it's necessary to support
removing them, too. This adds a new irq_domain_remove() routine for doing
the bulk of the heavy lifting. It's left as an exercise to the caller to
ensure all mappings have been appropriatey disposed of before attempting
to remove the domain.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Export handle_edge_irq() and irq_to_desc() to modules to allow them to
do things such as
__irq_set_handler_locked(...., handle_edge_irq);
This fixes
ERROR: "handle_edge_irq" [drivers/gpio/gpio-pch.ko] undefined!
ERROR: "irq_to_desc" [drivers/gpio/gpio-pch.ko] undefined!
when gpio-pch is being built as a module.
This was introduced by commit df9541a60a ("gpio: pch9: Use proper flow
type handlers") that added
__irq_set_handler_locked(d->irq, handle_edge_irq);
but handle_edge_irq() was not exported for modules (and inlined
__irq_set_handler_locked() requires irq_to_desc() exported as well)
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an wakeup interrupt has been disabled before the suspend code
disables all interrupts then we have to ignore the pending flag.
Otherwise we would abort suspend over and over as nothing clears the
pending flag because the interrupt is disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: NeilBrown <neilb@suse.de>
Level triggered interrupts do not cause IRQS_PENDING to be set when
they fire while "disabled" as the 'pending' state is always present in
the level - they automatically refire where re-enabled.
However the IRQS_PENDING flag is also used to abort a suspend cycle -
if any 'is_wakeup_set' interrupt is PENDING, check_wakeup_irqs() will
cause suspend to abort. Without IRQS_PENDING, suspend won't abort.
Consequently, level-triggered interrupts that fire during the 'noirq'
phase of suspend do not currently abort suspend.
So set IRQS_PENDING even for level triggered interrupts, and make sure
to clear the flag in check_irq_resend.
[ Changelog by courtesy of Neil ]
Tested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The file kernel/irq/debug.h temporarily defines P, PS, PD
and then undefines them. However these names aren't really
"internal" enough, and collide with other more legit users
such as the ones in the xtensa arch, causing:
In file included from kernel/irq/internals.h:58:0,
from kernel/irq/irqdesc.c:18:
kernel/irq/debug.h:8:0: warning: "PS" redefined [enabled by default]
arch/xtensa/include/asm/regs.h:59:0: note: this is the location of the previous definition
Add a handful of underscores to do a better job of hiding these
temporary macros.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
We require that shared interrupts agree on a few flag settings. Right
now we silently return with an error code without giving any hint why
we reject it.
Make the printout unconditionally and actually useful by printing the
flags of the new and the already registered action.
Convert all printks to pr_* and use a proper prefix while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Requesting a threaded interrupt without a primary handler and without
IRQF_ONESHOT set is dangerous.
The core will use the default primary handler for it, which merily
wakes the thread. For a level type interrupt this results in an
interrupt storm, because the interrupt line is reenabled after the
primary handler runs. The device has still the line asserted, which
brings us back into the primary handler.
While this works for edge type interrupts, we play it safe and reject
unconditionally because we can't say for sure which type this
interrupt really has. The type flags are unreliable as the underlying
chip implementation can override them. And we cannot assume that
developers using that interface know what they are doing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sizeof(void*) returns an unsigned long, but it was being used as a width parameter to a "%-*s" format string which requires an int. On 64 bit platforms this causes a type mismatch:
linux/kernel/irq/irqdomain.c:575: warning: field width should have type
'int', but argument 6 has type 'long unsigned int'
This change casts the size to an int so printf gets the right data type.
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: David Daney <david.daney@cavium.com>
This patch replaces the old global setting of irq_virq_count that is only
used by the NOMAP mapping and instead uses a revmap_data property so that
the maximum NOMAP allocation can be set per NOMAP irq_domain.
There is exactly one user of irq_virq_count in-tree right now: PS3.
Also, irq_virq_count is only useful for the NOMAP mapping. So,
instead of having a single global irq_virq_count values, this change
drops it entirely and added a max_irq argument to irq_domain_add_nomap().
That makes it a property of an individual nomap irq domain instead of
a global system settting.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
This patch fixes the irq_domain_mapping debugfs output to pad pointer
values with leading zeros so that pointer values are displayed
correctly. Otherwise you get output similar to "0x 5e0000000000000".
Also, when the irq_domain is set to 'null'
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: David Daney <david.daney@cavium.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
The actual name of the irq_domain mapping debugfs file is
"irq_domain_mapping" not "virq_mapping".
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
In commit 4bbdd45a (irq_domain/powerpc: eliminate irq_map; use
irq_alloc_desc() instead) code was added that ignores error returns
from irq_alloc_desc_from() by (silently) casting the return value to
unsigned. The negitive value error return now suddenly looks like a
valid irq number.
Commits cc79ca69 (irq_domain: Move irq_domain code from powerpc to
kernel/irq) and 1bc04f2c (irq_domain: Add support for base irq and
hwirq in legacy mappings) move this code to its current location in
irqdomain.c
The result of all of this is a null pointer dereference OOPS if one of
the error cases is hit.
The fix: Don't cast away the negativeness of the return value and then
check for errors.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
[grant.likely: dropped addition of new 'irq' variable]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Pull genirq updates from Thomas Gleixner.
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Adjust irq thread affinity on IRQ_SET_MASK_OK_NOCOPY return value
genirq: Respect NUMA node affinity in setup_irq_irq affinity()
genirq: Get rid of unneeded force parameter in irq_finalize_oneshot()
genirq: Minor readablity improvement in irq_wake_thread()
irq_move_masked_irq() checks the return code of
chip->irq_set_affinity() only for 0, but IRQ_SET_MASK_OK_NOCOPY is
also a valid return code, which is there to avoid a redundant copy of
the cpumask. But in case of IRQ_SET_MASK_OK_NOCOPY we not only avoid
the redundant copy, we also fail to adjust the thread affinity of an
eventually threaded interrupt handler.
Handle IRQ_SET_MASK_OK (==0) and IRQ_SET_MASK_OK_NOCOPY(==1) return
values correctly by checking the valid return values seperately.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Keping Chen <chenkeping@huawei.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1333120296-13563-2-git-send-email-jiang.liu@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull more ARM updates from Russell King.
This got a fair number of conflicts with the <asm/system.h> split, but
also with some other sparse-irq and header file include cleanups. They
all looked pretty trivial, though.
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (59 commits)
ARM: fix Kconfig warning for HAVE_BPF_JIT
ARM: 7361/1: provide XIP_VIRT_ADDR for no-MMU builds
ARM: 7349/1: integrator: convert to sparse irqs
ARM: 7259/3: net: JIT compiler for packet filters
ARM: 7334/1: add jump label support
ARM: 7333/2: jump label: detect %c support for ARM
ARM: 7338/1: add support for early console output via semihosting
ARM: use set_current_blocked() and block_sigmask()
ARM: exec: remove redundant set_fs(USER_DS)
ARM: 7332/1: extract out code patch function from kprobes
ARM: 7331/1: extract out insn generation code from ftrace
ARM: 7330/1: ftrace: use canonical Thumb-2 wide instruction format
ARM: 7351/1: ftrace: remove useless memory checks
ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path
ARM: Versatile Express: add NO_IOPORT
ARM: get rid of asm/irq.h in asm/prom.h
ARM: 7319/1: Print debug info for SIGBUS in user faults
ARM: 7318/1: gic: refactor irq_start assignment
ARM: 7317/1: irq: avoid NULL check in for_each_irq_desc loop
ARM: 7315/1: perf: add support for the Cortex-A7 PMU
...
The debugfs code is really generic for all platforms. This patch removes the
powerpc-specific directory reference and makes it available to all
architectures.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
We respect node affinity of devices already in the irq descriptor
allocation, but we ignore it for the initial interrupt affinity
setup, so the interrupt might be routed to a different node.
Restrict the default affinity mask to the node on which the irq
descriptor is allocated.
[ tglx: Massaged changelog ]
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/1332788538-17425-1-git-send-email-prarit@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The only place irq_finalize_oneshot() is called with force parameter set
is the threaded handler error exit path. But IRQTF_RUNTHREAD is dropped
at this point and irq_wake_thread() is not going to set it again,
since PF_EXITING is set for this thread already. So irq_finalize_oneshot()
will drop the threads bit in threads_oneshot anyway and hence the force
parameter is superfluous.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120321162234.GP24806@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
exit_irq_thread() clears IRQTF_RUNTHREAD flag and drops the thread's bit in
desc->threads_oneshot then. The bit must not be set again in between and it
does not, since irq_wake_thread() sees PF_EXITING flag first and returns.
Due to above the order or checking PF_EXITING and IRQTF_RUNTHREAD flags in
irq_wake_thread() is important. This change just makes it more visible in the
source code.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120321162212.GO24806@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This branch takes the PowerPC irq_host infrastructure (reverse mapping
from Linux IRQ numbers to hardware irq numbering), generalizes it,
renames it to irq_domain, and makes it available to all architectures.
Originally the plan has been to create an all-new irq_domain
implementation which addresses some of the powerpc shortcomings such
as not handling 1:1 mappings well, but doing that proved to be far
more difficult and invasive than generalizing the working code and
refactoring it in-place. So, this branch rips out the 'new'
irq_domain and replaces it with the modified powerpc version (in a
fully bisectable way of course). It converts all users over to the
new API and makes irq_domain selectable on any architecture.
No architecture is forced to enable irq_domain, but the infrastructure
is required for doing OpenFirmware style irq translations. It will
even work on SPARC even though SPARC has it's own mechanism for
translating irqs at boot time. MIPS, microblaze, embedded x86 and c6x
are converted too.
The resulting irq_domain code is probably still too verbose and can be
optimized more, but that can be done incrementally and is a task for
follow-on patches.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPZ1yiAAoJEEFnBt12D9kB4yIQAJvCfTPL65sCYVD6i9RnVHtR
ahwddtd0AtT+UYLU8Xg2fZgVi6cmupDGnqkBixzZD3xxSTERqm7Snqa0ugklfeAi
B6Zqf/K17H5hJNaoQ3fkNauow8m7ZYOeEH2vVUvkb3woWS9Wm7OGd+BvcIBgYSGe
Aaoumhu7kDxFkii0qz3x/+kvsb6DRp2HtSPWj+APL/kNjdiO4JBOihtcc/lX6d47
bsZLiEMzHUFV4ApJNwqmfDnf54oMrHmrRJxgQHIMjeJC5or9I3Do8wDGe/aTF5xO
5GVpxCQsTlJMjTBWlAFtpTwCJB6y76EHQrHc7WzLlq8OJSsxApOke8M0BzXFrfMy
CU7UUpTvNZTLpZibLCEQKemv1+oNOkfFylsHxfek2MCqx0W6W4FHEGV3qE/GtgV9
+vurA9hNNp7VM0FGRGigcUr3woYdHLdEVQrlnL7Z9AgBu1W44MZLaai7iRVZOeCT
ZQ9++v2PJJ8vHT8kdkgTdiRpnEhmv84MX/GBT7ilWFEMIVeT5zhGkIBojzNgyzGc
7cvermmM0P8h+unkDgmzmSbDxo0PboqVKeoO71AOBhA6MmR9iom7XkuNdHhoOwy2
4A5xT1srbhJDbuv15BBREBV24TywpZ4a1+4nwQT4L1fXe+HfCxeEWexGcKQMRcIt
dAelOHTQ+ZGkOKvXeW05
=ruGA
-----END PGP SIGNATURE-----
Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull irq_domain support for all architectures from Grant Likely:
"Generialize powerpc's irq_host as irq_domain
This branch takes the PowerPC irq_host infrastructure (reverse mapping
from Linux IRQ numbers to hardware irq numbering), generalizes it,
renames it to irq_domain, and makes it available to all architectures.
Originally the plan has been to create an all-new irq_domain
implementation which addresses some of the powerpc shortcomings such
as not handling 1:1 mappings well, but doing that proved to be far
more difficult and invasive than generalizing the working code and
refactoring it in-place. So, this branch rips out the 'new'
irq_domain and replaces it with the modified powerpc version (in a
fully bisectable way of course). It converts all users over to the
new API and makes irq_domain selectable on any architecture.
No architecture is forced to enable irq_domain, but the infrastructure
is required for doing OpenFirmware style irq translations. It will
even work on SPARC even though SPARC has it's own mechanism for
translating irqs at boot time. MIPS, microblaze, embedded x86 and c6x
are converted too.
The resulting irq_domain code is probably still too verbose and can be
optimized more, but that can be done incrementally and is a task for
follow-on patches."
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6: (31 commits)
dt: fix twl4030 for non-dt compile on x86
mfd: twl-core: Add IRQ_DOMAIN dependency
devicetree: Add empty of_platform_populate() for !CONFIG_OF_ADDRESS (sparc)
irq_domain: Centralize definition of irq_dispose_mapping()
irq_domain/mips: Allow irq_domain on MIPS
irq_domain/x86: Convert x86 (embedded) to use common irq_domain
ppc-6xx: fix build failure in flipper-pic.c and hlwd-pic.c
irq_domain/microblaze: Convert microblaze to use irq_domains
irq_domain/powerpc: Replace custom xlate functions with library functions
irq_domain/powerpc: constify irq_domain_ops
irq_domain/c6x: Use library of xlate functions
irq_domain/c6x: constify irq_domain structures
irq_domain/c6x: Convert c6x to use generic irq_domain support.
irq_domain: constify irq_domain_ops
irq_domain: Create common xlate functions that device drivers can use
irq_domain: Remove irq_domain_add_simple()
irq_domain: Remove 'new' irq_domain in favour of the ppc one
mfd: twl-core.c: Fix the number of interrupts managed by twl4030
of/address: add empty static inlines for !CONFIG_OF
irq_domain: Add support for base irq and hwirq in legacy mappings
...
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)
This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.
The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:
$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf
This output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.
This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.
- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.
- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485
- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.
- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:
struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code
...
static_key_slow_inc();
...
static_key_slow_inc();
...
The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.
This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.
- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.
- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.
- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.
- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
Alexander pointed out that the warnons in the regular exit path are
bogus and the thread_mask one actually could be triggered when
__setup_irq() hands out that thread_mask again after __free_irq()
dropped irq_desc->lock.
Thinking more about it, neither IRQTF_RUNTHREAD nor the bit in
thread_mask can be set as this is the regular exit path. We come here
due to:
__free_irq()
remove action from desc
synchronize_irq()
kthread_stop()
So synchronize_irq() makes sure that the thread finished running and
cleaned up both the thread_active count and thread_mask. After that
point nothing can set IRQTF_RUNTHREAD on this action. So the warnons
and the cleanups are pointless.
Reported-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Ido Yariv <ido@wizery.com>
Link: http://lkml.kernel.org/r/20120315190755.GA6732@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current implementation does not always flush the threaded handler
when disabling the irq. In case the irq handler was called, but the
threaded handler hasn't started running yet, the interrupt will be
flagged as pending, and the handler will not run. This implementation
has some issues:
First, if the interrupt is a wake source and flagged as pending, the
system will not be able to suspend.
Second, when quickly disabling and re-enabling the irq, the threaded
handler might continue to run after the irq is re-enabled without the
irq handler being called first. This might be an unexpected behavior.
In addition, it might be counter-intuitive that the threaded handler
will not be called even though the irq handler was called and returned
IRQ_WAKE_THREAD.
Fix this by always waiting for the threaded handler to complete in
synchronize_irq().
[ tglx: Massaged comments, added WARN_ONs and the missing
IRQTF_RUNTHREAD check in exit_irq_thread() ]
Signed-off-by: Ido Yariv <ido@wizery.com>
Link: http://lkml.kernel.org/r/1322843052-7166-1-git-send-email-ido@wizery.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently IRQTF_DIED flag is set when a IRQ thread handler calls do_exit()
But also PF_EXITING per process flag gets set when a thread exits. This
fix eliminates the duplicate by using PF_EXITING flag.
Also, there is a race condition in exit_irq_thread(). In case a thread's
bit is cleared in desc->threads_oneshot (and the IRQ line gets unmasked),
but before IRQTF_DIED flag is set, a new interrupt might come in and set
just cleared bit again, this time forever. This fix throws IRQTF_DIED flag
away, eliminating the race as a result.
[ tglx: Test THREAD_EXITING first as suggested by Oleg ]
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120309135958.GD2114@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since 63706172f3 kthread_stop() is not
afraid of dead kernel threads. So no need to check if a thread is
alive before stopping it. These checks still were racy.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120309135939.GC2114@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When a new thread handler is created, an irqaction is passed to it as
data. Not only that irqaction is stored in task_struct by the handler
for later use, but also a structure associated with the kernel thread
keeps this value as long as the thread exists.
This fix kicks irqaction out off task_struct. Yes, I introduce new bit
field. But it allows not only to eliminate the duplicate, but also
shortens size of task_struct.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120309135925.GB2114@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Xommit ac5637611(genirq: Unmask oneshot irqs when thread was not woken)
fails to unmask when a !IRQ_ONESHOT threaded handler is handled by
handle_level_irq.
This happens because thread_mask is or'ed unconditionally in
irq_wake_thread(), but for !IRQ_ONESHOT interrupts never cleared. So
the check for !desc->thread_active fails and keeps the interrupt
disabled.
Keep the thread_mask zero for !IRQ_ONESHOT interrupts.
Document the thread_mask magic while at it.
Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de>
Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In 2008, commit 0c5d1eb77a ("genirq: record trigger type") modified the
way set_irq_type() handles the 'no trigger' condition. However, this has
an adverse effect on PCMCIA support on Intel StrongARM and probably PXA
platforms.
PCMCIA has several status signals on the socket which can trigger
interrupts; some of these status signals depend on the card's mode
(whether it is configured in memory or IO mode). For example, cards have
a 'Ready/IRQ' signal: in memory mode, this provides an indication to
PCMCIA that the card has finished its power up initialization. In IO
mode, it provides the device interrupt signal. Other status signals
switch between on-board battery status and loud speaker output.
In classical PCMCIA implementations, where you have a specific socket
controller, the controller provides a method to mask interrupts from the
socket, and importantly ignore any state transitions on the pins which
correspond with interrupts once masked. This masking prevents unwanted
events caused by the removal and application of socket power being
forwarded.
However, on platforms where there is no socket controller, the PCMCIA
status and interrupt signals are routed to standard edge-triggered GPIOs.
These GPIOs can be configured to interrupt on rising edge, falling edge,
or never. This is where the problems start.
Edge triggered interrupts are required to record events while disabled via
the usual methods of {free,request,disable,enable}_irq() to prevent
problems with dropped interrupts (eg, the 8390 driver uses disable_irq()
to defer the delivery of interrupts). As a result, these interfaces can
not be used to implement the desired behaviour.
The side effect of this is that if the 'Ready/IRQ' GPIO is disabled via
disable_irq() on suspend, and enabled via enable_irq() after resume, we
will record the state transitions caused by powering events as valid
interrupts, and foward them to the card driver, which may attempt to
access a card which is not powered up.
This leads delays resume while drivers spin in their interrupt handlers,
and complaints from drivers before they realize what's happened.
Moreover, in the case of the 'Ready/IRQ' signal, this is requested and
freed by the card driver itself; the PCMCIA core has no idea whether the
interrupt is requested, and, therefore, whether a call to disable_irq()
would be valid. (We tried this around 2.4.17 / 2.5.1 kernel era, and
ended up throwing it out because of this problem.)
Therefore, it was decided back in around 2002 to disable the edge
triggering instead, resulting in all state transitions on the GPIO being
ignored. That's what we actually need the hardware to do.
The commit above changes this behaviour; it explicitly prevents the 'no
trigger' state being selected.
The reason that request_irq() does not accept the 'no trigger' state is
for compatibility with existing drivers which do not provide their desired
triggering configuration. The set_irq_type() function is 'new' and not
used by non-trigger aware drivers.
Therefore, revert this change, and restore previously working platforms
back to their former state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: linux@arm.linux.org.uk
Cc: Ingo Molnar <mingo@elte.hu>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch makes IRQ_DOMAIN usable on MIPS. It uses an ugly workaround
to preserve current behaviour so that MIPS has time to add irq_domain
registration to the irq controller drivers. The workaround will be
removed in Linux v3.6
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Make irq_domain_ops pointer a constant to make it safer for multiple
instances to share the same ops pointer and change the irq_domain code
so that it does not modify the ops.
v4: Fix mismatched type reference in powerpc code
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
Rather than having each interrupt controller driver creating its own barely
unique .xlate function for irq_domain, create a library of translators which
any driver can use directly.
v5: - Remove irq_domain_xlate_pci(). It was incorrect.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
irq_domain_add_simple() was a stop-gap measure until complete irq_domain
support was complete. This patch removes the irq_domain_add_simple()
interface.
This patch also drops the explicit irq_domain initialization performed
by the mach-versatile code because the versatile interrupt controller
already has irq_domain support built into it. This was a bug that was
hanging around quietly for a while, but with the full irq_domain which
actually verifies that irq_domain ranges are available it would cause
the registration to fail and the system wouldn't boot.
v4: Fixed number of irqs in mx5 gpio code
v2: Updated to pass in host_data pointer on irq_domain allocation.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Cc: Russell King <linux@arm.linux.org.uk>
Tested-by: Olof Johansson <olof@lixom.net>
This patch removes the simplistic implementation of irq_domains and enables
the powerpc infrastructure for all irq_domain users. The powerpc
infrastructure includes support for complex mappings between Linux and
hardware irq numbers, and can manage allocation of irq_descs.
This patch also converts the few users of irq_domain_add()/irq_domain_del()
to call irq_domain_add_legacy() instead.
v3: Fix bug that set up too many irqs in translation range.
v2: Fix removal of irq_alloc_descs() call in gic driver
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
Add support for a legacy mapping where irq = (hwirq - first_hwirq + first_irq)
so that a controller driver can allocate a fixed range of irq_descs and use
a simple calculation to translate back and forth between linux and hw irq
numbers. This is needed to use an irq_domain with many of the ARM interrupt
controller drivers that manage their own irq_desc allocations. Ultimately
the goal is to migrate those drivers to use the linear revmap, but doing it
this way allows each driver to be converted separately which makes the
migration path easier.
This patch generalizes the IRQ_DOMAIN_MAP_LEGACY method to use
(first_irq-first_hwirq) as the offset between hwirq and linux irq number,
and adds checks to make sure that the hwirq number does not exceed range
assigned to the controller.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
Each revmap type has different arguments for setting up the revmap.
This patch splits up the generator functions so that each revmap type
can do its own setup and the user doesn't need to keep track of how
each revmap type handles the arguments.
This patch also adds a host_data argument to the generators. There are
cases where the host_data pointer will be needed before the function returns.
ie. the legacy map calls the .map callback for each irq before returning.
v2: - Add void *host_data argument to irq_domain_add_*() functions
- fixed failure to compile
- Moved IRQ_DOMAIN_MAP_* defines into irqdomain.c
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
No functional changes. Replaces non-exported references to 'host' with domain.
Does not change any symbol names referenced by other .c files.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
zero always means no irq when using irq domains. Get rid of the NO_IRQ
references.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
This patch only moves the code. It doesn't make any changes, and the
code is still only compiled for powerpc. Follow-on patches will generalize
the code for other architectures.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
An interrupt might be pending when irq_startup() is called, but the
startup code does not invoke the resend logic. In some cases this
prevents the device from issuing another interrupt which renders the
device non functional.
Call the resend function in irq_startup() to keep things going.
Reported-and-tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When the primary handler of an interrupt which is marked IRQ_ONESHOT
returns IRQ_HANDLED or IRQ_NONE, then the interrupt thread is not
woken and the unmask logic of the interrupt line is never
invoked. This keeps the interrupt masked forever.
This was not noticed as most IRQ_ONESHOT users wake the thread
unconditionally (usually because they cannot access the underlying
device from hard interrupt context). Though this behaviour was nowhere
documented and not necessarily intentional. Some drivers can avoid the
thread wakeup in certain cases and run into the situation where the
interrupt line s kept masked.
Handle it gracefully.
Reported-and-tested-by: Lothar Wassmann <lw@karo-electronics.de>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Part of the series to unify the irq remapping mechanisms in the
kernel. A follow up patch will copy the powerpc implementation into
kernel/irq/irqdomain.c, which will be a lot easier if the structures
are identical.
Where they differ, I've chose to use the powerpc names since there is
a lot more code using those names.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
irq_domain printk's too much. Drop some output.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Olof Johansson <olof@lixom.net>
The __raise_softirq_irqoff() contains a tracepoint. As tracepoints in headers
can cause issues, and not to mention, bloats the kernel when they are
in a static inline, it is best to move the function that contains the
tracepoint out of the header and into softirq.c.
Link: http://lkml.kernel.org/r/20120118120711.GB14863@elte.hu
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
On ARM, we don't want SPARSE_IRQ to be a user visible option. Make
SPARSE_IRQ visible based on MAY_HAVE_SPARSE_IRQ instead of depending
on HAVE_SPARSE_IRQ.
With this, SPARSE_IRQ is not visible on C6X and ARM.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
Kconfig: acpi: Fix typo in comment.
misc latin1 to utf8 conversions
devres: Fix a typo in devm_kfree comment
btrfs: free-space-cache.c: remove extra semicolon.
fat: Spelling s/obsolate/obsolete/g
SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
tools/power turbostat: update fields in manpage
mac80211: drop spelling fix
types.h: fix comment spelling for 'architectures'
typo fixes: aera -> area, exntension -> extension
devices.txt: Fix typo of 'VMware'.
sis900: Fix enum typo 'sis900_rx_bufer_status'
decompress_bunzip2: remove invalid vi modeline
treewide: Fix comment and string typo 'bufer'
hyper-v: Update MAINTAINERS
treewide: Fix typos in various parts of the kernel, and fix some comments.
clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
leds: Kconfig: Fix typo 'D2NET_V2'
sound: Kconfig: drop unknown symbol ARCH_CLPS7500
...
Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
kconfig additions, close to removed commented-out old ones)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPBUxHAAoJEEFnBt12D9kBfB8QAIj/sGqC4WYpQBU9PyGiSAfq
Xz5mBqAoILAjLD5SK1eR/N6+CHFt2scsyRkOubfg/4bn0I9Kr37oSXPcVqY24EKw
zn2tEcDDJdCMzW/CFBEcOfcKfcNl1dgEXodhoOrSAeyaZeL24V5O1SvrwKGINQIp
NPO/nHeEf7tRncscfeCFSK6h98dIhRZt+xpthIIZxYym5jd9vzRjk21dG6egcHOo
RTfW2CwA0cu3h2EtR1DCCMwfL+Ze1na4iYcXcTT/5xEUjmkJrM/I44djb9blfWLR
X0gPuj7q6t+eBKDZqnJufDbY79oOpQN8mgAeCrBwF5I27zAzn6FIRQ7Wobmv0DXr
NrqsduI+nImAyVR1jeWBGAnowNPXfxd9YboATkph5LTpelQ1cZ/tikxVDteBtDSb
voH68FKFmFZl3QlpFjJnIsodn57ETmUuezA4Xq+iJTHBMpLpcLCZ4RyAoiJRHca6
lGwBArLo+wpuRdd/BB+rI+CV6ACbJqwb8Yd6ZTw5lW0eaMPTT93ffteChfEQPMq0
mGVTYY6kVM9ReFPEkcxnQ3RyPF0XqP2TTp37gXC4/Y2LVMqg0UrrWxtwaU2har/Z
bUF1JOKiZ6+CWJJwMw7coP7rBCPUz6FDApuDmQo9+I7MjxjFkzYzPtisUs9y7wjx
/XA9z93mB/O+wcbuvqCo
=CI17
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6
devicetree/next changes queued for v3.3 merge window
* tag 'devicetree-for-linus-20120104' of git://git.secretlab.ca/git/linux-2.6:
ARM: prom.h: Fix build error by removing unneeded header file
irq: check domain hwirq range for DT translate
dt: add empty of_get_node/of_put_node functions
of/pdt: fix section mismatch warning
i2c-designware: add OF binding support
dt/i2c: Enumerate some of the known trivial i2c devices
dt: reform for_each_property to for_each_property_of_node
ARM/of: allow *machine_desc.dt_compat to be const
of/base: Take NULL string into account for property with multiple strings
OF/device-tree: Add some entries to vendor-prefixes.txt
Fix up trivial add-add conflicts in include/linux/of.h
A DT node may have more than 1 domain associated with it, so make sure
the hwirq number is within range when doing DT translation.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
irqdomain support is used in interrupt controller drivers that may not
have device tree support but only need the basic HW->Linux irq
translation. Rather than having each of these implement their own IRQ
domain, allow them to use the simple ops.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rob Herring <robherring2@gmail.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In irq_wait_for_interrupt(), the should_stop member is verified before
setting the task's state to TASK_INTERRUPTIBLE and calling schedule().
In case kthread_stop sets should_stop and wakes up the process after
should_stop is checked by the irq thread but before the task's state
is changed, the irq thread might never exit:
kthread_stop irq_wait_for_interrupt
------------ ----------------------
...
... while (!kthread_should_stop()) {
kthread->should_stop = 1;
wake_up_process(k);
wait_for_completion(&kthread->exited);
...
set_current_state(TASK_INTERRUPTIBLE);
...
schedule();
}
Fix this by checking if the thread should stop after modifying the
task's state.
[ tglx: Simplified it a bit ]
Signed-off-by: Ido Yariv <ido@wizery.com>
Link: http://lkml.kernel.org/r/1322740508-22640-1-git-send-email-ido@wizery.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Commit fa27271bc8d2("genirq: Fixup poll handling") introduced a
regression that broke irqfixup/irqpoll for some hardware configurations.
Amidst reorganizing 'try_one_irq', that patch removed a test that
checked for 'action->handler' returning IRQ_HANDLED, before acting on
the interrupt. Restoring this test back returns the functionality lost
since 2.6.39. In the current set of tests, after 'action' is set, it
must precede '!action->next' to take effect.
With this and my previous patch to irq/spurious.c, c75d720fca, all
IRQ regressions that I have encountered are fixed.
Signed-off-by: Edward Donovan <edward.donovan@numble.net>
Reported-and-tested-by: Rogério Brito <rbrito@ime.usp.br>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org (2.6.39+)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The power management functions related to interrupts do not know
(yet) about per-cpu interrupts and end up calling the wrong
low-level methods to enable/disable interrupts.
This leads to all kind of interesting issues (action taken on one
CPU only, updating a refcount which is not used otherwise...).
The workaround for the time being is simply to flag these interrupts
with IRQF_NO_SUSPEND. At least on ARM, these interrupts are actually
dealt with at the architecture level.
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1321446459-31409-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include <linux/module.h>
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include <linux/module.h>
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...
Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
commit d05c65fff0 ("genirq: spurious: Run only one poller at a time")
introduced a regression, leaving the boot options 'irqfixup' and
'irqpoll' non-functional. The patch placed tests in each function, to
exit if the function is already running. The test in 'misrouted_irq'
exited when it should have proceeded, effectively disabling
'misrouted_irq' and 'poll_spurious_irqs'.
The check for an already running poller needs to be "!= 1" not "== 1"
as "1" is the value when the first poller starts running.
Signed-off-by: Edward Donovan <edward.donovan@numble.net>
Cc: maciej.rutecki@gmail.com
Link: http://lkml.kernel.org/r/1320175784-6745-1-git-send-email-edward.donovan@numble.net
Cc: stable@vger.kernel.org # >= 2.6.39
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'next/dt' of git://git.linaro.org/people/arnd/arm-soc:
ARM: gic: use module.h instead of export.h
ARM: gic: fix irq_alloc_descs handling for sparse irq
ARM: gic: add OF based initialization
ARM: gic: add irq_domain support
irq: support domains with non-zero hwirq base
of/irq: introduce of_irq_init
ARM: at91: add at91sam9g20 and Calao USB A9G20 DT support
ARM: at91: dt: at91sam9g45 family and board device tree files
arm/mx5: add device tree support for imx51 babbage
arm/mx5: add device tree support for imx53 boards
ARM: msm: Add devicetree support for msm8660-surf
msm_serial: Add devicetree support
msm_serial: Use relative resources for iomem
Fix up conflicts in arch/arm/mach-at91/{at91sam9260.c,at91sam9g45.c}
Recent commit "irq: Track the owner of irq descriptor" in
commit ID b6873807a7 placed module.h into linux/irq.h
but we are trying to limit module.h inclusion to just C files
that really need it, due to its size and number of children
includes. This targets just reversing that include.
Add in the basic "struct module" since that is all we really need
to ensure things compile. In theory, b687380 should have added the
module.h include to the irqdesc.h header as well, but the implicit
module.h everywhere presence masked this from showing up. So give
it the "struct module" as well.
As for the C files, irqdesc.c is only using THIS_MODULE, so it
does not need module.h - give it export.h instead. The C file
irq/manage.c is now (as of b687380) using try_module_get and
module_put and so it needs module.h (which it already has).
Also convert the irq_alloc_descs variants to macros, since all
they really do is is call the __irq_alloc_descs primitive.
This avoids including export.h and no debug info is lost.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
These files were getting <linux/module.h> via an implicit non-obvious
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason. Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Interrupt controllers can have non-zero starting value for h/w irq numbers.
Adding support in irq_domain allows the domain hwirq numbering to match
the interrupt controllers' numbering.
As this makes looping over irqs for a domain more complicated, add loop
iterators to iterate over all hwirqs and irqs for a domain.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Tested-by: Thomas Abraham <thomas.abraham@linaro.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
* 'gpio/next' of git://git.secretlab.ca/git/linux-2.6:
h8300: Move gpio.h to gpio-internal.h
gpio: pl061: add DT binding support
gpio: fix build error in include/asm-generic/gpio.h
gpiolib: Ensure struct gpio is always defined
irq: Add EXPORT_SYMBOL_GPL to function of irq generic-chip
gpio-ml-ioh: Use NUMA_NO_NODE not GFP_KERNEL
gpio-pch: Use NUMA_NO_NODE not GFP_KERNEL
gpio: langwell: ensure alternate function is cleared
gpio-pch: Support interrupt function
gpio-pch: Save register value in suspend()
gpio-pch: modify gpio_nums and mask
gpio-pch: support ML7223 IOH n-Bus
gpio-pch: add spinlock in suspend/resume processing
gpio-pch: Delete invalid "restore" code in suspend()
gpio-ml-ioh: Fix suspend/resume issue
gpio-ml-ioh: Support interrupt function
gpio-ml-ioh: Delete unnecessary code
gpio/mxc: add chained_irq_enter/exit() to mx3_gpio_irq_handler()
gpio/nomadik: use genirq core to track enablement
gpio/nomadik: disable clocks when unused
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier
genirq: Fix fatfinered fixup really
genirq: percpu: allow interrupt type to be set at enable time
genirq: Add support for per-cpu dev_id interrupts
genirq: Add IRQCHIP_SKIP_SET_WAKE flag
Some functions of irq generic-chip is undefined, because
EXPORT_SYMBOL_GPL is not set to these.
ERROR: "irq_setup_generic_chip" [drivers/gpio/gpio-pch.ko] undefined!
ERROR: "irq_alloc_generic_chip" [drivers/gpio/gpio-pch.ko] undefined!
ERROR: "irq_setup_generic_chip" [drivers/gpio/gpio-ml-ioh.ko] undefined!
ERROR: "irq_alloc_generic_chip" [drivers/gpio/gpio-ml-ioh.ko] undefined!
This is revised that EXPORT_SYMBOL_GPL can be added and referred
to in functions.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.
Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.
This issue was introduced by 676dc3cf5b "xen: Use IRQF_FORCE_RESUME".
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk
Cc: stable@kernel.org (at least to 2.6.32.y)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As request_percpu_irq() doesn't allow for a percpu interrupt to have
its type configured (it is generally impossible to configure it on all
CPUs at once), add a 'type' argument to enable_percpu_irq().
This allows some low-level, board specific init code to be switched to
a generic API.
[ tglx: Added WARN_ON argument ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The ARM GIC interrupt controller offers per CPU interrupts (PPIs),
which are usually used to connect local timers to each core. Each CPU
has its own private interface to the GIC, and only sees the PPIs that
are directly connect to it.
While these timers are separate devices and have a separate interrupt
line to a core, they all use the same IRQ number.
For these devices, request_irq() is not the right API as it assumes
that an IRQ number is visible by a number of CPUs (through the
affinity setting), but makes it very awkward to express that an IRQ
number can be handled by all CPUs, and yet be a different interrupt
line on each CPU, requiring a different dev_id cookie to be passed
back to the handler.
The *_percpu_irq() functions is designed to overcome these
limitations, by providing a per-cpu dev_id vector:
int request_percpu_irq(unsigned int irq, irq_handler_t handler,
const char *devname, void __percpu *percpu_dev_id);
void free_percpu_irq(unsigned int, void __percpu *);
int setup_percpu_irq(unsigned int irq, struct irqaction *new);
void remove_percpu_irq(unsigned int irq, struct irqaction *act);
void enable_percpu_irq(unsigned int irq);
void disable_percpu_irq(unsigned int irq);
The API has a number of limitations:
- no interrupt sharing
- no threading
- common handler across all the CPUs
Once the interrupt is requested using setup_percpu_irq() or
request_percpu_irq(), it must be enabled by each core that wishes its
local interrupt to be delivered.
Based on an initial patch by Thomas Gleixner.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some irq chips need the irq_set_wake() functionality, but do not
require a irq_set_wake() callback. Instead of forcing an empty
callback to be implemented add a flag which notes this fact. Check for
the flag in set_irq_wake_real() and return success when set.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
If an irq_chip provides .irq_shutdown(), but neither of .irq_disable() or
.irq_mask(), free_irq() crashes when jumping to NULL.
Fix this by only trying .irq_disable() and .irq_mask() if there's no
.irq_shutdown() provided.
This revives the symmetry with irq_startup(), which tries .irq_startup(),
.irq_enable(), and irq_unmask(), and makes it consistent with the comment for
irq_chip.irq_shutdown() in <linux/irq.h>, which says:
* @irq_shutdown: shut down the interrupt (defaults to ->disable if NULL)
This is also how __free_irq() behaved before the big overhaul, cfr. e.g.
3b56f0585f ("genirq: Remove bogus conditional"),
where the core interrupt code always overrode .irq_shutdown() to
.irq_disable() if .irq_shutdown() was NULL.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Link: http://lkml.kernel.org/r/1315742394-16036-2-git-send-email-geert@linux-m68k.org
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This reverts commit f3637a5f2e.
It turns out that this breaks several drivers, one example being OMAP
boards which use the on-board OMAP UARTs and the omap-serial driver that
will not boot to userspace after the commit.
Paul Walmsley reports that enabling CONFIG_DEBUG_SHIRQ reveals 'IRQ
handler type mismatch' errors:
IRQ handler type mismatch for IRQ 74
current handler: serial idle
...
and the reason is that setting IRQF_ONESHOT will now result in those
interrupt handlers having different IRQF flags, and thus being
unsharable. So the commit log in the reverted commit:
"Since it is required for those users and
there is no difference for others it makes sense to add this flag
unconditionally."
is simply not true: there may not be any difference from a "actions at
irq time", but there is a *big* difference wrt this flag testing irq
management (see __setup_irq() in kernel/irq/manage.c).
One solution may be to stop verifying IRQF_ONESHOT in __setup_irq(), but
right now the safe course of action is to revert the change. Let's
revisit this in a later merge window.
Reported-by: Paul Walmsley <paul@pwsan.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Requested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix kernel-doc warning in irqdesc.c:
Warning(kernel/irq/irqdesc.c:353): No description found for parameter 'owner'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq: Track the owner of irq descriptor
irq: Always set IRQF_ONESHOT if no primary handler is specified
genirq: Fix wrong bit operation
Interrupt descriptors can be allocated from modules. The interrupts
are used by other modules, but we have no refcount on the module which
provides the interrupts and there is no way to establish one on the
device level as the interrupt using module is agnostic to the fact
that the interrupt is provided by a module rather than by some builtin
interrupt controller.
To prevent removal of the interrupt providing module, we can track the
owner of the interrupt descriptor, which also provides the relevant
irq chip functions in the irq descriptor.
request/setup_irq() can now acquire a refcount on the owner module to
prevent unloading. free_irq() drops the refcount.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Link: http://lkml.kernel.org/r/20110711101731.GA13804@Chamillionaire.breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If no primary handler is specified then a default one is assigned
which always returns IRQ_WAKE_THREAD. This handler requires the
IRQF_ONESHOT flag on LEVEL / EIO typed irqs because the source of
interrupt is not disabled. Since it is required for those users and
there is no difference for others it makes sense to add this flag
unconditionally.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/1310070737-18514-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_domain_generate_simple() is an easy way to generate an irq translation
domain for simple irq controllers. It assumes a flat 1:1 mapping from
hardware irq number to an offset of the first linux irq number assigned
to the controller
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds irq_domain infrastructure for translating from
hardware irq numbers to linux irqs. This is particularly important
for architectures adding device tree support because the current
implementation (excluding PowerPC and SPARC) cannot handle
translation for more than a single interrupt controller. irq_domain
supports device tree translation for any number of interrupt
controllers.
This patch converts x86, Microblaze, ARM and MIPS to use irq_domain
for device tree irq translation. x86 is untested beyond compiling it,
irq_domain is enabled for MIPS and Microblaze, but the old behaviour is
preserved until the core code is modified to actually register an
irq_domain yet. On ARM it works and is required for much of the new
ARM device tree board support.
PowerPC has /not/ been converted to use this new infrastructure. It
is still missing some features before it can replace the virq
infrastructure already in powerpc (see documentation on
irq_domain_map/unmap for details). Followup patches will add the
missing pieces and migrate PowerPC to use irq_domain.
SPARC has its own method of managing interrupts from the device tree
and is unaffected by this change.
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
devres uses the pointer value as key after it's freed, which is safe but
triggers spurious use-after-free warnings on some static analysis tools.
Rearrange code to avoid such warnings.
Signed-off-by: Maxin B. John <maxin.john@gmail.com>
Reviewed-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a regression introduced by e59347a "arm: orion:
Use generic irq chip".
Depending on the device, interrupts acknowledgement is done by setting
or by clearing a dedicated register. Replace irq_gc_ack() with some
{set,clr}_bit variants allows to handle both cases.
Note that this patch affects the following SoCs: Davinci, Samsung and
Orion. Except for this last, the change is minor: irq_gc_ack() is just
renamed into irq_gc_ack_set_bit().
For the Orion SoCs, the edge GPIO interrupts support is currently
broken. irq_gc_ack() try to acknowledge a such interrupt by setting
the corresponding cause register bit. The Orion GPIO device expect the
opposite. To fix this issue, the irq_gc_ack_clr_bit() variant is used.
Tested on Network Space v2.
Reported-by: Joey Oravec <joravec@drewtech.com>
Signed-off-by: Simon Guinot <sguinot@lacie.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In kernel/irq/manage.c::irq_set_irq_wake() we call
irq_get_desc_buslock() which may return NULL, but the code
dereferences the result unconditionally.
irq_set_irq_wake() has lots of callers - I checked a few and I couldn't
find anything that guarantees that they won't call it with some input that
will cause irq_get_desc_buslock() to return NULL, so I think it's a good
thing to test and -EINVAL was the most sane error code in this situation
that I could think of.
Not all callers test the return value of irq_set_irq_wake(), but those
that do take != 0 to mean error as far as I can see, so they should be
fine. I guess those that don't test actually should, but that's a
different issue.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1106092300360.17868@swampdragon.chaosbits.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When irq_alloc_descs() is called with no base IRQ specified then it will
search for a range of IRQs starting from a specified base address. In the
case where an IRQ is specified it still does this search in order to ensure
that none of the requested range is already allocated and it still uses the
from parameter to specify the base for the search. This means that in the
case where a base is specified but from is zero (which is reasonable as
any IRQ number is in the range specified by a zero from) the function will
get confused and try to allocate the first suitably sized block of free IRQs
it finds.
Instead use a specified IRQ as the base address for the search, and insist
that any from that is specified can support that IRQ.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Link: http://lkml.kernel.org/r/1307037313-15733-1-git-send-email-broonie@opensource.wolfsonmicro.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The genirq changes are initializing descriptors for sparse IRQs quite
differently from how non-sparse (stacked?) IRQs are initialized, with
the effect that on my platform all IRQs are default-disabled on sparse
IRQs and default-enabled if non-sparse IRQs are used, crashing some
GPIO driver.
Fix this by refactoring the non-sparse IRQs to use the same descriptor
init function as the sparse IRQs.
Signed-off: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/1306858479-16622-1-git-send-email-linus.walleij@stericsson.com
Cc: stable@kernel.org # 2.6.39
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The detection of spurios interrupts is currently limited to first level
handler. In force-threaded mode we never notice if the threaded irq does
not feel responsible.
This patch catches the return value of the threaded handler and forwards
it to the spurious detector. If the primary handler returns only
IRQ_WAKE_THREAD then the spourious detector ignores it because it gets
called again from the threaded handler.
[ tglx: Report the erroneous return value early and bail out ]
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Link: http://lkml.kernel.org/r/1306824972-27067-2-git-send-email-sebastian@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In forced threaded mode (or with an explicit threaded handler) we only
see the primary handler, but not the threaded handler.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Link: http://lkml.kernel.org/r/1306824972-27067-1-git-send-email-sebastian@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
commit 4b06042(bitmap, irq: add smp_affinity_list interface to
/proc/irq) causes the following warning:
[ 274.239500] WARNING: at fs/proc/generic.c:850 remove_proc_entry+0x24c/0x27a()
[ 274.251761] remove_proc_entry: removing non-empty directory 'irq/184',
leaking at least 'smp_affinity_list'
Remove the new file in the exit path.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/4DDDE094.6050505@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
cpu count is large.
Setting smp affinity to cpus 256 to 263 would be:
echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity
instead of:
echo 256-263 > smp_affinity_list
Think about what it looks like for cpus around say, 4088 to 4095.
We already have many alternate "list" interfaces:
/sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list
/sys/devices/system/cpu/cpuX/topology/core_siblings_list
/sys/devices/system/node/nodeX/cpulist
/sys/devices/pci***/***/local_cpulist
Add a companion interface, smp_affinity_list to use cpu lists instead of
cpu maps. This conforms to other companion interfaces where both a map
and a list interface exists.
This required adding a bitmap_parselist_user() function in a manner
similar to the bitmap_parse_user() function.
[akpm@linux-foundation.org: make __bitmap_parselist() static]
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Export handle_simple_irq, irq_modify_status, irq_alloc_descs,
irq_free_descs and generic_handle_irq to allow their usage in
modules. First user is IIO, which wants to be built modular, but needs
to be able to create irq chips, allocate and configure interrupt
descriptors and handle demultiplexing interrupts.
[ tglx: Moved the uninlinig of generic_handle_irq to a separate patch ]
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Link: http://lkml.kernel.org/r/%3C1305711544-505-1-git-send-email-jic23%40cam.ac.uk%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
generic_handle_irq() is missing a NULL pointer check for the result of
irq_to_desc. This was a not a big problem, but we want to expose it to
drivers, so we better have sanity checks in place. Add a return value
as well, which indicates that the irq number was valid and the handler
was invoked.
Based on the pure code move from Jonathan Cameron.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
kernel/irq/ is only built when CONFIG_GENERIC_HARDIRQS=y. So making
code inside of kernel/irq/ conditional on CONFIG_GENERIC_HARDIRQS is
pointless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
commit ab7798ffcf ("genirq: Expand generic
show_interrupts()") added the Kconfig option GENERIC_IRQ_SHOW_LEVEL to
accomodate PowerPC, but this doesn't actually enable the functionality due
to a typo in the #ifdef check.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linux/PPC Development <linuxppc-dev@lists.ozlabs.org>
Link: http://lkml.kernel.org/r/%3Calpine.DEB.2.00.1104302251370.19068%40ayla.of.borg%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
These callbacks are only called in the syscore suspend/resume code on
interrupt chips which have been registered via the generic irq chip
mechanism. Calling those callbacks per irq would be rather icky, but
with the generic irq chip mechanism we can call this per registered
chip.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Implement a generic interrupt chip, which is configurable and is able
to handle the most common irq chip implementations.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by; Kevin Hilman <khilman@ti.com>
This adds support for disabling threading on a per-IRQ basis via the IRQ
status instead of the IRQ flow, which is necessary for interrupts that
don't follow the natural IRQ flow channels, such as those that are
virtually created.
The new APIs added are simply:
irq_set_thread()
irq_set_nothread()
which follow the rest of the IRQ status routines.
Chained handlers also have IRQ_NOTHREAD set on them automatically, making
the lack of threading explicit rather than implicit. Subsequently, the
nothread flag can be viewed through the standard genirq debugging
facilities.
[ tglx: Fixed cleanup fallout ]
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Link: http://lkml.kernel.org/r/%3C20110406210135.GF18426%40linux-sh.org%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file
The allocated cpumask should be freed in __setup_irq().
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
LKML-Reference: <1301744375-6812-1-git-send-email-dfeng@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The only subtle difference is that alpha uses ACTUAL_NR_IRQS and
prints the IRQF_DISABLED flag.
Change the generic implementation to deal with ACTUAL_NR_IRQS if
defined.
The IRQF_DISABLED printing is pointless, as we nowadays run all
interrupts with irqs disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The late night fixup missed to convert the data type from irq_desc to
irq_data, which results in a harmless but annoying warning.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I missed the CONFIG_GENERIC_PENDING_IRQ dependency in the affinity
related functions and the IRQ_LEVEL propagation into irq_data
state. Did not pop up on my main test platforms. :(
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Daney <ddaney@caviumnetworks.com>
Fix new irq-related kernel-doc warnings in 2.6.38:
Warning(kernel/irq/manage.c:149): No description found for parameter 'mask'
Warning(kernel/irq/manage.c:149): Excess function parameter 'cpumask' description in 'irq_set_affinity'
Warning(include/linux/irq.h:161): No description found for parameter 'state_use_accessors'
Warning(include/linux/irq.h:161): Excess struct/union/enum/typedef member 'state_use_accessor' description in 'irq_data'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <20110318093356.b939558d.randy.dunlap@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is a replacment for the cell flow handler which is in the way of
cleanups. Must be selected to avoid general bloat.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We really need these flags for some of the interrupt chips. Move it
from internal state to irq_data and provide proper accessors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Daney <ddaney@caviumnetworks.com>
The .irq_cpu_online() and .irq_cpu_offline() functions may need to
adjust affinity, but they are called with the descriptor lock held.
Create __irq_set_affinity_locked() which is called with the lock held.
Make irq_set_affinity() just a wrapper that acquires the lock.
[ tglx: Changed the argument to irq_data, added a !desc check and
moved the !irq_set_affinity check where it belongs ]
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
LKML-Reference: <1301081931-11240-4-git-send-email-ddaney@caviumnetworks.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a flag which indicates that the on/offline callback should only be
called on enabled interrupts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx: Removed the enabled argument as this is now available in
irq_data ]
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
LKML-Reference: <1301081931-11240-3-git-send-email-ddaney@caviumnetworks.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some irq_chip implementation require to know the disabled state of the
interrupt in certain callbacks. Add a state flag and accessor to
irq_data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The helper macros and functions like for_each_active_irq() don't work
unless the irq is in the allocated_irqs set.
In the case of !CONFIG_SPARSE_IRQ, instead of forcing all users of the
irq infrastructure to explicitly call irq_reserve_irq(), do it for
them.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
LKML-Reference: <1301081931-11240-2-git-send-email-ddaney@caviumnetworks.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some archs want to print extra information for certain irq_chips which
is per irq and not per chip. Allow them to provide a chip callback to
print the chip name and the extra information.
PowerPC wants to print the LEVEL/EDGE type information. Make it configurable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
goto out_thread is called before we take the lock. It causes a gcc
warning: "kernel/irq/manage.c:858: warning: ‘flags’ may be used
uninitialized in this function"
[ tglx: Moved unlock before free_cpumask_var() ]
Signed-off-by: Dan Carpenter <error27@gmail.com>
LKML-Reference: <20110317114307.GJ2008@bicker>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On suspend we disable all interrupts in the core code, but this does
not mask the interrupt line in the default implementation as we use a
lazy disable approach. That means we mark the interrupt disabled, but
leave the hardware unmasked. That's an optimization because we avoid
the hardware access for the common case where no interrupt happens
after we marked it disabled. If an interrupt happens, then the
interrupt flow handler masks the line at the hardware level and marks
it pending.
Suspend makes use of this delayed disable as it "disables" all
interrupts when preparing the suspend transition. Right before the
system goes into hardware suspend state it checks whether one of the
interrupts which is marked as a wakeup interrupt came in after
disabling it.
Most interrupt chips have a separate register which selects the
interrupts which can wake up the system from suspend, so we don't have
to mask any on the non wakeup interrupts.
But now we have to deal with brilliant designed hardware which lacks
such a wakeup configuration facility. For such hardware it's necessary
to mask all non wakeup interrupts before going into suspend in order
to avoid the wakeup from random interrupts.
Rather than working around this in the affected interrupt chip
implementations we can solve this elegant in the core code itself.
Add a flag IRQCHIP_MASK_ON_SUSPEND which can be set by the irq chip
implementation to indicate, that the interrupts which are not selected
as wakeup sources must be masked in the suspend path. Mask them in the
loop which checks the wakeup interrupts pending flag.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
LKML-Reference: <alpine.LFD.2.00.1103112112310.2787@localhost6.localdomain6>
The fasteoi handler must mask the interrupt line in oneshot mode
otherwise we end up with an irq storm.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a commandline parameter "threadirqs" which forces all interrupts except
those marked IRQF_NO_THREAD to run threaded. That's mostly a debug option to
allow retrieving better debug data from crashing interrupt handlers. If
"threadirqs" is not enabled on the kernel command line, then there is no
impact in the interrupt hotpath.
Architecture code needs to select CONFIG_IRQ_FORCED_THREADING after
marking the interrupts which cant be threaded IRQF_NO_THREAD. All
interrupts which have IRQF_TIMER set are implict marked
IRQF_NO_THREAD. Also all PER_CPU interrupts are excluded.
Forced threading hard interrupts also forces all soft interrupt
handling into thread context.
When enabled it might slow down things a bit, but for debugging problems in
interrupt code it's a reasonable penalty as it does not immediately
crash and burn the machine when an interrupt handler is buggy.
Some test results on a Core2Duo machine:
Cache cold run of:
# time git grep irq_desc
non-threaded threaded
real 1m18.741s 1m19.061s
user 0m1.874s 0m1.757s
sys 0m5.843s 0m5.427s
# iperf -c server
non-threaded
[ 3] 0.0-10.0 sec 1.09 GBytes 933 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 934 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 933 Mbits/sec
threaded
[ 3] 0.0-10.0 sec 1.09 GBytes 939 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 934 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 937 Mbits/sec
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.772668648@linutronix.de>
Support ONESHOT on shared interrupts, if all drivers agree on it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.483640430@linutronix.de>
For level type interrupts we need to track how many threads are on
flight to avoid useless interrupt storms when not all thread handlers
have finished yet. Keep track of the woken threads and only unmask
when there are no more threads in flight.
Yes, I'm lazy and using a bitfield. But not only because I'm lazy, the
main reason is that it's way simpler than using a refcount. A refcount
based solution would need to keep track of various things like
crashing the irq thread, spurious interrupts coming in,
disables/enables, free_irq() and some more. The bitfield keeps the
tracking simple and makes things just work. It's also nicely confined
to the thread code pathes and does not require additional checks all
over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.388095876@linutronix.de>
The WARN_ON_ONCE in handle_percpu_event() which emits a warning when
an action handler returns with interrupts enabled is not really
useful. It does not reveal the interrupt number and handler function
which caused it. Make it WARN_ONCE() and add the information.
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
"def_bool n" without prompt is pointless, these should be just "bool".
[ tglx: Adapted to latest changes ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D5D3309020000780003264A@vpn.id2.novell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
note_interrupt wants to be called with the combined result of all
handlers called, not with the last one. If it's a shared interrupt
then the last handler might return IRQ_NONE often enough to trigger
the spurious dectector which turns off a perfectly fine working
interrupt line. Bug was introduced in commit 1277a532(genirq: Simplify
handle_irq_event()).
Yes, I really messed up there. First the variable ret should not have
been named differently to avoid similarity with retval. Second it
should have been declared in the do {} loop.
Rename it to res and move it into the do {} loop and vanish under a
huge brown paperbag.
Reported-bisected-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The switch case in __irq_set_trigger() lacks a break, which emits a
pr_err unconditionally on success.
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The runtime expansion of nr_irqs does not take into account that
bitmap_find_next_zero_area() returns "start" + size in case the search
for an matching zero area fails. That results in a start value which
can be completely off and is not covered by the following
expand_nr_irqs() and possibly outside of the absolute limit. But we
use it without further checking.
Use IRQ_BITMAP_BITS as the limit for the bitmap search and expand
nr_irqs when the start bit is beyond nr_irqs. So start is always
pointing to the correct area in the bitmap. nr_irqs is just the limit
for irq enumerations, not the real limit for the irq space.
[ tglx: Let irq_expand_nr_irqs() take the new upper end so we do not
expand nr_irqs more than necessary. Made changelog readable ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D6014F9.8040605@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We lazy disable interrupt lines, so only mark the line masked, when
the chip provides an irq_disable callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
No need to lookup the irq descriptor when calling from a chip callback
function which has irq_data already handy.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some chips want irq_eoi() only called when an interrupt is actually
handled. So they have checks for INPROGRESS and DISABLED in their
irq_eoi callbacks. Add a chip flag, which allows to handle that in the
generic code. No impact on the fastpath.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sparc64 needs to call a preflow handler on certain interrupts befor
calling the action chain. Integrate it into handle_fasteoi_irq. Must
be enabled via CONFIG_IRQ_FASTEOI_PREFLOW. No impact when disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David S. Miller <davem@davemloft.net>
Most of the managing functions get the irq descriptor and lock it -
either with or without buslock. Instead of open coding this over and
over provide a common function to do that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If everything uses the right accessors, then enabling
GENERIC_HARDIRQS_NO_COMPAT should just work. If not it will tell you.
Don't be lazy and use the trick which I use in the core code!
git grep status_use_accessors
will unearth it in a split second. Offenders are tracked down and not
slapped with stinking trouts. This time we use frozen shark for a
better educational value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some irq_chips need to know the state of wakeup mode for
setting the trigger type etc. Reflect it in irq_data state.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_chips, which require to mask the chip before changing the trigger
type should set this flag. So the core takes care of it and the
requirement for looking into desc->status in the chip goes away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
That's the data structure chip functions get provided. Also allow them
to signal the core code that they updated the flags in irq_data.state
by returning IRQ_SET_MASK_OK_NOCOPY. The default is unchanged.
The type bits should be accessed via:
val = irqd_get_trigger_type(irqdata);
and
irqd_set_trigger_type(irqdata, val);
Coders who access them directly will be tracked down and slapped with
stinking trouts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
That's the right data structure to look at for arch code.
Accessor functions are provided.
irqd_is_per_cpu(irqdata);
irqd_can_balance(irqdata);
Coders who access them directly will be tracked down and slapped with
stinking trouts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The saving of this switch is minimal versus the ifdef mess it
creates. Simple enable PER_CPU unconditionally and remove the config
switch.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
chip implementations need to know about it. Keep status in sync until
all users are fixed.
Accessor function: irqd_is_setaffinity_pending(irqdata)
Coders who access them directly will be tracked down and slapped with
stinking trouts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We need to maintain the flag for now in both fields status and istate.
Add a CONFIG_GENERIC_HARDIRQS_NO_COMPAT switch to allow testing w/o
the status one. Wrap the access to status IRQ_INPROGRESS in a inline
which can be turned of with CONFIG_GENERIC_HARDIRQS_NO_COMPAT along
with the define.
There is no reason that anything outside of core looks at this. That
needs some modifications, but we'll get there.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq_desc.status field will either go away or renamed to
settings. Anyway we need to maintain compatibility to avoid breaking
the world and some more. While moving bits into the core, I need to
avoid that I use any of the still existing IRQ_ bits in the core code
by typos. So that file will hold the inline wrappers and some nasty
CPP tricks to break the build when typoed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
That field will contain internal state information which is not going
to be exposed to anything outside the core code - except via accessor
functions. I'm tired of everyone fiddling in irq_desc.status.
core_internal_state__do_not_mess_with_it is clear enough, annoying to
type and easy to grep for. Offenders will be tracked down and slapped
with stinking trouts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All archs implement show_interrupts() in more or less the same
way. That's tons of duplicated code with different bugs with no
value. Implement a generic version and deprecate show_interrupts()
Unfortunately we need some ifdeffery for !GENERIC_HARDIRQ archs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's safe to drop the IRQ_INPROGRESS flag between action chain walks
as we are protected by desc->lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Core code replacement for the ugly camel case. It contains all the
code which is shared in all handlers.
clear status flags
set INPROGRESS flag
unlock
call action chain
note_interrupt
lock
clr INPROGRESS flag
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
IRQ_MASKED is set in mask_ack_irq() anyway. Remove it from
handle_edge_irq() to allow simpler ab^HHreuse of that function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110202212551.918484270@linutronix.de>
Now that everything uses the wrappers, we can remove the default
functions. None of those functions is performance critical.
That makes the IRQ_MASKED flag tracking fully consistent.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Aside of duplicated code some of the startup/shutdown sites do not
handle the MASKED/DISABLED flags and the depth field at all. Move that
to a helper function and take care of it there.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110202212551.787481468@linutronix.de>
The if (chip->irq_shutdown) check will always evaluate to true, as we
fill in chip->irq_shutdown with default_shutdown in
irq_chip_set_defaults() if the chip does not provide its own function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110202212551.667607458@linutronix.de>
With the chip.end() function gone we might run into a situation where
a poll call runs and the real interrupt comes in, sees IRQ_INPROGRESS
and disables the line. That might be a perfect working one, which will
then be masked forever.
So mark them polled while the poll runs. When the real handler sees
IRQ_INPROGRESS it checks the poll flag and waits for the polling to
complete. Add the necessary amount of sanity checks to it to avoid
deadlocks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is no point in polling disabled lines.
percpu does not make sense at all because we only poll on the cpu
we're currently running on. Also polling per_cpu interrupts is racy as
hell. The handler runs without locking so we might get a huge
surprise.
If the timer interrupt needs polling, then we wont get there anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
try_one_irq() contains redundant code and lots of useless checks for
shared interrupts. Check for shared before setting IRQ_INPROGRESS and
then call handle_IRQ_event() while pending. Shorter version with the
same functionality.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We run all handlers with interrupts disabled and expect them not to
enable them. Warn when we catch one who does.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We cannot walk the action chain unlocked. Even if IRQ_INPROGRESS is
set an action can be removed and we follow a null pointer. It's safe
to take the lock there, because the code which removes the action will
call synchronize_irq() which waits unlocked for IRQ_INPROGRESS going
away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
While rumaging through arch code I found that there are a few
workarounds which deal with the fact that the initial affinity setting
from request_irq() copies the mask into irq_data->affinity before the
chip code is called. In the normal path we unconditionally copy the
mask when the chip code returns 0.
Copy after the code is called and add a return code
IRQ_SET_MASK_OK_NOCOPY for the chip functions, which prevents the
copy. That way we see the real mask when the chip function decided to
truncate it further as some arches do. IRQ_SET_MASK_OK is 0, which is
the current behaviour.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If the affinity had been set by the user, then a later request_irq()
will honour that setting. But online cpus can have changed. So apply
the online mask and for this case as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is lot of #ifdef CONFIG_GENERIC_PENDING_IRQ along with
duplicated code in the irq core. Move the #ifdeffery into one place
and cleanup the code so it's readable. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq namespace has become quite convoluted. My bad. Clean it up
and deprecate the old functions. All new functions follow the scheme:
irq number based:
irq_set/get/xxx/_xxx(unsigned int irq, ...)
irq_data based:
irq_data_set/get/xxx/_xxx(struct irq_data *d, ....)
irq_desc based:
irq_desc_get_xxx(struct irq_desc *desc)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
chips behind a slow bus cannot update the chip under desc->lock, but
we miss the chip_buslock/chip_bus_sync_unlock() calls around the set
type and set wake functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We face more and more the requirement to expand nr_irqs at
runtime. The reason are irq expanders which can not be detected in the
early boot stage. So we speculate nr_irqs to have enough room. Further
Xen needs extra irq numbers and we really want to avoid adding more
"detection" code into the early boot. There is no real good reason why
we need to limit nr_irqs at early boot.
Allow the allocation code to expand nr_irqs. We have already 8k extra
number space in the allocation bitmap, so lets use it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With CONFIG_SHIRQ_DEBUG=y we call a newly installed interrupt handler
in request_threaded_irq().
The original implementation (commit a304e1b8) called the handler
_BEFORE_ it was installed, but that caused problems with handlers
calling disable_irq_nosync(). See commit 377bf1e4.
It's braindead in the first place to call disable_irq_nosync in shared
handlers, but ....
Moving this call after we installed the handler looks innocent, but it
is very subtle broken on SMP.
Interrupt handlers rely on the fact, that the irq core prevents
reentrancy.
Now this debug call violates that promise because we run the handler
w/o the IRQ_INPROGRESS protection - which we cannot apply here because
that would result in a possibly forever masked interrupt line.
A concurrent real hardware interrupt on a different CPU results in
handler reentrancy and can lead to complete wreckage, which was
unfortunately observed in reality and took a fricking long time to
debug.
Leave the code here for now. We want this debug feature, but that's
not easy to fix. We really should get rid of those
disable_irq_nosync() abusers and remove that function completely.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: stable@kernel.org # .28 -> .37
Lars-Peter Clausen pointed out:
I stumbled upon this while looking through the existing archs using
SPARSE_IRQ. Even with SPARSE_IRQ the NR_IRQS is still the upper
limit for the number of IRQs.
Both PXA and MMP set NR_IRQS to IRQ_BOARD_START, with
IRQ_BOARD_START being the number of IRQs used by the core.
In various machine files the nr_irqs field of the ARM machine
defintion struct is then set to "IRQ_BOARD_START + NR_BOARD_IRQS".
As a result "nr_irqs" will greater then NR_IRQS which then again
causes the "allocated_irqs" bitmap in the core irq code to be
accessed beyond its size overwriting unrelated data.
The core code really misses a sanity check there.
This went unnoticed so far as by chance the compiler/linker places
data behind that bitmap which gets initialized later on those affected
platforms.
So the obvious fix would be to add a sanity check in early_irq_init()
and break all affected platforms. Though that check wants to be
backported to stable as well, which will require to fix all known
problematic platforms and probably some more yet not known ones as
well. Lots of churn.
A way simpler solution is to allocate a slightly larger bitmap and
avoid the whole churn w/o breaking anything. Add a few warnings when
an arch returns utter crap.
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org # .37
Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reason: irq/for-mips is provided for mips to make core independent
progress. Merge it into irq/core to avoid conflicts
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_chips that supply .irq_bus_lock/.irq_bus_sync_unlock functions,
expect that the other chip methods will be called inside of calls to
the pair. If this expectation is not met, things tend to not work.
Make setup_irq() call chip_bus_lock()/chip_bus_sync_unlock() too.
For the vast majority of irq_chips, this will be a NOP as most don't
have these bus lock functions.
[ tglx: No we don't want to call that in __setup_irq(). Way too many
error exit pathes. ]
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
LKML-Reference: <1297296265-18680-1-git-send-email-ddaney@caviumnetworks.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CONFIG_KSTAT_IRQS_ONDEMAND does not exist. It's not worth to implement
it. Use sparse irqs if you care about memory consumption of the
interrupt layer.
Found by undertaker: http://vamos.informatik.uni-erlangen.de/trac/undertaker
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq/for-xen contains new functionality to avoid Xen private irq
hackery. That branch has a single irq commit and is pulled by Xen to
base their new features on.
Merge it into irq/core as other patches modify the same code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Xen needs to reenable interrupts which are marked IRQF_NO_SUSPEND in the
resume path. Add a flag to force the reenabling in the resume code.
Tested-and-acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
move_native_irq() masks and unmasks the interrupt line
unconditionally, but the interrupt line might be masked due to a
threaded oneshot handler in progress. Unmasking the line in that case
can lead to interrupt storms. Observed on PREEMPT_RT.
Originally-from: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
The new code of commit cd7eab44e(genirq: Add IRQ affinity notifiers)
references irq_desc.affinity which fails to compile with
CONFIG_GENERIC_HARDIRQS_NO_DEPRECATED=y.
Use irq_desc.irq_data.affinity instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ben Hutchings <bhutchings@solarflare.com>
When initiating I/O on a multiqueue and multi-IRQ device, we may want
to select a queue for which the response will be handled on the same
or a nearby CPU. This requires a reverse-map of IRQ affinity. Add a
notification mechanism to support this.
This is based closely on work by Thomas Gleixner <tglx@linutronix.de>.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: linux-net-drivers@solarflare.com
Cc: Tom Herbert <therbert@google.com>
Cc: David Miller <davem@davemloft.net>
LKML-Reference: <1295470904.11126.84.camel@bwh-desktop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All architectures are finally converted. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Use modern per_cpu API to increment {soft|hard}irq counters, and use
per_cpu allocation for (struct irq_desc)->kstats_irq instead of an array.
This gives better SMP/NUMA locality and saves few instructions per irq.
With small nr_cpuids values (8 for example), kstats_irq was a small array
(less than L1_CACHE_BYTES), potentially source of false sharing.
In the !CONFIG_SPARSE_IRQ case, remove the huge, NUMA/cache unfriendly
kstat_irqs_all[NR_IRQS][NR_CPUS] array.
Note: we still populate kstats_irq for all possible irqs in
early_irq_init(). We probably could use on-demand allocations. (Code
included in alloc_descs()). Problem is not all IRQS are used with a prior
alloc_descs() call.
kstat_irqs_this_cpu() is not used anymore, remove it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Function-scope statics are discouraged because they are
easily overlooked and can cause subtle bugs/races due to
their global (non-SMP safe) nature.
Linus noticed that we did this for sched_param - at minimum
make the const.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: Message-ID: <AANLkTinotRxScOHEb0HgFgSpGPkq_6jKTv5CfvnQM=ee@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since commit a1afb637(switch /proc/irq/*/spurious to seq_file) all
/proc/irq/XX/spurious files show the information of irq 0.
Current irq_spurious_proc_open() passes on NULL as the 3rd argument,
which is used as an IRQ number in irq_spurious_proc_show(), to the
single_open(). Because of this, all the /proc/irq/XX/spurious file
shows IRQ 0 information regardless of the IRQ number.
To fix the problem, irq_spurious_proc_open() must pass on the
appropreate data (IRQ number) to single_open().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
LKML-Reference: <4CF4B778.90604@jp.fujitsu.com>
Cc: stable@kernel.org [2.6.33+]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix up irq_node() for irq_data changes.
genirq: Add single IRQ reservation helper
genirq: Warn if enable_irq is called before irq is set up
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
semaphore: Remove mutex emulation
staging: Final semaphore cleanup
jbd2: Convert jbd2_slab_create_sem to mutex
hpfs: Convert sbi->hpfs_creation_de to mutex
Fix up trivial change/delete conflicts with deleted 'dream' drivers
(drivers/staging/dream/camera/{mt9d112.c,mt9p012_fox.c,mt9t013.c,s5k3e2fx.c})
In /proc/stat, the number of per-IRQ event is shown by making a sum each
irq's events on all cpus. But we can make use of kstat_irqs().
kstat_irqs() do the same calculation, If !CONFIG_GENERIC_HARDIRQ,
it's not a big cost. (Both of the number of cpus and irqs are small.)
If a system is very big and CONFIG_GENERIC_HARDIRQ, it does
for_each_irq()
for_each_cpu()
- look up a radix tree
- read desc->irq_stat[cpu]
This seems not efficient. This patch adds kstat_irqs() for
CONFIG_GENRIC_HARDIRQ and change the calculation as
for_each_irq()
look up radix tree
for_each_cpu()
- read desc->irq_stat[cpu]
This reduces cost.
A test on (4096cpusp, 256 nodes, 4592 irqs) host (by Jack Steiner)
%time cat /proc/stat > /dev/null
Before Patch: 2.459 sec
After Patch : .561 sec
[akpm@linux-foundation.org: unexport kstat_irqs, coding-style tweaks]
[akpm@linux-foundation.org: fix unused variable 'per_irq_sum']
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Jack Steiner <steiner@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton pointed out almost all sched_setscheduler() callers are
using fixed parameters and can be converted to static. It reduces runtime
memory use a little.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The recent changes in the genirq core unearthed a bug in arch/um which
called enable_irq() before the interrupt was set up.
Warn and return instead of crashing the machine with a NULL pointer
dereference.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Weinberger <richard@nod.at>
This option can be set to verify the full conversion to the new chip
functions. Fix the fallout of the patch rework, so the core code
compiles and works with it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The allocator functions are now called outside of preempt disabled
regions. Switch to GFP_KERNEL.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The move_irq_desc() function was only used due to the problem that the
allocator did not free the old descriptors. So the descriptors had to
be moved in create_irq_nr(). That's history.
The code would have never been able to move active interrupt
descriptors on affinity settings. That can be done in a completely
different way w/o all this horror.
Remove all of it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Use the cleanup functions of the dynamic allocator. No need to have
separate implementations.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
This function should have not been there in the first place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
sparse irq sets up NR_IRQS_LEGACY irq descriptors and archs then go
ahead and allocate more.
Use the unused return value of arch_probe_nr_irqs() to let the
architecture return the number of early allocations. Fix up all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Make irq_to_desc_alloc_node() a wrapper around the new allocator.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Mark a range of interrupts as allocated. In the SPARSE_IRQ=n case we
need this to update the bitmap for the legacy irqs so the enumerator
via irq_get_next_irq() works.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
/proc/irq never removes any entries, but when irq descriptors can be
freed for real this is necessary. Otherwise we'd reference a freed
descriptor in /proc/irq/N
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The current sparse_irq allocator has several short comings due to
failures in the design or the lack of it:
- Requires iteration over the number of active irqs to find a free slot
(Some architectures have grown their own workarounds for this)
- Removal of entries is not possible
- Racy between create_irq_nr and destroy_irq (plugged by horrible
callbacks)
- Migration of active irq descriptors is not possible
- No bulk allocation of irq ranges
- Sprinkeled irq_desc references all over the place outside of kernel/irq/
(The previous chip functions series is addressing this issue)
Implement a sane allocator which fixes the above short comings (though
migration of active descriptors needs a full tree wide cleanup of the
direct and mostly unlocked access to irq_desc).
The new allocator still uses a radix_tree, but uses a bitmap for
keeping track of allocated irq numbers. That allows:
- Fast lookup of a free slot
- Allows the removal of descriptors
- Prevents the create/destroy race
- Bulk allocation of consecutive irq ranges
- Basic design is ready for migration of life descriptors after
further cleanups
The bitmap is also used in the SPARSE_IRQ=n case for lookup and
raceless (de)allocation of irq numbers. So it removes the requirement
for looping through the descriptor array to find slots.
Right now it uses sparse_irq_lock to protect the bitmap and the radix
tree, but after cleaning up all users we should be able convert that
to a mutex and to switch the radix_tree and decriptor allocations to
GFP_KERNEL.
[ Folded in a bugfix from Yinghai Lu ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Arch code sets it's own irq_desc.status flags right after boot and for
dynamically allocated interrupts. That might involve iterating over a
huge array.
Allow ARCH_IRQ_INIT_FLAGS to set separate flags aside of IRQ_DISABLED
which is the default.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The statistics accessor is only used by proc/stats and
show_interrupts(). Both are compiled in.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
early_init_irq_lock_class() is called way before anything touches the
irq descriptors. In case of SPARSE_IRQ=y this is a NOP operation
because the radix tree is empty at this point. For the SPARSE_IRQ=n
case it's sufficient to set the lock class in early_init_irq().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
kernel/irq/handle.c has become a dumpground for random code in random
order. Split out the irq descriptor management and the dummy irq_chip
implementation into separate files. Cleanup the include maze while at
it.
No code change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Get the data structure from the core and provide inline wrappers to
access the irq_data members.
Provide accessor inlines for irq_data as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Provide a irq_desc.status modifier function to cleanup the direct
access to irq_desc in arch and driver code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
This option covers now the old chip functions and the irq_desc data
fields which are moving to struct irq_data. More stuff will follow.
Pretty handy for testing a conversion, whether something broke or not.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function retrigger() until the migration is complete
and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121843.025801092@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function set_wake() until the migration is complete
and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.927527393@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function set_type() until the migration is complete
and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.832261548@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function set_affinity() until the migration is
complete and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.732894108@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function startup() until the migration is complete and
the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.635152961@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip functions disable() and shutdown() until the
migration is complete and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.532070631@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function enable() until the migration is complete and
the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.437159182@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function eoi() until the migration is complete and
the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.339657617@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function mask_ack() until the migration is complete
and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.240806983@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function ack() until the migration is complete and
the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.142624725@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function unmask() until the migration is complete
and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121842.043608928@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip function mask() until the migration is complete and
the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121841.940355859@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Wrap the old chip functions for bus_lock/bus_sync_unlock until the
migration is complete and the old chip functions are removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121841.842536121@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The compat functions go away when the core code is converted.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Convert all references in the core code to orq, chip, handler_data,
chip_data, msi_desc, affinity to irq_data.*
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Low level chip functions need access to irq_desc->handler_data,
irq_desc->chip_data and irq_desc->msi_desc. We hand down the irq
number to the low level functions, so they need to lookup irq_desc.
With sparse irq this means a radix tree lookup.
We could hand down irq_desc itself, but low level chip functions have
no need to fiddle with it directly and we want to restrict access to
irq_desc further.
Preparatory patch for new chip functions.
Note, that the ugly anon union/struct is there to avoid a full tree
wide clean up for now. This is not going to last 3 years like __do_IRQ()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121841.645542300@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The generic irq Kconfig options are copied around all archs. Provide a
generic Kconfig file which can be included.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121843.217333624@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
3 years transition phase is enough. Cleanup the last users and remove
the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Leo Chen <leochen@broadcom.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.
Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The set_type() function can change the chip implementation when the
trigger mode changes. That might result in using an non-initialized
irq chip when called from __setup_irq() or when called via
set_irq_type() on an already enabled irq.
The set_irq_type() function should not be called on an enabled irq,
but because we forgot to put a check into it, we have a bunch of users
which grew the habit of doing that and it never blew up as the
function is serialized via desc->lock against all users of desc->chip
and they never hit the non-initialized irq chip issue.
The easy fix for the __setup_irq() issue would be to move the
irq_chip_set_defaults(desc->chip) call after the trigger setting to
make sure that a chip change is covered.
But as we have already users, which do the type setting after
request_irq(), the safe fix for now is to call irq_chip_set_defaults()
from __irq_set_trigger() when desc->set_type() changed the irq chip.
It needs a deeper analysis whether we should refuse to change the chip
on an already enabled irq, but that'd be a large scale change to fix
all the existing users. So that's neither stable nor 2.6.35 material.
Reported-by: Esben Haabendal <eha@doredevelopment.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>
Cc: stable@kernel.org
When an interrupt is disabled and torn down, the CPU mask returned
through affinity_hint right now is all CPUs. Also, for drivers that
don't provide an affinity_hint mask, this can be misleading. There
should be no hint at all, meaning an empty CPU mask.
[ tglx: use zalloc_cpumask_var instead of clearing it under the lock ]
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: davem@davemloft.net
Cc: arjan@linux.jf.intel.com
Cc: bhutchings@solarflare.com
LKML-Reference: <20100505205638.5426.87189.stgit@ppwaskie-hc2.jf.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds a cpumask affinity hint to the irq_desc structure,
along with a registration function and a read-only proc entry for each
interrupt.
This affinity_hint handle for each interrupt can be used by underlying
drivers that need a better mechanism to control interrupt affinity.
The underlying driver can register a cpumask for the interrupt, which
will allow the driver to provide the CPU mask for the interrupt to
anything that requests it. The intent is to extend the userspace
daemon, irqbalance, to help hint to it a preferred CPU mask to balance
the interrupt into.
[ tglx: Fixed compile warnings, added WARN_ON, made SMP only ]
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: davem@davemloft.net
Cc: arjan@linux.jf.intel.com
Cc: bhutchings@solarflare.com
LKML-Reference: <20100430214445.3992.41647.stgit@ppwaskie-hc2.jf.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove all code which is related to IRQF_DISABLED from the core kernel
code. IRQF_DISABLED still exists as a flag, but becomes a NOOP and
will be removed after a grace period. That way we can easily revert to
the previous behaviour by just restoring the core code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@osdl.org>
LKML-Reference: <20100326000405.991244690@linutronix.de>
Running interrupt handlers with interrupts enabled can cause stack
overflows. That has been observed with multiqueue NICs delivering all
their interrupts to a single core. We might band aid that somehow by
checking the interrupt stacks, but the real safe fix is to run the irq
handlers with interrupts disabled.
Drivers for whacky hardware still can reenable them in the handler
itself, if the need arises. (They do already due to lockdep)
The risk of doing this is rather low:
- lockdep already enforces this
- CONFIG_NOHZ has shaken out the drivers which relied on jiffies updates
- time keeping is not longer sensitive to the timer interrupt being delayed
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@osdl.org>
LKML-Reference: <20100326000405.758579387@linutronix.de>
Now that we enjoy threaded interrupts, we're starting to see irq_chip
implementations (wm831x, pca953x) that make use of threaded interrupts
for the controller, and nested interrupts for the client interrupt. It
all works very well, with one drawback:
Drivers requesting an IRQ must now know whether the handler will
run in a thread context or not, and call request_threaded_irq() or
request_irq() accordingly.
The problem is that the requesting driver sometimes doesn't know
about the nature of the interrupt, specially when the interrupt
controller is a discrete chip (typically a GPIO expander connected
over I2C) that can be connected to a wide variety of otherwise perfectly
supported hardware.
This patch introduces the request_any_context_irq() function that mostly
mimics the usual request_irq(), except that it checks whether the irq
level is configured as nested or not, and calls the right backend.
On success, it also returns either IRQC_IS_HARDIRQ or IRQC_IS_NESTED.
[ tglx: Made return value an enum, simplified code and made the export
of request_any_context_irq GPL ]
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Cc: <joachim.eastwood@jotron.com>
LKML-Reference: <927ea285bd0c68934ddae1a47e44a9ba@localhost>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Force MSI irq handlers to run with interrupts disabled
Network folks reported that directing all MSI-X vectors of their multi
queue NICs to a single core can cause interrupt stack overflows when
enough interrupts fire at the same time.
This is caused by the fact that we run interrupt handlers by default
with interrupts enabled unless the driver reuqests the interrupt with
the IRQF_DISABLED set. The NIC handlers do not set this flag, so
simultaneous interrupts can nest unlimited and cause the stack
overflow.
The only safe counter measure is to run the interrupt handlers with
interrupts disabled. We can't switch to this mode in general right
now, but it is safe to do so for MSI interrupts.
Force IRQF_DISABLED for MSI interrupt handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Linus Torvalds <torvalds@osdl.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: stable@kernel.org
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Move two IRQ functions from .init.text to .text
genirq: Protect access to irq_desc->action in can_request_irq()
genirq: Prevent oneshot irq thread race
Both functions should not be marked as __init, since they be called
from modules after the init section is freed.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1269431961-5731-1-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
can_request_irq() accesses and dereferences irq_desc->action w/o
holding irq_desc->lock. So action can be freed on another CPU before
it's dereferenced. Unlikely, but ...
Protect it with desc->lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Expose irq_desc->node as /proc/irq/*/node.
This file provides device hardware locality information for apps
desiring to include hardware locality in irq mapping decisions.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Lars-Peter pointed out that the oneshot threaded interrupt handler
code has the following race:
CPU0 CPU1
hande_level_irq(irq X)
mask_ack_irq(irq X)
handle_IRQ_event(irq X)
wake_up(thread_handler)
thread handler(irq X) runs
finalize_oneshot(irq X)
does not unmask due to
!(desc->status & IRQ_MASKED)
return from irq
does not unmask due to
(desc->status & IRQ_ONESHOT)
This leaves the interrupt line masked forever.
The reason for this is the inconsistent handling of the IRQ_MASKED
flag. Instead of setting it in the mask function the oneshot support
sets the flag after waking up the irq thread.
The solution for this is to set/clear the IRQ_MASKED status whenever
we mask/unmask an interrupt line. That's the easy part, but that
cleanup opens another race:
CPU0 CPU1
hande_level_irq(irq)
mask_ack_irq(irq)
handle_IRQ_event(irq)
wake_up(thread_handler)
thread handler(irq) runs
finalize_oneshot_irq(irq)
unmask(irq)
irq triggers again
handle_level_irq(irq)
mask_ack_irq(irq)
return from irq due to IRQ_INPROGRESS
return from irq
does not unmask due to
(desc->status & IRQ_ONESHOT)
This requires that we synchronize finalize_oneshot_irq() with the
primary handler. If IRQ_INPROGESS is set we wait until the primary
handler on the other CPU has returned before unmasking the interrupt
line again.
We probably have never seen that problem because it does not happen on
UP and on SMP the irqbalancer protects us by pinning the primary
handler and the thread to the same CPU.
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Use radix_tree irq_desc_tree instead of irq_desc_ptrs.
-v2: according to Eric and cyrill to use radix_tree_lookup_slot and
radix_tree_replace_slot
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-32-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add replace_irq_desc() instead of poking at the array directly.
-v2: remove unneeded boundary check in replace_irq_desc
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-31-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
mem_init is moved early already.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-29-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Keep chip_data in create_irq_nr and destroy_irq.
When two drivers are setting up MSI-X at the same time via
pci_enable_msix() there is a race. See this dmesg excerpt:
[ 85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
[ 85.170611] alloc irq_desc for 99 on node -1
[ 85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
[ 85.170614] alloc kstat_irqs on node -1
[ 85.170616] alloc irq_2_iommu on node -1
[ 85.170617] alloc irq_desc for 100 on node -1
[ 85.170619] alloc kstat_irqs on node -1
[ 85.170621] alloc irq_2_iommu on node -1
[ 85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
[ 85.170626] alloc irq_desc for 101 on node -1
[ 85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
[ 85.170630] alloc kstat_irqs on node -1
[ 85.170631] alloc irq_2_iommu on node -1
[ 85.170635] alloc irq_desc for 102 on node -1
[ 85.170636] alloc kstat_irqs on node -1
[ 85.170639] alloc irq_2_iommu on node -1
[ 85.170646] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000088
As you can see igb and ixgbe are both alternating on create_irq_nr()
via pci_enable_msix() in their probe function.
ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
NULL via dynamic_irq_init().
igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:
cfg_new = irq_desc_ptrs[102]->chip_data;
if (cfg_new->vector != 0)
continue;
This hits the NULL deref.
Another possible race exists via pci_disable_msix() in a driver or in
the number of error paths that call free_msi_irqs():
destroy_irq()
dynamic_irq_cleanup() which sets desc->chip_data = NULL
...race window...
desc->chip_data = cfg;
Remove the save and restore code for cfg in create_irq_nr() and
destroy_irq() and take the desc->lock when checking the irq_cfg.
Reported-and-analyzed-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-3-git-send-email-yinghai@kernel.org>
Signed-off-by: Brandon Phililps <bphilips@suse.de>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
* 'timers-for-linus-ntp' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ntp: Provide compability defines (You say MOD_NANO, I say ADJ_NANO)
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: do not execute DEBUG_SHIRQ when irq setup failed
single_open data argument must be PDE(inode)->data instead of NULL
otherwise seq_file->private is always NULL and we always read the
spurious data of irq 0.
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx: compacted it a bit ]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
LKML-Reference: <20090828181743.GA14050@x200.localdomain>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If a parent directory (ie /proc/irq/<irq>) could not be created
we should not attempt to create subdirectories. Otherwise it
would lead that "smp_affinity" and "spurious" entries are may be
registered under /proc root instead of a proper place.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20091026202811.GD5321@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
commit 74296a8ed added this function for debug purposes, but it was
never used for anything. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix docbook comments to match the actual function names
(set_irq_msi, handle_percpu_irq).
Signed-off-by: Liuwenyi <qingshenlwy@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
* 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Do not mask oneshot edge type interrupts
genirq: Support nested threaded irq handling
genirq: Add buslock support
genirq: Add oneshot support
when there is no ram on node 0.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <4A95C32D.5040605@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Masking oneshot edge type interrupts is wrong as we might lose an
interrupt which is issued when the threaded handler is handling the
device. We can keep the irq unmasked safely as with edge type
interrupts there is no danger of interrupt floods. If the threaded
handler has not yet finished then IRQTF_RUNTHREAD is set which will
keep the handler thread active.
Debugged and verified in preempt-rt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The wake_up_process() of the new irq thread in __setup_irq() is too
early as the irqaction is not yet fully initialized especially
action->irq is not yet set. The interrupt thread might dereference the
wrong irq descriptor.
Move the wakeup after the action is installed and action->irq has been
set.
Reported-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Buesch <mb@bu3sch.de>
Interrupt chips which are behind a slow bus (i2c, spi ...) and
demultiplex other interrupt sources need to run their interrupt
handler in a thread.
The demultiplexed interrupt handlers need to run in thread context as
well and need to finish before the demux handler thread can reenable
the interrupt line. So the easiest way is to run the sub device
handlers in the context of the demultiplexing handler thread.
To avoid that a separate thread is created for the subdevices the
function set_nested_irq_thread() is provided which sets the
IRQ_NESTED_THREAD flag in the interrupt descriptor.
A driver which calls request_threaded_irq() must not be aware of the
fact that the threaded handler is called in the context of the
demultiplexing handler thread. The setup code checks the
IRQ_NESTED_THREAD flag which was set from the irq chip setup code and
does not setup a separate thread for the interrupt. The primary
function which is provided by the device driver is replaced by an
internal dummy function which warns when it is called.
For the demultiplexing handler a helper function handle_nested_irq()
is provided which calls the demux interrupt thread function in the
context of the caller and does the proper interrupt accounting and
takes the interrupt disabled status of the demultiplexed subdevice
into account.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Trilok Soni <soni.trilok@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Brian Swetland <swetland@google.com>
Cc: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: m.szyprowski@samsung.com
Cc: t.fujak@samsung.com
Cc: kyungmin.park@samsung.com,
Cc: David Brownell <david-b@pacbell.net>
Cc: Daniel Ribeiro <drwyrm@gmail.com>
Cc: arve@android.com
Cc: Barry Song <21cnbao@gmail.com>
Some interrupt chips are connected to a "slow" bus (i2c, spi ...). The
bus access needs to sleep and therefor cannot be called in atomic
contexts.
Some of the generic interrupt management functions like disable_irq(),
enable_irq() ... call interrupt chip functions with the irq_desc->lock
held and interrupts disabled. This does not work for such devices.
Provide a separate synchronization mechanism for such interrupt
chips. The irq_chip structure is extended by two optional functions
(bus_lock and bus_sync_and_unlock).
The idea is to serialize the bus access for those operations in the
core code so that drivers which are behind that bus operated interrupt
controller do not have to worry about it and just can use the normal
interfaces. To achieve this we add two function pointers to the
irq_chip: bus_lock and bus_sync_unlock.
bus_lock() is called to serialize access to the interrupt controller
bus.
Now the core code can issue chip->mask/unmask ... commands without
changing the fast path code at all. The chip implementation merily
stores that information in a chip private data structure and
returns. No bus interaction as these functions are called from atomic
context.
After that bus_sync_unlock() is called outside the atomic context. Now
the chip implementation issues the bus commands, waits for completion
and unlocks the interrupt controller bus.
The irq_chip implementation as pseudo code:
struct irq_chip_data {
struct mutex mutex;
unsigned int irq_offset;
unsigned long mask;
unsigned long mask_status;
}
static void bus_lock(unsigned int irq)
{
struct irq_chip_data *data = get_irq_desc_chip_data(irq);
mutex_lock(&data->mutex);
}
static void mask(unsigned int irq)
{
struct irq_chip_data *data = get_irq_desc_chip_data(irq);
irq -= data->irq_offset;
data->mask |= (1 << irq);
}
static void unmask(unsigned int irq)
{
struct irq_chip_data *data = get_irq_desc_chip_data(irq);
irq -= data->irq_offset;
data->mask &= ~(1 << irq);
}
static void bus_sync_unlock(unsigned int irq)
{
struct irq_chip_data *data = get_irq_desc_chip_data(irq);
if (data->mask != data->mask_status) {
do_bus_magic_to_set_mask(data->mask);
data->mask_status = data->mask;
}
mutex_unlock(&data->mutex);
}
The device drivers can use request_threaded_irq, free_irq, disable_irq
and enable_irq as usual with the only restriction that the calls need
to come from non atomic context.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Trilok Soni <soni.trilok@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Brian Swetland <swetland@google.com>
Cc: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: m.szyprowski@samsung.com
Cc: t.fujak@samsung.com
Cc: kyungmin.park@samsung.com,
Cc: David Brownell <david-b@pacbell.net>
Cc: Daniel Ribeiro <drwyrm@gmail.com>
Cc: arve@android.com
Cc: Barry Song <21cnbao@gmail.com>
For threaded interrupt handlers we expect the hard interrupt handler
part to mask the interrupt on the originating device. The interrupt
line itself is reenabled after the hard interrupt handler has
executed.
This requires access to the originating device from hard interrupt
context which is not always possible. There are devices which can only
be accessed via a bus (i2c, spi, ...). The bus access requires thread
context. For such devices we need to keep the interrupt line masked
until the threaded handler has executed.
Add a new flag IRQF_ONESHOT which allows drivers to request that the
interrupt is not unmasked after the hard interrupt context handler has
been executed and the thread has been woken. The interrupt line is
unmasked after the thread handler function has been executed.
Note that for now IRQF_ONESHOT cannot be used with IRQF_SHARED to
avoid complex accounting mechanisms.
For oneshot interrupts the primary handler simply returns
IRQ_WAKE_THREAD and does nothing else. A generic implementation
irq_default_primary_handler() is provided to avoid useless copies all
over the place. It is automatically installed when
request_threaded_irq() is called with handler=NULL and
thread_fn!=NULL.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Trilok Soni <soni.trilok@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Brian Swetland <swetland@google.com>
Cc: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: m.szyprowski@samsung.com
Cc: t.fujak@samsung.com
Cc: kyungmin.park@samsung.com,
Cc: David Brownell <david-b@pacbell.net>
Cc: Daniel Ribeiro <drwyrm@gmail.com>
Cc: arve@android.com
Cc: Barry Song <21cnbao@gmail.com>
free_irq() can remove an irqaction while the corresponding interrupt
is in progress, but free_irq() sets action->thread to NULL
unconditionally, which might lead to a NULL pointer dereference in
handle_IRQ_event() when the hard interrupt context tries to wake up
the handler thread.
Prevent this by moving the thread stop after synchronize_irq(). No
need to set action->thread to NULL either as action is going to be
freed anyway.
This fixes a boot crash reported against preempt-rt which uses the
mainline irq threads code to implement full irq threading.
[ tglx: removed local irqthread variable ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This takes care of the following entry from Dan's list:
kernel/irq/resend.c +73 check_irq_resend(17) warning: variable derefenced before check 'desc->chip'
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Julia Lawall <julia@diku.dk>
LKML-Reference: <200908062146.03638.bzolnier@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Don't move it if target node is -1.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4A785B5D.4070702@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since genirq: Delegate irq affinity setting to the irq thread
(591d2fb02e) compilation with
CONFIG_SMP=n fails with following error:
/usr/src/linux-2.6/kernel/irq/manage.c:
In function 'irq_thread_check_affinity':
/usr/src/linux-2.6/kernel/irq/manage.c:475:
error: 'struct irq_desc' has no member named 'affinity'
make[4]: *** [kernel/irq/manage.o] Error 1
That commit adds a new function irq_thread_check_affinity() which
uses struct irq_desc.affinity which is only available for CONFIG_SMP=y.
Move that function under #ifdef CONFIG_SMP.
[ tglx@brownpaperbag: compile and boot tested on UP and SMP ]
Signed-off-by: Bruno Premont <bonbons@linux-vserver.org>
LKML-Reference: <20090722222232.2eb3e1c4@neptune.home>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq_set_thread_affinity() calls set_cpus_allowed_ptr() which might
sleep, but irq_set_thread_affinity() is called with desc->lock held
and can be called from hard interrupt context as well. The code has
another bug as it does not hold a ref on the task struct as required
by set_cpus_allowed_ptr().
Just set the IRQTF_AFFINITY bit in action->thread_flags. The next time
the thread runs it migrates itself. Solves all of the above problems
nicely.
Add kerneldoc to irq_set_thread_affinity() while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
The kerneldoc comment describing suspend_device_irqs() is currently
misleading, because generally the function doesn't really disable
interrupt lines at the chip level. Replace it with a more accurate
one.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
LKML-Reference: <200907050022.35117.rjw@sisk.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq, irq.h: Fix kernel-doc warnings
genirq: fix comment to say IRQ_WAKE_THREAD
Don't hardcode to node zero for early boot IRQ setup memory allocations.
[ penberg@cs.helsinki.fi: minor cleanups ]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
Revert "x86, bts: reenable ptrace branch trace support"
tracing: do not translate event helper macros in print format
ftrace/documentation: fix typo in function grapher name
tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
tracing: add protection around module events unload
tracing: add trace_seq_vprint interface
tracing: fix the block trace points print size
tracing/events: convert block trace points to TRACE_EVENT()
ring-buffer: fix ret in rb_add_time_stamp
ring-buffer: pass in lockdep class key for reader_lock
tracing: add annotation to what type of stack trace is recorded
tracing: fix multiple use of __print_flags and __print_symbolic
tracing/events: fix output format of user stack
tracing/events: fix output format of kernel stack
tracing/trace_stack: fix the number of entries in the header
ring-buffer: discard timestamps that are at the start of the buffer
ring-buffer: try to discard unneeded timestamps
ring-buffer: fix bug in ring_buffer_discard_commit
ftrace: do not profile functions when disabled
tracing: make trace pipe recognize latency format flag
...
Presently non-legacy IRQs have their irq_desc allocated with
kzalloc_node(). This assumes that all callers of irq_to_desc_node_alloc()
will be sufficiently late in the boot process that kmalloc is available.
While porting sparseirq support to sh this blew up immediately, as at the
time that we register the CPU's interrupt vector map only bootmem is
available. Check slab_is_available() to work out which path to use.
[ Impact: fix SH early boot crash with sparseirq enabled ]
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
LKML-Reference: <20090522014008.GA2806@linux-sh.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Trying to implement a driver to use threaded irqs, I was confused when the
return value to use that was described in the comment above
request_threaded_irq was not defined.
Turns out that the enum is IRQ_WAKE_THREAD where as the comment said
IRQ_THREAD_WAKE.
[Impact: do not confuse developers with wrong comments ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <alpine.DEB.2.00.0905121431020.13338@gandalf.stny.rr.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge reason: both topics modify the APIC code but were able to do it in
parallel so far. An upcoming patch generates a conflict so
merge them to avoid the conflict.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: tracing/core was on a .30-rc1 base and was missing out on
on a handful of tracing fixes present in .30-rc5-almost.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
move_irq_desc() will try to move irq_desc to the home node if
the allocated one is not correct, in create_irq_nr().
( This can happen on devices that are on different nodes that
are using MSI, when drivers are loaded and unloaded randomly. )
v2: fix non-smp build
v3: add NUMA_IRQ_DESC to eliminate #ifdefs
[ Impact: improve irq descriptor locality on NUMA systems ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F95EAE.2050903@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 044d408409.
The commit added a warning when handle_IRQ_event() is called outside
of hard interrupt context. This breaks the generic tasklet based
interrupt resend mechanism which is used when the hardware has no way
to retrigger the interrupt. So we get a warning for a use case which
is correct and worked for years. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
"tracing: create automated trace defines" causes this compile error on s390,
as reported by Sachin Sant against linux-next:
kernel/built-in.o: In function `__do_softirq':
(.text+0x1c680): undefined reference to `__tracepoint_softirq_entry'
This happens because the definitions of the softirq tracepoints were moved
from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support
generic hardirqs handle.c doesn't get compiled and the definitions are
missing.
So move the tracepoints to softirq.c again.
[ Impact: fix build failure on s390 ]
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <20090429135139.5fac79b8@osiris.boeblingen.de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This simplifies the node awareness of the code. All our allocators
only deal with a NUMA node ID locality not with CPU ids anyway - so
there's no need to maintain (and transform) a CPU id all across the
IRq layer.
v2: keep move_irq_desc related
[ Impact: cleanup, prepare IRQ code to be NUMA-aware ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
LKML-Reference: <49F65536.2020300@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
irq_set_affinity() and move_masked_irq() try to assign affinity
before calling chip set_affinity(). Some archs are assigning it
in ->set_affinity() again.
We do something like:
cpumask_cpy(desc->affinity, mask);
desc->chip->set_affinity(mask);
But in the failure path, affinity should not be touched - otherwise
we'll end up with a different affinity mask despite the failure to
migrate the IRQ.
So try to update the afffinity only if set_affinity returns with 0.
Also call irq_set_thread_affinity accordingly.
v2: update after "irq, x86: Remove IRQ_DISABLED check in process context IRQ move"
v3: according to Ingo, change set_affinity() in irq_chip should return int.
v4: update comments by removing moving irq_desc code.
[ Impact: fix /proc/irq/*/smp_affinity setting corner case bug ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F65509.60307@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The original feature of migrating irq_desc dynamic was too fragile
and was causing problems: it caused crashes on systems with lots of
cards with MSI-X when user-space irq-balancer was enabled.
We now have new patches that create irq_desc according to device
numa node. This patch removes the leftover bits of the dynamic balancer.
[ Impact: remove dead code ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F654AF.8000808@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CPUMASKS_OFFSTACK is not defined anywhere (it is CPUMASK_OFFSTACK).
It is a typo and init_allocate_desc_masks() is called before it set
affinity to all cpus...
Split init_alloc_desc_masks() into all_desc_masks() and init_desc_masks().
Also use CPUMASK_OFFSTACK in alloc_desc_masks().
[ Impact: fix smp_affinity copying/setup when moving irq_desc between CPUs ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <49F6546E.3040406@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When requesting an IRQ, the DEBUG_SHIRQ code executes a fake IRQ just to make
sure the driver is ready to receive an IRQ immediately. The problem was that
this fake IRQ was being executed even if interrupt line failed to be allocated
by __setup_irq.
Signed-off-by: Luis Henriques <henrix@sapo.pt>
LKML-Reference: <20090401170635.GA4392@hades.domain.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ fixed bug pointed out by a warning reported by Stephen Rothwell ]
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up
Create a sub directory in include/trace called events to keep the
trace point headers in their own separate directory. Only headers that
declare trace points should be defined in this directory.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch lowers the number of places a developer must modify to add
new tracepoints. The current method to add a new tracepoint
into an existing system is to write the trace point macro in the
trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
DECLARE_TRACE, then they must add the same named item into the C file
with the macro DEFINE_TRACE(name) and then add the trace point.
This change cuts out the needing to add the DEFINE_TRACE(name).
Every file that uses the tracepoint must still include the trace/<type>.h
file, but the one C file must also add a define before the including
of that file.
#define CREATE_TRACE_POINTS
#include <trace/mytrace.h>
This will cause the trace/mytrace.h file to also produce the C code
necessary to implement the trace point.
Note, if more than one trace/<type>.h is used to create the C code
it is best to list them all together.
#define CREATE_TRACE_POINTS
#include <trace/foo.h>
#include <trace/bar.h>
#include <trace/fido.h>
Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
the cleaner solution of the define above the includes over my first
design to have the C code include a "special" header.
This patch converts sched, irq and lockdep and skb to use this new
method.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
As discussed in the thread here:
http://marc.info/?l=linux-kernel&m=123964468521142&w=2
Eric W. Biederman observed:
> It looks like some additional bugs have slipped in since last I looked.
>
> set_irq_affinity does this:
> ifdef CONFIG_GENERIC_PENDING_IRQ
> if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
> cpumask_copy(desc->affinity, cpumask);
> desc->chip->set_affinity(irq, cpumask);
> } else {
> desc->status |= IRQ_MOVE_PENDING;
> cpumask_copy(desc->pending_mask, cpumask);
> }
> #else
>
> That IRQ_DISABLED case is a software state and as such it has nothing to
> do with how safe it is to move an irq in process context.
[...]
>
> The only reason we migrate MSIs in interrupt context today is that there
> wasn't infrastructure for support migration both in interrupt context
> and outside of it.
Yes. The idea here was to force the MSI migration to happen in process
context. One of the patches in the series did
disable_irq(dev->irq);
irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
enable_irq(dev->irq);
with the above patch adding irq/manage code check for interrupt disabled
and moving the interrupt in process context.
IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET
code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there
when we eventually submitted the patch upstream. But, looks like I did a
blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.
Below patch fixes this. i.e., revert commit 932775a4ab
and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity
in generic code as set_affinity routines are doing it internally.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Li Shaohua" <shaohua.li@intel.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
Cc: suresh.b.siddha@intel.com
LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Need to free the old cpumask for affinity and pending_mask.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49D18FF0.50707@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce helper functions allowing us to prevent device drivers from
getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume. These functions make it
possible to keep timer interrupts enabled while the "late" suspend and
"early" resume callbacks provided by device drivers are being
executed. In turn, this allows device drivers' "late" suspend and
"early" resume callbacks to sleep, execute ACPI callbacks etc.
The functions introduced here will be used to rework the handling of
interrupts during suspend (hibernation) and resume. Namely,
interrupts will only be disabled on the CPU right before suspending
sysdevs, while device drivers will be prevented from receiving
interrupts, with the help of the new helper function, before their
"late" suspend callbacks run (and analogously during resume).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Delta patch to address the review comments.
- Implement warning when IRQ_WAKE_THREAD is requested and no
thread handler installed
- coding style fixes
Pointed-out-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some devices use devres_request_irq() for to install their interrupt
handler. Add support for threaded interrupts to devres as well.
[tglx - simplified and adapted to latest threadirq version]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add support for threaded interrupt handlers:
A device driver can request that its main interrupt handler runs in a
thread. To achive this the device driver requests the interrupt with
request_threaded_irq() and provides additionally to the handler a
thread function. The handler function is called in hard interrupt
context and needs to check whether the interrupt originated from the
device. If the interrupt originated from the device then the handler
can either return IRQ_HANDLED or IRQ_WAKE_THREAD. IRQ_HANDLED is
returned when no further action is required. IRQ_WAKE_THREAD causes
the genirq code to invoke the threaded (main) handler. When
IRQ_WAKE_THREAD is returned handler must have disabled the interrupt
on the device level. This is mandatory for shared interrupt handlers,
but we need to do it as well for obscure x86 hardware where disabling
an interrupt on the IO_APIC level redirects the interrupt to the
legacy PIC interrupt lines.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Two years migration time is enough. Remove the compability cruft.
Add the deprecated warning in kernel/irq/handle.c because marking
__do_IRQ itself is way too noisy.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: cleanup
The code is only compiled if CONFIG_GENERIC_HARDIRQS=y so another
check for this define in the code is redundant. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Export the setup_irq() and remove_irq() symbols.
I'd like to export these functions since I have timer
code that needs to use setup_irq() early on (too early
for request_irq()), and the same code can also be
compiled as a module.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120559.2926.82371.sendpatchset@rx1.opensource.se>
[ changed to _GPL as these are special APIs deep inside the irq layer. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Modify remove_irq() to match setup_irq().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120551.2926.43942.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new API
This patch adds a remove_irq() function for releasing
interrupts requested with setup_irq().
Without this patch we have no way of releasing such
interrupts since free_irq() today tries to kfree()
the irqaction passed with setup_irq().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120542.2926.56609.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make sure the genirq layer handlers are indeed running handlers
in hardirq context. That is the genirq expectation and doing
anything else is broken.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1236006812.5330.632.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new tracepoints
Add them to the generic IRQ code, that way every architecture
gets these new tracepoints, not just x86.
Using Steve's new 'TRACE_FORMAT', I can get function graph
trace as follows using the original two IRQ tracepoints:
3) | handle_IRQ_event() {
3) | /* (irq_handler_entry) irq=28 handler=eth0 */
3) | e1000_intr_msi() {
3) 2.460 us | __napi_schedule();
3) 9.416 us | }
3) | /* (irq_handler_exit) irq=28 handler=eth0 return=handled */
3) + 22.935 us | }
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
'p' stands for pointer - make it clear in setup_irq() and free_irq()
what kind of pointer it is.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that the 'pp' variable can be eliminated
altogether, and the loop can be cleaned up further.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
- separate out the loop from the actual freeing logic, this wins us
two indentation levels allowing a number of followup prettifications
- turn the WARN_ON() into a more informative WARN().
- clean up the comments and the code flow some more
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
request_irq() calls into proc code via __setup_irq() which is not safe
in an atomic context, so request_irq() can itself use the more
reliable GFP_KERNEL allocation for the action descriptor.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While being at it make every occurrence of 'do_irq_select_affinity'
have the same signature in terms of signedness of the first argument.
Fix this sparse warning:
kernel/irq/manage.c:112:5: warning: symbol 'do_irq_select_affinity' was not declared. Should it be static?
Also rename do_irq_select_affinity() to setup_affinity() - shorter name
and clearer naming.
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Acked-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simplify and make init_kstat_irqs etc more type proof, suggested by
Andrew.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: get correct kstat_irqs [/proc/interrupts] for msi/msi-x etc
need to call clear_kstat_irqs(), so when we reuse that irq_desc,
we get correct kstat in /proc/interrupts.
This makes /proc/interrupts not have <NULL> entries.
Don't need to worry about arch that doesn't support genirq, because they
will not call dynamic_irq_cleanup().
v2: simplify and make clear_kstat_irqs more robust
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Eric Paris reported:
> I have an hp dl785g5 which is unable to successfully run
> 2.6.29-0.66.rc3.fc11.x86_64 or 2.6.29-rc2-next-20090126. During bootup
> (early in userspace daemons starting) I get the below BUG, which quickly
> renders the machine dead. I assume it is because sparse_irq_lock never
> gets released when the BUG kills that task.
Adjust lock sequence when migrating a descriptor with
CONFIG_NUMA_MIGRATE_IRQ_DESC enabled.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the initialization of irq_default_affinity to early_irq_init as
core_initcall is too late.
irq_default_affinity can be used in init_IRQ and potentially timer and
SMP init as well. All of these happen before core_initcall. Moving
the initialization to early_irq_init ensures that it is initialized
before it is used.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In interrupt.h these functions are declared only if
CONFIG_GENERIC_HARDIRQS is set. We should define them under identical
conditions.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide a shared interrupt debug facility under CONFIG_DEBUG_SHIRQ:
it uses the existing irqpoll facilities to iterate through all
registered interrupt handlers and call those which can handle shared
IRQ lines.
This can be handy for suspend/resume debugging: if we call this function
early during resume we can trigger crashes in those drivers which have
incorrect assumptions about when exactly their ISRs will be called
during suspend/resume.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar wrote:
> All non-x86 architectures fail to build:
>
> In file included from /home/mingo/tip/include/linux/random.h:11,
> from /home/mingo/tip/include/linux/stackprotector.h:6,
> from /home/mingo/tip/init/main.c:17:
> /home/mingo/tip/include/linux/irqnr.h:26:63: error: asm/irq_vectors.h: No such file or directory
Do not include asm/irq_vectors.h in generic code - it's not available
on all architectures.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: reduce memory usage.
Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
memory usage bump when NR_CPUS bumped from 128 to 4096:
8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
This is only when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: Reduce memory usage.
This is the second half of the changes to make the irq_desc_ptrs be
variable sized based on nr_cpu_ids. This is done by adding a new
"max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to
return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids.
This necessitated moving the define of MAX_IO_APICS to a separate
file (asm/apicnum.h) so it could be included without the baggage
of the other asm/apicdef.h declarations.
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: allocate irq_desc_ptrs in preparation for making it variable-sized.
This addresses this memory usage bump when NR_CPUS bumped from 128 to 4096:
34816 +229376 264192 +658% irq_desc_ptrs(.data.read_mostly)
The patch is split into two parts, the first simply allocates the
irq_desc_ptrs array. Then next will deal with making it variable.
This is only when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: cleanup WARN msg.
Ingo requested:
> While at it, could you please also convert this to a WARN() construct
> instead? (in a separate commit)
... and it shall be done. ;-)
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: preparation, cleanup, add KERN_INFO printk
Modify references from NR_IRQS to nr_irqs as the later will become
variable-sized based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: fix bug where new irq_desc uses old cpumask pointers which are freed.
As Yinghai pointed out, init_copy_one_irq_desc() copies the old desc to
the new desc overwriting the cpumask pointers. Since the old_desc and
the cpumask pointers are freed, then memory corruption will occur if
these old pointers are used.
Move the allocation of these pointers to after the copy.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Impact: reduce memory usage, use new cpumask API.
Replace the affinity and pending_masks with cpumask_var_t's. This adds
to the significant size reduction done with the SPARSE_IRQS changes.
The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
in the include file so they can be inlined (and optimized out for the
!CONFIG_CPUMASKS_OFFSTACK case.) [Naming chosen to be consistent with
the other init*irq functions, as well as the backwards arg declaration
of "from, to" instead of the more common "to, from" standard.]
Includes a slight change to the declaration of struct irq_desc to embed
the pending_mask within ifdef(CONFIG_SMP) to be consistent with other
references, and some small changes to Xen.
Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: virtualization@lists.osdl.org
Cc: xen-devel@lists.xensource.com
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Impact: clean up sparseirq fallout on random.c
Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS
so we could some #ifdef later if all arch support genirq
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Right now, most of the kernel boot is strictly synchronous, such that
various hardware delays are done sequentially.
In order to make the kernel boot faster, this patch introduces
infrastructure to allow doing some of the initialization steps
asynchronously, which will hide significant portions of the hardware delays
in practice.
In order to not change device order and other similar observables, this
patch does NOT do full parallel initialization.
Rather, it operates more in the way an out of order CPU does; the work may
be done out of order and asynchronous, but the observable effects
(instruction retiring for the CPU) are still done in the original sequence.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
* 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
x86: setup_per_cpu_areas() cleanup
cpumask: fix compile error when CONFIG_NR_CPUS is not defined
cpumask: use alloc_cpumask_var_node where appropriate
cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
x86: use cpumask_var_t in acpi/boot.c
x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
sched: put back some stack hog changes that were undone in kernel/sched.c
x86: enable cpus display of kernel_max and offlined cpus
ia64: cpumask fix for is_affinity_mask_valid()
cpumask: convert RCU implementations, fix
xtensa: define __fls
mn10300: define __fls
m32r: define __fls
h8300: define __fls
frv: define __fls
cris: define __fls
cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
cpumask: zero extra bits in alloc_cpumask_var_node
cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
cpumask: convert mm/
...