The FW_CDEV_IOC_GET_INFO ioctl looks at client->device->config_rom, not
at the local node's config ROM.
We could fix the implementation or the documentation. I believe the way
how it is currently implemented is more useful than the way how it is
currently documented. In fact, libdc1394 uses the ABI already as
implemented, not as documented. Hence let's change the documentation.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Impact: Undo compile breakage for archs with CONFIG_GENERIC_HARDIRQ=n
The threaded interrupt handler patches changed request_irq from extern
to inline. Architectures which do not use the generic irq code still
have request_irq() as a global function and therefor fail to compile.
Keep the extern declaration for CONFIG_GENERIC_HARDIRQ=n
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: cleanup
struct dyn_ftrace::ip has different usages in his lifecycle,
we use union for it. And also for struct dyn_ftrace::flags.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: more accurate timings
The current method of function graph tracing does not take into
account the time spent when a task is not running. This shows functions
that call schedule have increased costs:
3) + 18.664 us | }
------------------------------------------
3) <idle>-0 => kblockd-123
------------------------------------------
3) | finish_task_switch() {
3) 1.441 us | _spin_unlock_irq();
3) 3.966 us | }
3) ! 2959.433 us | }
3) ! 2961.465 us | }
This patch uses the tracepoint in the scheduling context switch to
account for time that has elapsed while a task is scheduled out.
Now we see:
------------------------------------------
3) <idle>-0 => edac-po-1067
------------------------------------------
3) | finish_task_switch() {
3) 0.685 us | _spin_unlock_irq();
3) 2.331 us | }
3) + 41.439 us | }
3) + 42.663 us | }
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This patch move the timestamp from happening in the arch specific
code into the general code. This allows for better control by the tracer
to time manipulation.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Currently inherited from sg.c bsg will submit asynchronous request
at the head-of-the-queue, (using "at_head" set in the call to
blk_execute_rq_nowait()). This is bad in situation where the queues
are full, requests will execute out of order, and can cause
starvation of the first submitted requests.
The sg_io_v4->flags member is used and a bit is allocated to denote the
Q_AT_TAIL. Zero is to queue at_head as before, to be compatible with old
code at the write/read path. SG_IO code path behavior was changed so to
be the same as write/read behavior. SG_IO was very rarely used and breaking
compatibility with it is OK at this stage.
sg_io_hdr at sg.h also has a flags member and uses 3 bits from the first
nibble and one bit from the last nibble. Even though none of these bits
are supported by bsg, The second nibble is allocated for use by bsg. Just
in case.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
In include/linux/genhd.h: Line 335 has a comment that needs to be updated from: /* drivers/block/ll_rw_blk.c */ to /* block/blk-core.c */. Also as of kernel 2.6.16, the function definition for get_blkdev_list was removed from block/genhd.c but the function declaration is still present on line 339. This patch addresses both those fixes, by updating the comment and removing the declaration.
Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The integrity bio allocation needs its own bio_set to avoid violating
the mempool allocation rules and risking deadlocks.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The include/linux/genhd.h file, on line 338-352 declares some function
prototypes in which the comment on line 338 states that the definition of
these prototypes are to be found at drivers/block/genhd.c. The problem is
that genhd.c has been relocated to block/genhd.c. See attached patch to
correct this minor cosmetic typo.
Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
o Reducing overhead by eliminating duplicated fields between structures
o Simplifcation of the code (reduces the code size by a fair bit)
o The locking interface is now the DLM interface itself as proposed
some time ago.
o Fewer lookups of glocks when processing replies from the DLM
o Fewer memory allocations/deallocations for each glock
o Scope to do further optimisations in the future (but this patch is
more than big enough for now!)
Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.
This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Delta patch to address the review comments.
- Implement warning when IRQ_WAKE_THREAD is requested and no
thread handler installed
- coding style fixes
Pointed-out-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some devices use devres_request_irq() for to install their interrupt
handler. Add support for threaded interrupts to devres as well.
[tglx - simplified and adapted to latest threadirq version]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add support for threaded interrupt handlers:
A device driver can request that its main interrupt handler runs in a
thread. To achive this the device driver requests the interrupt with
request_threaded_irq() and provides additionally to the handler a
thread function. The handler function is called in hard interrupt
context and needs to check whether the interrupt originated from the
device. If the interrupt originated from the device then the handler
can either return IRQ_HANDLED or IRQ_WAKE_THREAD. IRQ_HANDLED is
returned when no further action is required. IRQ_WAKE_THREAD causes
the genirq code to invoke the threaded (main) handler. When
IRQ_WAKE_THREAD is returned handler must have disabled the interrupt
on the device level. This is mandatory for shared interrupt handlers,
but we need to do it as well for obscure x86 hardware where disabling
an interrupt on the IO_APIC level redirects the interrupt to the
legacy PIC interrupt lines.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The user can request to enable snooping control through VT-d page table.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This iommu_op can tell if domain have a specific capability, like snooping
control for Intel IOMMU, which can be used by other components of kernel to
adjust the behaviour.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Snooping control enabled IOMMU to guarantee DMA cache coherency and thus reduce
software effort (VMM) in maintaining effective memory type.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.
the ioctl has been already implemented in the
commit: 0a92035674
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
IRQ injection status is either -1 (if there was no CPU found
that should except the interrupt because IRQ was masked or
ioapic was misconfigured or ...) or >= 0 in that case the
number indicates to how many CPUs interrupt was injected.
If the value is 0 it means that the interrupt was coalesced
and probably should be reinjected.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvmclock currently falls apart on machines without constant tsc.
This patch fixes it. Changes:
* keep tsc frequency in a per-cpu variable.
* handle kvmclock update using a new request flag, thus checking
whenever we need an update each time we enter guest context.
* use a cpufreq notifier to track frequency changes and force
kvmclock updates.
* send ipis to kick cpu out of guest context if needed to make
sure the guest doesn't see stale values.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Merge MSI userspace interface with IRQ routing table. Notice the API have been
changed, and using IRQ routing table would be the only interface kvm-userspace
supported.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
IRQ ack notifications assume an identity mapping between pin->gsi,
which might not be the case with, for example, HPET.
Translate before acking.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Kconfig symbols are not available in userspace, and are not stripped by
headers-install. Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently KVM has a static routing from GSI numbers to interrupts (namely,
0-15 are mapped 1:1 to both PIC and IOAPIC, and 16:23 are mapped 1:1 to
the IOAPIC). This is insufficient for several reasons:
- HPET requires non 1:1 mapping for the timer interrupt
- MSIs need a new method to assign interrupt numbers and dispatch them
- ACPI APIC mode needs to be able to reassign the PCI LINK interrupts to the
ioapics
This patch implements an interrupt routing table (as a linked list, but this
can be easily changed) and a userspace interface to replace the table. The
routing table is initialized according to the current hardwired mapping.
Signed-off-by: Avi Kivity <avi@redhat.com>
Allow clients to request notifications when the guest masks or unmasks a
particular irq line. This complements irq ack notifications, as the guest
will not ack an irq line that is masked.
Currently implemented for the ioapic only.
Signed-off-by: Avi Kivity <avi@redhat.com>
MSI is always enabled by default for msi2intx=1. But if msi2intx=0, we
have to disable MSI if guest require to do so.
The patch also discard unnecessary msi2intx judgment if guest want to update
MSI state.
Notice KVM_DEV_IRQ_ASSIGN_MSI_ACTION is a mask which should cover all MSI
related operations, though we only got one for now.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Certain clocks (such as TSC) in older 2.6 guests overaccount for lost
ticks, causing severe time drift. Interrupt reinjection magnifies the
problem.
Provide an option to disable it.
[avi: allow room for expansion in case we want to disable reinjection
of other timers]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
vmap() on guest pages hides those pages from the Linux mm for an extended
(userspace determined) amount of time. Get rid of it.
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Limit KVM_CAP_SET_GUEST_DEBUG only to those archs (currently x86) that
support it. This simplifies user space stub implementations.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Implement KVM_IA64_VCPU_[GS]ET_STACK ioctl calls. This is required
for live migrations.
Patch is based on previous implementation that was part of old
GET/SET_REGS ioctl calls.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
part, controlling the "main switch" and the single-step feature. The
arch specific part adds an x86 interface for intercepting both types of
debug exceptions separately and re-injecting them when the host was not
interested. Moveover, the foundation for guest debugging via debug
registers is layed.
To signal breakpoint events properly back to userland, an arch-specific
data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
contains the PC, the debug exception, and relevant debug registers to
tell debug events properly apart.
The availability of this new interface is signaled by
KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
provided.
Note that both SVM and VTX are supported, but only the latter was tested
yet. Based on the experience with all those VTX corner case, I would be
fairly surprised if SVM will work out of the box.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Present backing device capabilities for MTD character device files to allow
NOMMU mmap to do direct mapping where possible.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Bernd Schmidt <bernd.schmidt@analog.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
ucc_geth: Fix oops when using fixed-link support
dm9000: locking bugfix
net: update dnet.c for bus_id removal
dnet: DNET should depend on HAS_IOMEM
dca: add missing copyright/license headers
nl80211: Check that function pointer != NULL before using it
sungem: missing net_device_ops
be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle
be2net: replenish when posting to rx-queue is starved in out of mem conditions
bas_gigaset: correctly allocate USB interrupt transfer buffer
smsc911x: reset last known duplex and carrier on open
sh_eth: Fix mistake of the address of SH7763
sh_eth: Change handling of IRQ
netns: oops in ip[6]_frag_reasm incrementing stats
net: kfree(napi->skb) => kfree_skb
net: fix sctp breakage
ipv6: fix display of local and remote sit endpoints
net: Document /proc/sys/net/core/netdev_budget
tulip: fix crash on iface up with shirq debug
virtio_net: Make virtio_net support carrier detection
...
Impact: add new debugfs API
With ftrace, some tracers are registered in early initcalls
and attempt to create files on the debugfs filesystem.
Depending on when they are activated, they can try to create their
file at any time. Some checks can be done on the tracing area
but providing a helper to know if debugfs is registered make it
really more easy.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1237759847-21025-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch overloads RINGBUF_TYPE_PADDING to provide a way to discard
events from the ring buffer, for the event-filtering mechanism
introduced in a subsequent patch.
I did the initial version but thanks to Steven Rostedt for adding
the parts that actually made it work. ;-)
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use net_device_ops for usbnet device, and export for use
by other derived drivers.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor changes to queue hashing:
1. Use const on accessor functions
2. Export skb_tx_hash for use in drivers (see ixgbe)
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In two dca files copyright and license headers are missing.
This patch adds them there.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This API is used by the PCI core to rescan a bus and rediscover
newly added devices.
Over time, it is expected that the various PCI hotplug drivers
will migrate to this interface and away from the old
pci_do_scan_bus() interface.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce pci_is_root_bus helper function. This will help make code
more consistent, as well as prevent incorrect assumptions (such as
pci_bus->self == NULL on a root bus, which is not always true).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add or remove a Virtual Function after receiving a Migrate In or Out
Request.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add or remove the Virtual Function when the SR-IOV is enabled or
disabled by the device driver. This can happen anytime rather than
only at the device probe stage.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If a device has the SR-IOV capability, initialize it (set the ARI
Capable Hierarchy in the lowest numbered PF if necessary; calculate
the System Page Size for the VF MMIO, probe the VF Offset, Stride
and BARs). A lock for the VF bus allocation is also initialized if
a PF is the lowest numbered PF.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On the Compaq Evo D510 SFF/CMT, a PCI quirk activated the SMBus device
based on detection of the on-board VGA controller, but the on-board
VGA is disabled if an AGP card is inserted, so look for one of the USB
controllers instead.
Signed-off-by: David O'Shea <dcoshea@hotmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block. Ensure that the architecture back ends don't
have to know about multiple MSI.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Since most of the callers already know whether they have an MSI or
an MSI-X capability, split msi_set_mask_bits() into msi_mask_irq()
and msix_mask_irq(). The only callers which don't (mask_msi_irq()
and unmask_msi_irq()) can share code in msi_set_mask_bit(). This then
becomes the only caller of msix_flush_writes(), so we can inline it.
The flushing read can be to any address that belongs to the device,
so we can eliminate the calculation too.
We can also get rid of maskbits_mask from struct msi_desc and simply
recalculate it on the rare occasion that we need it. The single-bit
'masked' element is replaced by a copy of the 32-bit 'masked' register,
so this patch does not affect the size of msi_desc.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
MSI interrupts have a mask_pos where MSI-X have a mask_base. Use a
transparent union to get rid of some ugly casts.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
By changing from a 5-bit field to a 1-bit field, we free up some bits
that can be used by a later patch. Also rearrange the fields for better
packing on 64-bit platforms (reducing the size of msi_desc from 72 bytes
to 64 bytes).
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current acpi_find_root_bridge_handle() has a assumption that
pci_bus->self is NULL on the root pci bus. But it might not be true on
some platforms. Because of this wrong assumption, current
acpi_find_root_bridge_handle() might cause endless loop. We must check
pci_bus->parent instead.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Implement pm object for the PCI Express port driver in order to use
the new power management framework and reduce the code size.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move mtd_has_partitions() and mtd_has_cmdlinepart() inlines from a
DaVinci-specific driver to the <linux/mtd/partitions.h> header.
Use those to eliminate #ifdefs in two drivers which had their own
definitions of mtd_has_partitions().
Quite a lot of other MTD drivers could benefit from using use one or both
of these to remove #ifdeffery. Maybe some Janitors would like to help.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
dev_queue_xmit() needs to dirty fields "state", "q", "bstats" and "qstats"
On x86_64 arch, they currently span three cache lines, involving more
cache line ping pongs than necessary, making longer holding of queue spinlock.
We can reduce this to one cache line, by grouping all read-mostly fields
at the beginning of structure. (Or should I say, all highly modified fields
at the end :) )
Before patch :
offsetof(struct Qdisc, state)=0x38
offsetof(struct Qdisc, q)=0x48
offsetof(struct Qdisc, bstats)=0x80
offsetof(struct Qdisc, qstats)=0x90
sizeof(struct Qdisc)=0xc8
After patch :
offsetof(struct Qdisc, state)=0x80
offsetof(struct Qdisc, q)=0x88
offsetof(struct Qdisc, bstats)=0xa0
offsetof(struct Qdisc, qstats)=0xac
sizeof(struct Qdisc)=0xc0
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To improve manageability, it would be good to be able to disambiguate routes
added by administrator from those added by DHCP client. The only necessary
kernel change is to add value to rtnetlink include file so iproute2 utility
can use it.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/pci/hotplug/fakephp.c: In function 'pci_rescan_bus':
drivers/pci/hotplug/fakephp.c:271: warning: passing argument 1 of 'pci_bus_assign_resources' discards qualifiers from pointer target type
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
- Rename pci_osc_control_set() to acpi_pci_osc_control_set() according
to the other API names in drivers/acpi/pci_root.c.
- Move _OSC related definitions to include/linux/acpi.h because _OSC
related API is implemented in drivers/acpi/pci_root.c now.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move PCI _OSC management code from drivers/pci/pci-acpi.c to
drivers/acpi/pci_root.c. The benefits are
- We no longer need struct osc_data and its management code (contents
are moved to struct acpi_pci_root). This simplify the code, and we
no longer care about kmalloc() failure.
- We can make pci_acpi_osc_support() be a static function, which is
called only from drivers/acpi/pci_root.c.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for the right services. Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them. Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() in the current code will be used for PME and
hotplug and which of them will be used for AER if all of these
services are configured.
For these reasons, rework the allocation of interrupts for PCI
Express ports so that if MSI-X are enabled, the right vectors will be
used for the right purposes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce new function pci_msix_table_size() returning the size of
the MSI-X table of given PCI device or 0 if the device doesn't
support MSI-X.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCI Express port driver uses 'struct pcie_port_service_id' for
matching port service devices and drivers, but this structure
contains fields that duplicate information from the port device
itself (vendor, device, subvendor, subdevice) and fields that are not
used by any existing port service driver (class, class_mask,
drvier_data). Also, both existing port service drivers (AER and
PCIe HP) don't even use the vendor and device fields for device
matching. Therefore 'struct pcie_port_service_id' can be removed
altogether and the only useful members of it (port_type, service) can
be introduced directly into the port service device and port service
driver structures. That simplifies the code quite a bit and reduces
its size.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The second argument of the ->probe() callback in
struct pcie_port_service_driver is unnecessary and never used.
Remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCI Express port driver should not attempt to register service
devices that require the ability to generate interrupts if generating
interrupts is not possible. Namely, if the port has no interrupt pin
configured and we cannot set up MSI or MSI-X for it, there is no way
it can generate interrupts and in such a case the port services that
rely on interrupts (PME, PCIe HP, AER) should not be enabled for it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI Express port driver extension, as defined by struct
pcie_port_device_ext in portdrv.h, is allocated and initialized, but
never used (it also is never freed). Extend it to hold the PCI Express
port type as well as the port interrupt mode, change its name and use it
to simplify the code in portdrv_core.c .
Additionally, remove the redundant interrupt_mode member of struct
pcie_device defined in include/linux/pcieport_if.h .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Close-to-open cache consistency rules really only require us to flush out
writes on calls to close(), and require us to revalidate attributes on the
very last close of the file.
Currently we appear to be doing a lot of extra attribute revalidation
and cache flushes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This fixes a regression against FreeBSD servers as reported by Tomas
Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
don't ever react to the client closing the socket, and so commit
e06799f958 (SUNRPC: Use shutdown() instead of
close() when disconnecting a TCP socket) causes the setup to hang forever
whenever the client attempts to close and then reconnect.
We break the deadlock by adding a 'linger2' style timeout to the socket,
after which, the client will abort the connection using a TCP 'RST'.
The default timeout is set to 15 seconds. A subsequent patch will put it
under user control by means of a systctl.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When a port's link is down (except to driver restart) and the port is
configured for auto sensing, we try to sense port link type (Ethernet
or InfiniBand) in order to determine how to initialize the port. If
the port type needs to be changed, all mlx4 for the device interfaces
are unregistered and then registered again with the new port
types. Sensing is done with intervals of 3 seconds.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Clean up: Enable the use of const arguments in higher level svc_ APIs
by adding const to the arguments of the helper functions in svc_xprt.h
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add /proc/fs/nfsd/pool_stats to export to userspace various
statistics about the operation of rpc server thread pools.
This patch is based on a forward-ported version of
knfsd-add-pool-thread-stats which has been shipping in the SGI
"Enhanced NFS" product since 2006 and which was previously
posted:
http://article.gmane.org/gmane.linux.nfs/10375
It has also been updated thus:
* moved EXPORT_SYMBOL() to near the function it exports
* made the new struct struct seq_operations const
* used SEQ_START_TOKEN instead of ((void *)1)
* merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
printk in svc_pool_stats_*()" by Harshula Jayasuriya.
* merged fix from SGI PV 964001 "Crash reading pool_stats before
nfsds are started".
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Avoid overloading the CPU scheduler with enormous load averages
when handling high call-rate NFS loads. When the knfsd bottom half
is made aware of an incoming call by the socket layer, it tries to
choose an nfsd thread and wake it up. As long as there are idle
threads, one will be woken up.
If there are lot of nfsd threads (a sensible configuration when
the server is disk-bound or is running an HSM), there will be many
more nfsd threads than CPUs to run them. Under a high call-rate
low service-time workload, the result is that almost every nfsd is
runnable, but only a handful are actually able to run. This situation
causes two significant problems:
1. The CPU scheduler takes over 10% of each CPU, which is robbing
the nfsd threads of valuable CPU time.
2. At a high enough load, the nfsd threads starve userspace threads
of CPU time, to the point where daemons like portmap and rpc.mountd
do not schedule for tens of seconds at a time. Clients attempting
to mount an NFS filesystem timeout at the very first step (opening
a TCP connection to portmap) because portmap cannot wake up from
select() and call accept() in time.
Disclaimer: these effects were observed on a SLES9 kernel, modern
kernels' schedulers may behave more gracefully.
The solution is simple: keep in each svc_pool a counter of the number
of threads which have been woken but have not yet run, and do not wake
any more if that count reaches an arbitrary small threshold.
Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480
regular files in 10841 directories) on the server. That tree is small
enough to fill in the server's RAM so no disk traffic was involved.
This setup gives a sustained call rate in excess of 60000 calls/sec
before being CPU-bound on the server. The server was running 128 nfsds.
Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
taking 5.2%. This patch drops those contributions to 3.0% and 2.2%.
Load average was over 120 before the patch, and 20.9 after.
This patch is a forward-ported version of knfsd-avoid-nfsd-overload
which has been shipping in the SGI "Enhanced NFS" product since 2006.
It has been posted before:
http://article.gmane.org/gmane.linux.nfs/10374
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If a filesystem being written to via NFS returns a short write count
(as opposed to an error) to nfsd, nfsd treats that as a success for
the entire write, rather than the short count that actually succeeded.
For example, given a 8192 byte write, if the underlying filesystem
only writes 4096 bytes, nfsd will ack back to the nfs client that all
8192 bytes were written. The nfs client does have retry logic for
short writes, but this is never called as the client is told the
complete write succeeded.
There are probably other ways it could happen, but in my case it
happened with a fuse (filesystem in userspace) filesystem which can
rather easily have a partial write.
Here is a patch to properly return the short write count to the
client.
Signed-off-by: David Shaw <dshaw@jabberwocky.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
As part of reducing the scope of the client_mutex, and in order to
remove the need for mutexes from the callback code (so that callbacks
can be done as asynchronous rpc calls), move manipulations of the
file_hashtable under the recall_lock.
Update the relevant comments while we're here.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Alexandros Batsakis <batsakis@netapp.com>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
Delegreturn is enough a special case for preprocess_stateid_op to
warrant just open-coding it in delegreturn.
There should be no change in behavior here; we're just reshuffling code.
Thanks to Yang Hongyang for catching a critical typo.
Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The base versions handle constant folding now, none of these headers
are exported to userspace, so the __ prefixed versions are not
necessary.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Impact: build fix
Fix:
arch/x86/kvm/x86.o: In function `dma_debug_add_bus':
(.text+0x0): multiple definition of `dma_debug_add_bus'
dma_debug_add_bus() should be a static inline function.
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20090317120112.GP6159@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: better performance for if branch tracer
Use an array to count the hit and misses of a conditional instead
of using another conditional. This cuts down on saturation of branch
predictions and increases performance of modern pipelined architectures.
Signed-off-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This patch adds a new function called ring_buffer_set_clock that
allows a tracer to assign its own clock source to the buffer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: cleanup
Clean up #ifdefs and replace them with helper functions.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: cleanup/sanitization
Start from a sane state while enabling dma and interrupt-remapping, by
clearing the previous recorded faults and disabling previously
enabled queued invalidation and interrupt-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: new interfaces (not yet used)
Routines for disabling queued invalidation and interrupt remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: interface augmentation (not yet used)
Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Since creating a device node is normally an operation requiring special
privilege, Igor Zhbanov points out that it is surprising (to say the
least) that a client can, for example, create a device node on a
filesystem exported with root_squash.
So, make sure CAP_MKNOD is among the capabilities dropped when an nfsd
thread handles a request from a non-root user.
Reported-by: Igor Zhbanov <izh1979@gmail.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Impact: allow architectures to monitor busses for dma mem leakage
This patch adds checking code to detect if a device has pending DMA
operations when it is about to be unbound from its device driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This adds a function to dump the DMA mappings that the debugging code is
aware of -- either for a single device, or for _all_ devices.
This can be useful for debugging -- sticking a call to it in the DMA
page fault handler, for example, to see if the faulting address _should_
be mapped or not, and hence work out whether it's IOMMU bugs we're
seeing, or driver bugs.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This allows us to send to userspace "regulatory" events.
For now we just send an event when we change regulatory domains.
We also notify userspace when devices are using their own custom
world roaming regulatory domains.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We do this so we can later inform userspace who set the
regulatory domain and provide details of the request.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As my netpoll fix for net doesn't really work for net-next, we
need this update to move the checks into the right place. As it
stands we may pass freed skbs to netpoll_receive_skb.
This patch also introduces a netpoll_rx_on function to avoid GRO
completely if we're invoked through netpoll. This might seem
paranoid but as netpoll may have an external receive hook it's
better to be safe than sorry. I don't think we need this for
2.6.29 though since there's nothing immediately broken by it.
This patch also moves the GRO_* return values to netdevice.h since
VLAN needs them too (I tried to avoid this originally but alas
this seems to be the easiest way out). This fixes a bug in VLAN
where it continued to use the old return value 2 instead of the
correct GRO_DROP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the iptables cluster match. This match can be used
to deploy gateway and back-end load-sharing clusters. The cluster
can be composed of 32 nodes maximum (although I have only tested
this with two nodes, so I cannot tell what is the real scalability
limit of this solution in terms of cluster nodes).
Assuming that all the nodes see all packets (see below for an
example on how to do that if your switch does not allow this), the
cluster match decides if this node has to handle a packet given:
(jhash(source IP) % total_nodes) & node_mask
For related connections, the master conntrack is used. The following
is an example of its use to deploy a gateway cluster composed of two
nodes (where this is the node 1):
iptables -I PREROUTING -t mangle -i eth1 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth1 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth1 \
-m mark ! --mark 0xffff -j DROP
iptables -A PREROUTING -t mangle -i eth2 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth2 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth2 \
-m mark ! --mark 0xffff -j DROP
And the following commands to make all nodes see the same packets:
ip maddr add 01:00:5e:00:01:01 dev eth1
ip maddr add 01:00:5e:00:01:02 dev eth2
arptables -I OUTPUT -o eth1 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:01
arptables -I INPUT -i eth1 --h-length 6 \
--destination-mac 01:00:5e:00:01:01 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
arptables -I OUTPUT -o eth2 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:02
arptables -I INPUT -i eth2 --h-length 6 \
--destination-mac 01:00:5e:00:01:02 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
In the case of TCP connections, pickup facility has to be disabled
to avoid marking TCP ACK packets coming in the reply direction as
valid.
echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
BTW, some final notes:
* This match mangles the skbuff pkt_type in case that it detects
PACKET_MULTICAST for a non-multicast address. This may be done in
a PKTTYPE target for this sole purpose.
* This match supersedes the CLUSTERIP target.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit 784544739a (netfilter: iptables:
lock free counters) broke a number of modules whose rule data referenced
itself. A reallocation would not reestablish the correct references, so
it is best to use a separate struct that does not fall under RCU.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all. This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock. This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock. For now,
epoll remains the only user, but a future patch will use it to protect
f_flags as well.
Cc: Davide Libenzi <davidel@xmailserver.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The results is very unlikely change every so often so we
hardly need to divide again after doing that once for a
connection. Yet, if divide still becomes necessary we
detect that and do the right thing and again settle for
non-divide state. Takes the u16 space which was previously
taken by the plain xmit_size_goal.
This should take care part of the tso vs non-tso difference
we found earlier.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That will cost us divide as is,
so slice the function in two. Also, the only users of the
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable into tcp_sock
at all! The drop of xmit_goal_size currently leaves 16-bit
hole and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with u16
xmit_goal_size_segs to cache the results of the remaining
divide to get that tso on regression).
Bring xmit_goal_size parts into tcp.c
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
On x86_64, its rather unfortunate that "wait_queue_head_t wait"
field of "struct socket" spans two cache lines (assuming a 64
bytes cache line in current cpus)
offsetof(struct socket, wait)=0x30
sizeof(wait_queue_head_t)=0x18
This might explain why Kenny Chang noticed that his multicast workload
was performing bad with 64 bit kernels, since more cache lines ping pongs
were involved.
This litle patch moves "wait" field next "fasync_list" so that both
fields share a single cache line, to speedup sock_def_readable()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/home/rmk/linux-2.6-arm: (23 commits)
[ARM] Fix virtual to physical translation macro corner cases
[ARM] update mach-types
[ARM] 5421/1: ftrace: fix crash due to tracing of __naked functions
MX1 fix include
[ARM] 5419/1: ep93xx: fix build warnings about struct i2c_board_info
[ARM] 5418/1: restore lr before leaving mcount
ARM: OMAP: board-omap3beagle: set i2c-3 to 100kHz
ARM: OMAP: Allow I2C bus driver to be compiled as a module
ARM: OMAP: sched_clock() corrected
ARM: OMAP: Fix compile error if pm.h is included
[ARM] orion5x: pass dram mbus data to xor driver
[ARM] S3C64XX: Fix s3c64xx_setrate_clksrc
[ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/irq.c
[ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/s3c6400-clock.c
[ARM] S3C64XX: Fix USB host clock mux list
[ARM] S3C64XX: Fix name of USB host clock.
[ARM] S3C64XX: Rename IRQ_UHOST to IRQ_USBH
[ARM] S3C64XX: Do gpiolib configuration earlier
[ARM] S3C64XX: Staticise s3c64xx_init_irq_eint()
[ARM] SMDK6410: Declare iodesc table static
...
Stricter gfp_mask might be required for clone allocation.
For example, request-based dm may clone bio in interrupt context
so it has to use GFP_ATOMIC.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...
SUNRPC: xprt_connect() don't abort the task if the transport isn't bound
SUNRPC: Fix an Oops due to socket not set up yet...
Bug 11061, NFS mounts dropped
NFS: Handle -ESTALE error in access()
NLM: Fix GRANT callback address comparison when IPv6 is enabled
NLM: Shrink the IPv4-only version of nlm_cmp_addr()
NFSv3: Fix posix ACL code
NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)
SUNRPC: Tighten up the task locking rules in __rpc_execute()
Impact: cleanup
Add a new vm flag VM_PFN_AT_MMAP to identify a PFNMAP that is
fully mapped with remap_pfn_range. Patch removes the overloading
of VM_INSERTPAGE from the earlier patch.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Nick Piggin <npiggin@suse.de>
LKML-Reference: <20090313233543.GA19909@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I found the PPP subsystem to not work properly when connecting channels
with different speeds to the same bundle.
Problem Description:
As the "ppp_mp_explode" function fragments the sk_buff buffer evenly
among the PPP channels that are connected to a certain PPP unit to
make up a bundle, if we are transmitting using an upper layer protocol
that requires an Ack before sending the next packet (like TCP/IP for
example), we will have a bandwidth bottleneck on the slowest channel
of the bundle.
Let's clarify by an example. Let's consider a scenario where we have
two PPP links making up a bundle: a slow link (10KB/sec) and a fast
link (1000KB/sec) working at the best (full bandwidth). On the top we
have a TCP/IP stack sending a 1000 Bytes sk_buff buffer down to the
PPP subsystem. The "ppp_mp_explode" function will divide the buffer in
two fragments of 500B each (we are neglecting all the headers, crc,
flags etc?.). Before the TCP/IP stack sends out the next buffer, it
will have to wait for the ACK response from the remote peer, so it
will have to wait for both fragments to have been sent over the two
PPP links, received by the remote peer and reconstructed. The
resulting behaviour is that, rather than having a bundle working
@1010KB/sec (the sum of the channels bandwidths), we'll have a bundle
working @20KB/sec (the double of the slowest channels bandwidth).
Problem Solution:
The problem has been solved by redesigning the "ppp_mp_explode"
function in such a way to make it split the sk_buff buffer according
to the speeds of the underlying PPP channels (the speeds of the serial
interfaces respectively attached to the PPP channels). Referring to
the above example, the redesigned "ppp_mp_explode" function will now
divide the 1000 Bytes buffer into two fragments whose sizes are set
according to the speeds of the channels where they are going to be
sent on (e.g . 10 Byets on 10KB/sec channel and 990 Bytes on
1000KB/sec channel). The reworked function grants the same
performances of the original one in optimal working conditions (i.e. a
bundle made up of PPP links all working at the same speed), while
greatly improving performances on the bundles made up of channels
working at different speeds.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It closes a race in phy_stop_machine when reprogramming of phy_timer
(from phy_state_machine) happens between del_timer_sync and cancel_work_sync.
Without this change it could lead to crash if phy_device would be freed after
phy_stop_machine (timer would fire and schedule freed work).
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since bsg.h has recently been added to the list of kernel
headers that should be exported to the user space, this
attachment makes bsg.h more user space "friendly".
Specifically autotools dislike headers that don't compile
freestanding and bsg.h's use of __u32 types (and friends)
are not standard C (C90 or C99). The inclusion of
linux/types.h fixes that.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
dma_map_sg could return a value different to 'nents' argument of
dma_map_sg so the ide stack needs to save it for the later usage
(e.g. for_each_sg).
The ide stack also needs to save the original sg_nents value for
pci_unmap_sg.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
[bart: backport to Linus' tree]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This adds support to provide Fiber Channel over Ethernet (FCoE) offload
through net_device's net_device_ops struct. The offload through net_device
for FCoE is enabled in kernel as built-in or module driver.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Define feature flags for FCoE offloads.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reclaim 8 upper bits of netdev->features from GSO.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds eth type ETH_P_FCOE for Fibre Channel over Ethernet (FCoE),
consequently, the ETH_P_FCOE from fc_fcoe.h and fcoe skb->protocol
is not set as ETH_P_FCOE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Impact: new feature
This adds the generic support for syscalls tracing. This is
currently exploited through a devoted tracer but other tracing
engines can use it. (They just have to play with
{start,stop}_ftrace_syscalls() and use the display callbacks
unless they want to override them.)
The syscalls prototypes definitions are abused here to steal
some metadata informations:
- syscall name, param types, param names, number of params
The syscall addr is not directly saved during this definition
because we don't know if its prototype is available in the
namespace. But we don't really need it. The arch has just to
build a function able to resolve the syscall number to its
metadata struct.
The current tracer prints the syscall names, parameters names
and values (and their types optionally). Currently the value is
a raw hex but higher level values diplaying is on my TODO list.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, potential bugfix
Not sure what changed to expose this, but clearly that numa_node_id()
doesn't belong in mmzone.h (the inline in gfp.h is probably overkill, too).
In file included from include/linux/topology.h:34,
from arch/x86/mm/numa.c:2:
/home/rusty/patches-cpumask/linux-2.6/arch/x86/include/asm/topology.h:64:1: warning: "numa_node_id" redefined
In file included from include/linux/topology.h:32,
from arch/x86/mm/numa.c:2:
include/linux/mmzone.h:770:1: warning: this is the location of the previous definition
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Mike Travis <travis@sgi.com>
LKML-Reference: <200903132343.37661.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: documentation
struct irqaction is not documented. Add kernel doc comments and add
interrupt.h to the genirq docbook.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: save memory
The struct dyn_ftrace table is very large, this patch will save
about 50%.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2C9F.8020009@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, no code changed
Remove an ugly #ifdef CONFIG_SMP from panic(), by providing
an smp_send_stop() wrapper on UP too.
LKML-Reference: <49B91A7E.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide basic callbacks to do syscall tracing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-2-git-send-email-fweisbec@gmail.com>
[ simplified it to a trace_printk() for now. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
node_to_cpumask (and the blecherous node_to_cpumask_ptr which
contained a declaration) are replaced now everyone implements
cpumask_of_node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: fix false positive PAT warnings - also fix VirtalBox hang
Use of vma->vm_pgoff to identify the pfnmaps that are fully
mapped at mmap time is broken. vm_pgoff is set by generic mmap
code even for cases where drivers are setting up the mappings
at the fault time.
The problem was originally reported here:
http://marc.info/?l=linux-kernel&m=123383810628583&w=2
Change is_linear_pfn_mapping logic to overload VM_INSERTPAGE
flag along with VM_PFNMAP to mean full PFNMAP setup at mmap
time.
Problem also tracked at:
http://bugzilla.kernel.org/show_bug.cgi?id=12800
Reported-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha>@intel.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: "ebiederm@xmission.com" <ebiederm@xmission.com>
Cc: <stable@kernel.org> # only for 2.6.29.1, not .28
LKML-Reference: <20090313004527.GA7176@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create a 'softirq_to_name' array, which is indexed by softirq #, so
that we can easily convert between the softirq index # and its name, in
order to get more meaningful output messages.
LKML-Reference: <20090312183336.GB3352@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: fix callsites with dynamic format strings
Since its new binary implementation, trace_printk() internally uses static
containers for the format strings on each callsites. But the value is
assigned once at build time, which means that it can't take dynamic
formats.
So this patch unearthes the raw trace_printk implementation for the callers
that will need trace_printk to be able to carry these dynamic format
strings. The trace_printk() macro will use the appropriate implementation
for each callsite. Most of the time however, the binary implementation will
still be used.
The other impact of this patch is that mmiotrace_printk() will use the old
implementation because it calls the low level trace_vprintk and we can't
guess here whether the format passed in it is dynamic or not.
Some parts of this patch have been written by Steven Rostedt (most notably
the part that chooses the appropriate implementation for each callsites).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: cleanup
The naming clashes with upcoming softirq tracepoints, so rename the
APIs to lockdep_*().
Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is a fix for the following crash observed in 2.6.29-rc3:
http://lkml.org/lkml/2009/1/29/150
On ARM it doesn't make sense to trace a naked function because then
mcount is called without stack and frame pointer being set up and there
is no chance to restore the lr register to the value before mcount was
called.
Reported-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Tested-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Cc: Abhishek Sagar <sagar.abhishek@gmail.com>
Cc: Steven Rostedt <rostedt@home.goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Modify remove_irq() to match setup_irq().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120551.2926.43942.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new API
This patch adds a remove_irq() function for releasing
interrupts requested with setup_irq().
Without this patch we have no way of releasing such
interrupts since free_irq() today tries to kfree()
the irqaction passed with setup_irq().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120542.2926.56609.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
mm_cpumask() now and won't see a difference as the changes roll into
linux-next.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
tsk_cpumask() now and won't see a difference as the changes roll into
linux-next.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Provide an api to attempt to load any necessary kernel RPC
client transport module automatically. By convention, the
desired module name is "xprt"+"transport name". For example,
when NFS mounting with "-o proto=rdma", attempt to load the
"xprtrdma" module.
Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The following patch is a combination of a patch by myself and Peter
Staubach.
Trond: If we allow other processes to dirty pages while a process is doing
a consistency sync to disk, we can end up never making progress.
Peter: Attached is a patch which addresses a continuing problem with
the NFS client generating out of order WRITE requests. While
this is compliant with all of the current protocol
specifications, there are servers in the market which can not
handle out of order WRITE requests very well. Also, this may
lead to sub-optimal block allocations in the underlying file
system on the server. This may cause the read throughputs to
be reduced when reading the file from the server.
Peter: There has been a lot of work recently done to address out of
order issues on a systemic level. However, the NFS client is
still susceptible to the problem. Out of order WRITE
requests can occur when pdflush is in the middle of writing
out pages while the process dirtying the pages calls
generic_file_buffered_write which calls
generic_perform_write which calls
balance_dirty_pages_rate_limited which ends up calling
writeback_inodes which ends up calling back into the NFS
client to writes out dirty pages for the same file that
pdflush happens to be working with.
Signed-off-by: Peter Staubach <staubach@redhat.com>
[modification by Trond to merge the two similar patches]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Certain asynchronous operations such as write() do not expect
(or care) that other metadata such as the file owner, mode, acls, ...
change. All they want to do is update and/or check the change attribute,
ctime, and mtime.
By skipping the file owner and group update, we also avoid having to do a
potential idmapper upcall for these asynchronous RPC calls.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
There is no point in using anything other than umode_t, since we copy the
content pretty much directly into inode->i_mode.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We don't need the bitmap[] field anymore, since the 'valid' field tells us
all we need to know about which attributes were filled in...
Also move the pre-op attributes in order to improve the structure packing.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, filling struct nfs_fattr is more or less an all or nothing
operation, since NFSv2 and NFSv3 have only mandatory attributes.
In NFSv4, some attributes are optional, and so we may simply not be able to
fill in those fields. Furthermore, NFSv4 allows you to specify which
attributes you are interested in retrieving, thus permitting you to
optimise away retrieval of attributes that you know will no change...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The WM8400 is a highly integrated audio CODEC and power management unit
intended for mobile multimedia application. This driver supports the
primary audio CODEC features, including:
- 1W speaker driver
- Fully differential headphone output
- Up to 4 differential microphone inputs
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Clean up/micro-optimatization: Make the AF_INET-only version of
nlm_cmp_addr() smaller. This matches the style of
nlm_privileged_requester(), and makes the AF_INET-only version of
nlm_cmp_addr() nearly the same size as it was before IPv6 support.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix a memory leak due to allocation in the XDR layer. In cases where the
RPC call needs to be retransmitted, we end up allocating new pages without
clearing the old ones. Fix this by moving the allocation into
nfs3_proc_setacls().
Also fix an issue discovered by Kevin Rudd, whereby the amount of memory
reserved for the acls in the xdr_buf->head was miscalculated, and causing
corruption.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Impact: clean up
There existed a lot of <space><tab>'s in the tracing code. This
patch removes them.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up / comments
Kosaki Motohiro asked about an explanation to the TRACE_EVENT macro.
Ingo Molnar replied with a nice description.
This patch takes the description that Ingo wrote (with some slight
modifications) and adds it to the tracepoint.h file.
Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up
In trying to stay consistant with the C style format in the TRACE_EVENT
macro, it makes more sense to do the printk after the assigning of
the variables.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: code reorganization
Separate out embedding first chunk setup helper from x86 embedding
first chunk allocator and put it in mm/percpu.c. This will be used by
the default percpu first chunk allocator and possibly by other archs.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: cleanup, more flexibility for first chunk init
Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
This restriction stemmed from implementation detail and made things a
bit less intuitive. This patch allows @dyn_size to be specified
regardless of @unit_size and swaps the positions of @dyn_size and
@unit_size so that the parameter order makes more sense (static,
reserved and dyn sizes followed by enclosing unit_size).
While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: clean up
The TRACE_EVENT_FORMAT macro is no longer used by trace points
and only the DECLARE_TRACE, TRACE_FORMAT or TRACE_EVENT macros should
be used by them. Although the TRACE_EVENT_FORMAT macro is still used
by the internal tracing utility, it should not be used in core
kernel code.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up and enhancement
The TRACE_EVENT_FORMAT macro looks quite ugly and is limited in its
ability to save data as well as to print the record out. Working with
Ingo Molnar, we came up with a new format that is much more pleasing to
the eye of C developers. This new macro is more C style than the old
macro, and is more obvious to what it does.
Here's the example. The only updated macro in this patch is the
sched_switch trace point.
The old method looked like this:
TRACE_EVENT_FORMAT(sched_switch,
TP_PROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TP_ARGS(rq, prev, next),
TP_FMT("task %s:%d ==> %s:%d",
prev->comm, prev->pid, next->comm, next->pid),
TRACE_STRUCT(
TRACE_FIELD(pid_t, prev_pid, prev->pid)
TRACE_FIELD(int, prev_prio, prev->prio)
TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
next_comm,
TP_CMD(memcpy(TRACE_ENTRY->next_comm,
next->comm,
TASK_COMM_LEN)))
TRACE_FIELD(pid_t, next_pid, next->pid)
TRACE_FIELD(int, next_prio, next->prio)
),
TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
);
The above method is hard to read and requires two format fields.
The new method:
/*
* Tracepoint for task switches, performed by the scheduler:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_switch,
TP_PROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TP_ARGS(rq, prev, next),
TP_STRUCT__entry(
__array( char, prev_comm, TASK_COMM_LEN )
__field( pid_t, prev_pid )
__field( int, prev_prio )
__array( char, next_comm, TASK_COMM_LEN )
__field( pid_t, next_pid )
__field( int, next_prio )
),
TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
__entry->next_comm, __entry->next_pid, __entry->next_prio),
TP_fast_assign(
memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
__entry->prev_pid = prev->pid;
__entry->prev_prio = prev->prio;
memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
__entry->next_pid = next->pid;
__entry->next_prio = next->prio;
)
);
This macro is called TRACE_EVENT, it is broken up into 5 parts:
TP_PROTO: the proto type of the trace point
TP_ARGS: the arguments of the trace point
TP_STRUCT_entry: the structure layout of the entry in the ring buffer
TP_printk: the printk format
TP_fast_assign: the method used to write the entry into the ring buffer
The structure is the definition of how the event will be saved in the
ring buffer. The printk is used by the internal tracing in case of
an oops, and the kernel needs to print out the format of the record
to the console. This the TP_printk gives a means to show the records
in a human readable format. It is also used to print out the data
from the trace file.
The TP_fast_assign is executed directly. It is basically like a C function,
where the __entry is the handle to the record.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up
The macros TPPROTO, TPARGS, TPFMT, TPRAWFMT, and TPCMD all look a bit
ugly. This patch adds an underscore to their names.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
[randy.dunlap@oracle.com: don't use bus_id]
[dtor@mail.ru: locking and other fixups]
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
This reverts commit e088e4c9cd.
Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.
Course of action:
- Find out the remaining causes of overheating, and fix them
if possible. ACPI should be doing the right thing automatically.
If it isn't, we need to fix that.
- mark p4-clockmod ui as deprecated
- try again with the removal in six months.
It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
Signed-off-by: Dave Jones <davej@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
p54: fix race condition in memory management
cfg80211: test before subtraction on unsigned
iwlwifi: fix error flow in iwl*_pci_probe
rt2x00 : more devices to rt73usb.c
rt2x00 : more devices to rt2500usb.c
bonding: Fix device passed into ->ndo_neigh_setup().
vlan: Fix vlan-in-vlan crashes.
net: Fix missing dev->neigh_setup in register_netdevice().
tmspci: fix request_irq race
pkt_sched: act_police: Fix a rate estimator test.
tg3: Fix 5906 link problems
SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
IPv6: add "disable" module parameter support to ipv6.ko
sungem: another error printed one too early
aoe: error printed 1 too early
net pcmcia: worklimit reaches -1
net: more timeouts that reach -1
net: fix tokenring license
dm9601: new vendor/product IDs
netlink: invert error code in netlink_set_err()
...
Impact: micro-optimization
trace_printk() does this unconditionally:
trace_printk_fmt = fmt;
Where trace_printk_fmt is an entry into a global array. This is
very SMP-unfriendly.
So only write it once per bootup.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds a generic driver for rotary encoders connected to GPIO
pins of a system. It relies on gpiolib and generic hardware irqs. The
documentation that also comes with this patch explains the concept and
how to use the driver.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmatest: fix use after free in dmatest_exit
ipu_idmac: fix spinlock type
iop-adma, mv_xor: fix mem leak on self-test setup failure
fsldma: fix off by one in dma_halt
I/OAT: fail self-test if callback test reaches timeout
I/OAT: update driver version and copyright dates
I/OAT: list usage cleanup
I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
I/OAT: cancel watchdog before dma remove
I/OAT: fail initialization on zero channels detection
I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
dmaengine: update kerneldoc
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ata: add CFA specific identify data words
remove stale comment from <linux/hdreg.h>
AT91: initialize Compact Flash on AT91SAM9263 cpu
ide: add at91_ide driver
ide: allow to wrap interrupt handler
ide-iops: fix odd-length ATAPI PIO transfers
ide: NULL noise: drivers/ide/ide-*.c
ide: expiry() returns int, negative expiry() return values won't be noticed
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Don't trust current capacity values in identify words 57-58
libata: make sure port is thawed when skipping resets
sata_nv: fix module parameter description
ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
libata: don't use on-stack sense buffer
libata: align ap->sector_buf
libata: fix dma_unmap_sg misuse
libata: change drive ready wait after hard reset to 5s
Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.
Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Impact: faster and lighter tracing
Now that we have trace_bprintk() which is faster and consume lesser
memory than trace_printk() and has the same purpose, we can now drop
the old implementation in favour of the binary one from trace_bprintk(),
which means we move all the implementation of trace_bprintk() to
trace_printk(), so the Api doesn't change except that we must now use
trace_seq_bprintk() to print the TRACE_PRINT entries.
Some changes result of this:
- Previously, trace_bprintk depended of a single tracer and couldn't
work without. This tracer has been dropped and the whole implementation
of trace_printk() (like the module formats management) is now integrated
in the tracing core (comes with CONFIG_TRACING), though we keep the file
trace_printk (previously trace_bprintk.c) where we can find the module
management. Thus we don't overflow trace.c
- changes some parts to use trace_seq_bprintk() to print TRACE_PRINT entries.
- change a bit trace_printk/trace_vprintk macros to support non-builtin formats
constants, and fix 'const' qualifiers warnings. But this is all transparent for
developers.
- etc...
V2:
- Rebase against last changes
- Fix mispell on the changelog
V3:
- Rebase against last changes (moving trace_printk() to kernel.h)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add a generic printk() for tracing, like trace_printk()
trace_bprintk() uses the infrastructure to record events on ring_buffer.
[ fweisbec@gmail.com: ported to latest -tip, made it work if
!CONFIG_MODULES, never free the format strings from modules
because we can't keep track of them and conditionnaly create
the ftrace format strings section (reported by Steven Rostedt) ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: save on memory for tracing
Current tracers are typically using a struct(like struct ftrace_entry,
struct ctx_switch_entry, struct special_entr etc...)to record a binary
event. These structs can only record a their own kind of events.
A new kind of tracer need a new struct and a lot of code too handle it.
So we need a generic binary record for events. This infrastructure
is for this purpose.
[fweisbec@gmail.com: rebase against latest -tip, make it safe while sched
tracing as reported by Steven Rostedt]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new APIs for binary trace printk infrastructure
vbin_printf(): write args to binary buffer, string is copied
when "%s" is occurred.
bstr_printf(): read from binary buffer for args and format a string
[fweisbec@gmail.com: rebase]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1236356510-8381-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is an architecture independant synchronization around kernel text
modifications through use of a global mutex.
A mutex has been chosen so that kprobes, the main user of this, can sleep
during memory allocation between the memory read of the instructions it
must replace and the memory write of the breakpoint.
Other user of this interface: immediate values.
Paravirt and alternatives are always done when SMP is inactive, so there
is no need to use locks.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <49B142D8.7020601@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix relocation overflow during module load
x86_64 uses 32bit relocations for symbol access and static percpu
symbols whether in core or modules must be inside 2GB of the percpu
segement base which the dynamic percpu allocator doesn't guarantee.
This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
first chunk so that module percpu areas are always allocated from the
first chunk which is always inside the relocatable range.
This problem exists for any percpu allocator but is easily triggered
when using the embedding allocator because the second chunk is located
beyond 2GB on it.
This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
that it only indicates the size of the area to reserve for dynamic
allocation as static and dynamic areas can be separate. New
PERCPU_DYNAMIC_RESERVED is increased by 4k for both 32 and 64bits as
the reserved area separation eats away some allocatable space and
having slightly more headroom (currently between 4 and 8k after
minimal boot sans module area) makes sense for common case
performance.
x86_32 can address anywhere from anywhere and doesn't need reserving.
Mike Galbraith first reported the problem first and bisected it to the
embedding percpu allocator commit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Impact: add reserved allocation functionality and use it for module
percpu variables
This patch implements reserved allocation from the first chunk. When
setting up the first chunk, arch can ask to set aside certain number
of bytes right after the core static area which is available only
through a separate reserved allocator. This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module perpcu symbols are inside
the relocatable range.
If reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation. If the first chunk also
includes piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation. The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.
If arch doesn't setup reserved area, reserved allocation is handled
like any other allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: argument semantic cleanup
In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant
auto-sizing. It's okay for @unit_size as 0 doesn't make sense but 0
dynamic reserve size is valid. Alos, if arch @dyn_size is calculated
from other parameters, it might end up passing in 0 @dyn_size and
malfunction when the size is automatically adjusted.
This patch makes both @unit_size and @dyn_size ssize_t and use -1 for
auto sizing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: cosmetic, preparation for future changes
Make the following renames in pcpur_setup_first_chunk() in preparation
for future changes.
* s/free_size/dyn_size/
* s/static_vm/first_vm/
* s/static_chunk/schunk/
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: cleaup
Make the following cleanups.
* There isn't much arch-specific about PERCPU_MODULE_RESERVE. Always
define it whether arch overrides PERCPU_ENOUGH_ROOM or not.
* blackfin overrides PERCPU_ENOUGH_ROOM to align static area size. Do
it by default.
* percpu allocation sizes doesn't have much to do with the page size.
Don't use PAGE_SHIFT in their definition.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bryan Wu <cooloney@kernel.org>
This adds SSB functionality to register a fallback SPROM image from the
architecture setup code.
Weird architectures exist that have half-assed SSB devices without SPROM attached to
their PCI busses. The architecture can register a fallback SPROM image that is
used if no SPROM is found on the SSB device.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Florian Fainelli <florian@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Declare CFA specific identify data words 162 and 163 for future use.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
[bart: update patch summary/description]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Impact: cleanup
The functions tracing_start/tracing_stop have been moved to kernel.h.
These are not the functions a developer most likely wants to use
when they want to insert a place to stop tracing and restart it from
user space.
tracing_start/tracing_stop was created to work with things like
suspend to ram, where even calling smp_processor_id() can crash the
system. The tracing_start/tracing_stop was used to stop the tracer from
doing anything. These are still light weight functions, but add a bit
more overhead to be able to stop the tracers. They also have no interface
back to userland. That is, if the kernel calls tracing_stop, userland
can not start tracing.
What a developer most likely wants to use is tracing_on/tracing_off.
These are very light weight functions (simply sets or clears a bit).
These functions just stop recording into the ring buffer. The tracers
don't even know that this happens except that they would receive NULL
from the ring_buffer_lock_reserve function.
Also, there's a way for the user land to enable or disable this bit.
In debugfs/tracing/tracing_on, a user may echo "0" (same as tracing_off())
or echo "1" (same as tracing_on()) into this file. This becomes handy when
a kernel developer is debugging and wants tracing to turn off when it
hits an anomaly. Then the developer can examine the trace, and restart
tracing if they want to try again (echo 1 > tracing_on).
This patch moves the prototypes for tracing_on/tracing_off to kernel.h
and comments their use, so that a kernel developer will know how
to use them.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
ap->sector_buf is used as DMA target and should at least be aligned on
cacheline. This caused problems on some embedded machines.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata passes the returned value of dma_map_sg() to
dma_unmap_sg(),which is the misuse of dma_unmap_sg().
DMA-mapping.txt says:
To unmap a scatterlist, just call:
pci_unmap_sg(pdev, sglist, nents, direction);
Again, make sure DMA activity has already finished.
PLEASE NOTE: The 'nents' argument to the pci_unmap_sg call must be
the _same_ one you passed into the pci_map_sg call,
it should _NOT_ be the 'count' value _returned_ from the
pci_map_sg call.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This fixes problems during resume with drives that take longer than 1s to
be ready. The ATA-6 spec appears to allow 5 seconds for a drive to be
ready.
On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.
Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which also
times out, then an IDENTIFY command is sent to the drive which times out,
and finally the error handler will try to send another hard reset which
will finally get things working.
Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Impact: decrease hangs risks with the graph tracer on slow systems
Since the function graph tracer can spend too much time on timer
interrupts, it's better now to use the more lightweight local
clock. Anyway, the function graph traces are more reliable on a
per cpu trace.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49af243d.06e9300a.53ad.ffff840c@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make common utility functions such as trace_printk() and
tracing_start()/tracing_stop() generally available to kernel
code.
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Use a more generic name - this also allows the prototype to move
to kernel.h and be generally available to kernel developers who
want to do some quick tracing.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As analyzed by Patrick McHardy, vlan needs to reset it's
netdev_ops pointer in it's ->init() function but this
leaves the compat method pointers stale.
Add a netdev_resync_ops() and call it from the vlan code.
Any other driver which changes ->netdev_ops after register_netdevice()
will need to call this new function after doing so too.
With help from Patrick McHardy.
Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is possible to do just about everything with the arp table
from user space except treat an entry like you are using it. To that end
implement and a flag NTF_USE that when set in a netwlink update request
treats the neighbour table entry like the kernel does on the output path.
This allows user space applications to share the kernel's arp cache.
Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current "comp" crypto interface supports one-shot (de)compression only,
i.e. the whole data buffer to be (de)compressed must be passed at once, and
the whole (de)compressed data buffer will be received at once.
In several use-cases (e.g. compressed file systems that store files in big
compressed blocks), this workflow is not suitable.
Furthermore, the "comp" type doesn't provide for the configuration of
(de)compression parameters, and always allocates workspace memory for both
compression and decompression, which may waste memory.
To solve this, add a "pcomp" partial (de)compression interface that provides
the following operations:
- crypto_compress_{init,update,final}() for compression,
- crypto_decompress_{init,update,final}() for decompression,
- crypto_{,de}compress_setup(), to configure (de)compression parameters
(incl. allocating workspace memory).
The (de)compression methods take a struct comp_request, which was mimicked
after the z_stream object in zlib, and contains buffer pointer and length
pairs for input and output.
The setup methods take an opaque parameter pointer and length pair. Parameters
are supposed to be encoded using netlink attributes, whose meanings depend on
the actual (name of the) (de)compression algorithm.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ring_buffer_read_page was broken if it were to only copy part
of the page. This patch fixes that up as well as adds a parameter
to allow a length field, in order to only copy part of the buffer page.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: don't allow setuid to succeed if the user does not have rt bandwidth
sched_rt: don't start timer when rt bandwidth disabled
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Teach RCU that idle task is not quiscent state at boot
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
fix warning in io_mapping_map_wc()
x86: i915 needs pgprot_writecombine() and is_io_mapping_possible()
The current definition of CALLER_ADDRx isn't suitable for all platforms.
E.g. for ARM __builtin_return_address(N) doesn't work for N > 0 and
AFAIK for powerpc there are no frame pointers needed to have a working
__builtin_return_address. This patch allows defining the CALLER_ADDRx
macros in <asm/ftrace.h> and let these take precedence.
Because now <asm/ftrace.h> is included unconditionally in
<linux/ftrace.h> all archs that don't already had this include get an
empty one for free.
Signed-off-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Fix skbuff.h kernel-doc for timestamps: must include "struct" keyword,
otherwise there are kernel-doc errors:
Error(linux-next-20090227//include/linux/skbuff.h:161): cannot understand prototype: 'struct skb_shared_hwtstamps '
Error(linux-next-20090227//include/linux/skbuff.h:177): cannot understand prototype: 'union skb_shared_tx '
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the device to give the driver RX data with reorder information.
When that is done, the device will indicate the driver if a packet has
to be held in a (sorted) queue. It will also tell the driver when held
packets have to be released to the OS.
This is done to improve the WiMAX-protocol level retransmission
support when missing frames are detected.
The code docs provide details about the implementation.
In general, this just hooks into the RX path in rx.c; if a packet with
the reorder bit in the RX header is detected, the reorder information
in the header is extracted and one of the four main reorder operations
are executed. In one case (queue) no packet will be delivered to the
networking stack, just queued, whereas in the others (reset, update_ws
and queue_update_ws), queued packet might be delivered depending on
the window start for the specific queue.
The modifications to files other than rx.c are:
- control.c: during device initialization, enable reordering support
if the rx_reorder_disabled module parameter is not enabled
- driver.c: expose a rx_reorder_disable module parameter and call
i2400m_rx_setup/release() to initialize/shutdown RX reorder
support.
- i2400m.h: introduce members in 'struct i2400m' needed for
implementing reorder support.
- linux/i2400m.h: introduce TLVs, commands and constant definitions
related to RX reorder
Last but not least, the rx reorder code includes an small circular log
where the last N reorder operations are recorded to be displayed in
case of inconsistency. Otherwise diagnosing issues would be almost
impossible.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer i2400m firmwares (>= v1.4) extend the data RX protocol so that
each packet has a 16 byte header. This header is mainly used to
implement host reordeing (which is addressed in later commits).
However, this header also allows us to overwrite it (once data has
been extracted) with an Ethernet header and deliver to the networking
stack without having to reallocate the skb (as it happened in fw <=
v1.3) to make room for it.
- control.c: indicate the device [dev_initialize()] that the driver
wants to use the extended data RX protocol. Also involves adding the
definition of the needed data types in include/linux/wimax/i2400m.h.
- rx.c: handle the new payload type for the extended RX data
protocol. Prepares the skb for delivery to
netdev.c:i2400m_net_erx().
- netdev.c: Introduce i2400m_net_erx() that adds the fake ethernet
address to a prepared skb and delivers it to the networking
stack.
- cleanup: in most instances in rx.c, the variable 'single' was
renamed to 'single_last' for it better conveys its meaning.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For power saving reasons, WiMAX links can be put in idle mode while
connected after a certain time of the link not being used for tx or
rx. In this mode, the device pages the base-station regularly and when
data is ready to be transmitted, the link is revived.
This patch allows the user to control the time the device has to be
idle before it decides to go to idle mode from a sysfs
interace.
It also updates the initialization code to acknowledge the module
variable 'idle_mode_disabled' when the firmware is a newer version
(upcoming 1.4 vs 2.6.29's v1.3).
The method for setting the idle mode timeout in the older firmwares is
much more limited and can be only done at initialization time. Thus,
the sysfs file will return -ENOSYS on older ones.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also fixes insignificant bug that would cause sending of stale
SACK block (would occur in some corner cases).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: standardize IO on cached ops
On modern CPUs it is almost always a bad idea to use non-temporal stores,
as the regression in this commit has shown it:
30d697f: x86: fix performance regression in write() syscall
The kernel simply has no good information about whether using non-temporal
stores is a good idea or not - and trying to add heuristics only increases
complexity and inserts fragility.
The regression on cached write()s took very long to be found - over two
years. So dont take any chances and let the hardware decide how it makes
use of its caches.
The only exception is drivers/gpu/drm/i915/i915_gem.c: there were we are
absolutely sure that another entity (the GPU) will pick up the dirty
data immediately and that the CPU will not touch that data before the
GPU will.
Also, keep the _nocache() primitives to make it easier for people to
experiment with these details. There may be more clear-cut cases where
non-cached copies can be used, outside of filemap.c.
Cc: Salman Qazi <sqazi@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 17581ad812.
Sitsofe Wheeler reported that /dev/dri/card0 is MIA on his EeePC 900
and bisected it to this commit.
Graphics card is an i915 in an EeePC 900:
00:02.0 VGA compatible controller [0300]:
Intel Corporation Mobile 915GM/GMS/910GML
Express Graphics Controller [8086:2592] (rev 04)
( Most likely the ioremap() of the driver failed and hence the card
did not initialize. )
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The DCB netlink interface is required for building the userspace tools
available at e1000.sourceforge.net
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) add an include for <linux/types.h>
2) change dcbmsg.dcb_family from unsigned char to __u8 to be more
consistent with use of kernel types
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style
faster tracing for the sched subsystem trace points.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: enable DMAR by default
xen: disable interrupts early, as start_kernel expects
gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_t
gpu/drm, x86, PAT: Handle io_mapping_create_wc() errors in a clean way
x86, Voyager: fix compile by lifting the degeneracy of phys_cpu_present_map
x86, doc: fix references to Documentation/x86/i386/boot.txt
free_uid() and free_user_ns() are corecursive when CONFIG_USER_SCHED=n,
but free_user_ns() is called from free_uid() by way of uid_hash_remove(),
which requires uidhash_lock to be held. free_user_ns() then calls
free_uid() to complete the destruction.
Fix this by deferring the destruction of the user_namespace.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The TX/RX packet counters are needed to fill in RADIUS Accounting
attributes Acct-Output-Packets and Acct-Input-Packets. We already
collect the needed information, but only the TX/RX bytes were
previously exposed through nl80211. Allow applications to fetch the
packet counters, too, to provide more complete support for accounting.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Impact: fix hung task with certain (non-default) rt-limit settings
Corey Hickey reported that on using setuid to change the uid of a
rt process, the process would be unkillable and not be running.
This is because there was no rt runtime for that user group. Add
in a check to see if a user can attach an rt task to its task group.
On failure, return EINVAL, which is also returned in
CONFIG_CGROUP_SCHED.
Reported-by: Corey Hickey <bugfood-ml@fatooh.org>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of keeping the single vector -> single linux irq mapping
we extend the intc code to support merging of vectors to a single
linux irq. This helps processors such as sh7750, sh7780 and sh7785
which have more vectors than masking ability. With this patch in
place we can modify the intc tables to use one irq per maskable
irq source. Please note the following:
- If multiple vectors share the same enum then only the
first vector will be available as a linux irq.
- Drivers may need to be rewritten to get pending irq
source from the hardware block instead of irq number.
This patch together with the sh7785 specific intc tables solves
DMA controller irq issues related to buggy interrupt masking.
Reported-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Applications include this header in order to use RDS sockets.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RDS is a reliable datagram protocol used for IPC on Oracle
database clusters. This adds address and protocol family numbers
for it.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates the maple bus to support asynchronous block reads
and writes as well as generally improving the quality of the code and
supporting concurrency (all needed to support the Dreamcast visual
memory unit - a driver will also be posted for that).
Changes in the bus driver necessitate some changes in the two maple bus
input drivers that are currently in mainline.
As well as supporting block reads and writes this code clean up removes
some poor handling of locks, uses an atomic status variable to serialise
access to devices and more robusly handles the general performance
problems of the bus.
Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allow CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures to still specify
that their sched_clock() implementation is reliable.
This will be used by x86 to switch on a faster sched_clock_cpu()
implementation on certain CPU types.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: implement new tracing timestamp APIs
Add three trace clock variants, with differing scalability/precision
tradeoffs:
- local: CPU-local trace clock
- medium: scalable global clock with some jitter
- global: globally monotonic, serialized clock
Make the ring-buffer use the local trace clock internally.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes a bug located by Vegard Nossum with the aid of
kmemcheck, updated based on review comments from Nick Piggin,
Ingo Molnar, and Andrew Morton. And cleans up the variable-name
and function-name language. ;-)
The boot CPU runs in the context of its idle thread during boot-up.
During this time, idle_cpu(0) will always return nonzero, which will
fool Classic and Hierarchical RCU into deciding that a large chunk of
the boot-up sequence is a big long quiescent state. This in turn causes
RCU to prematurely end grace periods during this time.
This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
function to ignore the idle task as a quiescent state until the
system has started up the scheduler in rest_init(), introducing a
new non-API function rcu_idle_now_means_idle() to inform RCU of this
transition. RCU maintains an internal rcu_idle_cpu_truthful variable
to track this state, which is then used by rcu_check_callback() to
determine if it should believe idle_cpu().
Because this patch has the effect of disallowing RCU grace periods
during long stretches of the boot-up sequence, this patch also introduces
Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
no-op if num_online_cpus() returns 1. This allows boot-time code that
calls synchronize_rcu() to proceed normally. Note, however, that RCU
callbacks registered by call_rcu() will likely queue up until later in
the boot sequence. Although rcuclassic and rcutree can also use this
same optimization after boot completes, rcupreempt must restrict its
use of this optimization to the portion of the boot sequence before the
scheduler starts up, given that an rcupreempt RCU read-side critical
section may be preeempted.
In addition, this patch takes Nick Piggin's suggestion to make the
system_state global variable be __read_mostly.
Changes since v4:
o Changes the name of the introduced function and variable to
be less emotional. ;-)
Changes since v3:
o WARN_ON(nr_context_switches() > 0) to verify that RCU
switches out of boot-time mode before the first context
switch, as suggested by Nick Piggin.
Changes since v2:
o Created rcu_blocking_is_gp() internal-to-RCU API that
determines whether a call to synchronize_rcu() is itself
a grace period.
o The definition of rcu_blocking_is_gp() for rcuclassic and
rcutree checks to see if but a single CPU is online.
o The definition of rcu_blocking_is_gp() for rcupreempt
checks to see both if but a single CPU is online and if
the system is still in early boot.
This allows rcupreempt to again work correctly if running
on a single CPU after booting is complete.
o Added check to rcupreempt's synchronize_sched() for there
being but one online CPU.
Tested all three variants both SMP and !SMP, booted fine, passed a short
rcutorture test on both x86 and Power.
Located-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra warned that TPPROTO and TPARGS might become something
other than a simple copy of itself. To prevent this from having
side effects in the TRACE_FORMAT macro in tracepoint.h, we add a
PARAMS() macro to be defined as just a wrapper.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
There's been a bit confusion to whether DEFINE/DECLARE_TRACE_FMT should
be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it
TRACE_FORMAT.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
UP __alloc_percpu() triggered WARN_ON_ONCE() if the requested
alignment is larger than that of unsigned long long, which is too
small for all the cacheline aligned allocations. Bump it up to
SMP_CACHE_BYTES which kmalloc allocations generally guarantee.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
During host driver module removal del_gendisk() results in a final
put on drive->gendev and freeing the drive by drive_release_dev().
Convert device drivers from using struct kref to use struct device
so device driver's object holds reference on ->gendev and prevents
drive from prematurely going away.
Also fix ->remove methods to not erroneously drop reference on a
host driver by using only put_device() instead of ide*_put().
Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
Tested-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Impact: build fix
the !SMP branch had a 'gfp' leftover:
include/linux/percpu.h: In function '__alloc_percpu':
include/linux/percpu.h:160: error: 'gfp' undeclared (first use in this function)
include/linux/percpu.h:160: error: (Each undeclared identifier is reported only once
include/linux/percpu.h:160: error: for each function it appears in.)
Use GFP_KERNEL like the SMP version does.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
the code so that we can use CSD_FLAG_LOCK for both purposes.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make io_mapping_create_wc and io_mapping_free go through PAT to make sure
that there are no memory type aliases.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
io_mapping_create_wc should take a resource_size_t parameter in place of
unsigned long. With unsigned long, there will be no way to map greater than 4GB
address in i386/32 bit.
On x86, greater than 4GB addresses cannot be mapped on i386 without PAE. Return
error for such a case.
Patch also adds a structure for io_mapping, that saves the base, size and
type on HAVE_ATOMIC_IOMAP archs, that can be used to verify the offset on
io_mapping_map calls.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, enable future change
Add a 'total bytes copied' parameter to __copy_from_user_*nocache(),
and update all the callsites.
The parameter is not used yet - architecture code can use it to
more intelligently decide whether the copy should be cached or
non-temporal.
Cc: Salman Qazi <sqazi@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch changes the return value of nlmsg_notify() as follows:
If NETLINK_BROADCAST_ERROR is set by any of the listeners and
an error in the delivery happened, return the broadcast error;
else if there are no listeners apart from the socket that
requested a change with the echo flag, return the result of the
unicast notification. Thus, with this patch, the unicast
notification is handled in the same way of a broadcast listener
that has set the NETLINK_BROADCAST_ERROR socket flag.
This patch is useful in case that the caller of nlmsg_notify()
wants to know the result of the delivery of a netlink notification
(including the broadcast delivery) and take any action in case
that the delivery failed. For example, ctnetlink can drop packets
if the event delivery failed to provide reliable logging and
state-synchronization at the cost of dropping packets.
This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
This allows for the developers to place format strings and
args in with their tracepoint declaration. A tracer may now
override the DEFINE_TRACE_FMT macro and use it to record
a default format.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
The unit in which user-space can set the bus timeout value is jiffies
for historical reasons (back when HZ was always 100.) This is however
not good because user-space doesn't know how long a jiffy lasts. The
timeout value should instead be set in a fixed time unit. Given the
original value of HZ, this unit should be 10 ms, for compatibility.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
iptables imports headers from (the unifdefed headers of a)
kernel tree, but some headers happened to not be installed.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Impact: more latitude for first percpu chunk allocation
The first percpu chunk serves the kernel static percpu area and may or
may not contain extra room for further dynamic allocation.
Initialization of the first chunk needs to be done before normal
memory allocation service is up, so it has its own init path -
pcpu_setup_static().
It seems archs need more latitude while initializing the first chunk
for example to take advantage of large page mapping. This patch makes
the following changes to allow this.
* Define PERCPU_DYNAMIC_RESERVE to give arch hint about how much space
to reserve in the first chunk for further dynamic allocation.
* Rename pcpu_setup_static() to pcpu_setup_first_chunk().
* Make pcpu_setup_first_chunk() much more flexible by fetching page
pointer by callback and adding optional @unit_size, @free_size and
@base_addr arguments which allow archs to selectively part of chunk
initialization to their likings.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: allow larger alignment for early vmalloc area allocation
Some early vmalloc users might want larger alignment, for example, for
custom large page mapping. Add @align to vm_area_register_early().
While at it, drop docbook comment on non-existent @size.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Impact: cleanup and addition of missing interface wrapper
The interface functions in bootmem.h was ordered in not so orderly
manner. Reorder them such that
* functions allocating the same area group together -
ie. alloc_bootmem group and alloc_bootmem_low group.
* functions w/o node parameter come before the ones w/ node parameter.
* nopanic variants are immediately below their panicky counterparts.
While at it, add alloc_bootmem_pages_node_nopanic() which was missing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
Impact: cleaner and consistent bootmem wrapping
By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
arch-specific wrappers for bootmem allocation. However, this is done
a bit strangely in that only the high level convenience macros can be
changed while lower level, but still exported, interface functions
can't be wrapped. This not only is messy but also leads to strange
situation where alloc_bootmem() does what the arch wants it to do but
the equivalent __alloc_bootmem() call doesn't although they should be
able to be used interchangeably.
This patch updates bootmem such that archs can override / wrap the
backend function - alloc_bootmem_core() instead of the highlevel
interface functions to allow simpler and consistent wrapping. Also,
HAVE_ARCH_BOOTMEM_NODE is renamed to HAVE_ARCH_BOOTMEM.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netns: fix double free at netns creation
veth : add the set_mac_address capability
sunlance: Beyond ARRAY_SIZE of ib->btx_ring
sungem: another error printed one too early
ISDN: fix sc/shmem printk format warning
SMSC: timeout reaches -1
smsc9420: handle magic field of ethtool_eeprom
sundance: missing parentheses?
smsc9420: fix another postfixed timeout
wimax/i2400m: driver loads firmware v1.4 instead of v1.3
vlan: Update skb->mac_header in __vlan_put_tag().
cxgb3: Add support for PCI ID 0x35.
tcp: remove obsoleted comment about different passes
TG3: &&/|| confusion
ATM: misplaced parentheses?
net/mv643xx: don't disable the mib timer too early and lock properly
net/mv643xx: use GFP_ATOMIC while atomic
atl1c: Atheros L1C Gigabit Ethernet driver
net: Kill skb_truesize_check(), it only catches false-positives.
net: forcedeth: Fix wake-on-lan regression
Although it allows for better cacheline use, it is unnecessary to save a
copy of the cache's min_partial value in each kmem_cache_node.
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.
This is based on an earlier patch from Linus.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some hardware platforms, the TS-7800[1] is one for example, can
supply the kernel with an entropy source, albeit a slow one for
TS-7800 users, by just reading a particular IO address. This
source must not be read above a certain rate otherwise the quality
suffers.
The driver is then hooked into by calling
platform_device_(register|add|del) passing a structure similar to:
------
static struct timeriomem_rng_data ts78xx_ts_rng_data = {
.address = (u32 *__iomem) TS_RNG,
.period = 1000000, /* one second */
};
static struct platform_device ts78xx_ts_rng_device = {
.name = "timeriomem_rng",
.id = -1,
.dev = {
.platform_data = &ts78xx_ts_rng_data,
},
.num_resources = 0,
};
------
[1] http://www.embeddedarm.com/products/board-detail.php?product=TS-7800
Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* hibernate:
PM: Fix suspend_console and resume_console to use only one semaphore
PM: Wait for console in resume
PM: Fix pm_notifiers during user mode hibernation
swsusp: clean up shrink_all_zones()
swsusp: dont fiddle with swappiness
PM: fix build for CONFIG_PM unset
PM/hibernate: fix "swap breaks after hibernation failures"
PM/resume: wait for device probing to finish
Consolidate driver_probe_done() loops into one place
there's a few places that currently loop over driver_probe_done(), and
I'm about to add another one. This patch abstracts it into a helper
to reduce duplication.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Intel 8257x Ethernet boards have a feature called Serial Over Lan.
This feature works by emulating a serial port, and it is detected by
kernel as a normal 8250 port. However, this emulation is not perfect, as
also noticed on changeset 7500b1f602.
Before this patch, the kernel were trying to check if the serial TX is
capable of work using IRQ's.
This were done with a code similar this:
serial_outp(up, UART_IER, UART_IER_THRI);
lsr = serial_in(up, UART_LSR);
iir = serial_in(up, UART_IIR);
serial_outp(up, UART_IER, 0);
if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
up->bugs |= UART_BUG_TXEN;
This works fine for other 8250 ports, but, on 8250-emulated SoL port, the
chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register.
Due to that, UART_BUG_TXEN is sometimes enabled. However, as TX IRQ keeps
working, and the TX polling is now enabled, the driver miss-interprets the
IRQ received later, hanging up the machine until a key is pressed at the
serial console.
This is the 6 version of this patch. Previous versions were trying to
introduce a large enough delay between serial_outp and serial_in(up,
UART_IIR), but not taking forever. However, the needed delay couldn't be
safely determined.
At the experimental tests, a delay of 1us solves most of the cases, but
still hangs sometimes. Increasing the delay to 5us was better, but still
doesn't solve. A very high delay of 50 ms seemed to work every time.
However, poking around with delays and pray for it to be enough doesn't
seem to be a good approach, even for a quirk.
So, instead of playing with random large arbitrary delays, let's just
disable UART_BUG_TXEN for all SoL ports.
[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds more documentation of the lowlevel API to avoid future bugs.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kzfree() is a wrapper for kfree() that additionally zeroes the underlying
memory before releasing it to the slab allocator.
Currently there is code which memset()s the memory region of an object
before releasing it back to the slab allocator to make sure
security-sensitive data are really zeroed out after use.
These callsites can then just use kzfree() which saves some code, makes
users greppable and allows for a stupid destructor that isn't necessarily
aware of the actual object size.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ACPI code currently carries its own thermal trip handling, meaning that
any other thermal implementation will need to reimplement it. Move the code
to the generic thermal layer.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Impact: cleanup
Remove an #ifdef from notify_page_fault(). The function still
compiles to nothing in the !CONFIG_KPROBES case.
Introduce kprobes_built_in() and kprobe_fault_handler() helpers
to allow this - they returns 0 if !CONFIG_KPROBES.
No code changed:
text data bss dec hex filename
4618 32 24 4674 1242 fault.o.before
4618 32 24 4674 1242 fault.o.after
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Remove an #ifdef from kmmio_fault() - we can do this by
providing default implementations for is_kmmio_active()
and kmmio_handler(). The compiler optimizes it all away
in the !CONFIG_MMIOTRACE case.
Also, while at it, clean up mmiotrace.h a bit:
- standard header guards
- standard vertical spaces for structure definitions
No code changed (both with mmiotrace on and off in the config):
text data bss dec hex filename
2947 12 12 2971 b9b fault.o.before
2947 12 12 2971 b9b fault.o.after
Cc: Pekka Paalanen <pq@iki.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch creates the weak functions: ftrace_arch_code_modify_prepare
and ftrace_arch_code_modify_post_process that are called before and
after the stop machine is called to modify the kernel text.
If the arch needs to do pre or post processing, it only needs to define
these functions.
[ Update: Ingo Molnar suggested using the name ftrace_arch_code_modify_*
over using ftrace_arch_modify_* ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
The thermal API currently uses strings to pass values to userspace. This
makes it difficult to use from within the kernel. Change the interface
to use integers and fix up the consumers.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Increase the maximum object size in SLUB so that 8k objects are not
passed through to the page allocator anymore. The network stack uses 8k
objects for performance critical operations.
The patch is motivated by a SLAB vs. SLUB regression in the netperf
benchmark. The problem is that the kfree(skb->head) call in
skb_release_data() that is subject to page allocator pass-through as the
size passed to __alloc_skb() is larger than 4 KB in this test.
As explained by Yanmin Zhang:
I use 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server
pair loopback testing on x86-64 machines. Comparing with SLUB, SLAB's
result is about 2.3 times of SLUB's. After applying the reverting patch,
the result difference between SLUB and SLAB becomes 1% which we might
consider as fluctuation.
[ penberg@cs.helsinki.fi: fix oops in kmalloc() ]
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Kernel module providing implementation of LED netfilter target. Each
instance of the target appears as a led-trigger device, which can be
associated with one or more LEDs in /sys/class/leds/
Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:
1) it prevents changes to table state (xt_replace) while table is in use.
This is now handled by doing rcu on the xt_table. When table is
replaced, the new table(s) are put in and the old one table(s) are freed
after RCU period.
2) it provides synchronization when accesing the counter values.
This is now handled by swapping in new table_info entries for each cpu
then summing the old values, and putting the result back onto one
cpu. On a busy system it may cause sampling to occur at different
times on each cpu, but no packet/byte counts are lost in the process.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds NETLINK_BROADCAST_ERROR which is a netlink
socket option that the listener can set to make netlink_broadcast()
return errors in the delivery to the caller. This option is useful
if the caller of netlink_broadcast() do something with the result
of the message delivery, like in ctnetlink where it drops a network
packet if the event delivery failed, this is used to enable reliable
logging and state-synchronization. If this socket option is not set,
netlink_broadcast() only reports ESRCH errors and silently ignore
ENOBUFS errors, which is what most netlink_broadcast() callers
should do.
This socket option is based on a suggestion from Patrick McHardy.
Patrick McHardy can exchange this patch for a beer from me ;).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: new scalable dynamic percpu allocator which allows dynamic
percpu areas to be accessed the same way as static ones
Implement scalable dynamic percpu allocator which can be used for both
static and dynamic percpu areas. This will allow static and dynamic
areas to share faster direct access methods. This feature is optional
and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
arch. Please read comment on top of mm/percpu.c for details.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Impact: two more public map/unmap functions
Implement map_kernel_range_noflush() and unmap_kernel_range_noflush().
These functions respectively map and unmap address range in kernel VM
area but doesn't do any vcache or tlb flushing. These will be used by
new percpu allocator.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Impact: allow multiple early vm areas
There are places where kernel VM area needs to be allocated before
vmalloc is initialized. This is done by allocating static vm_struct,
initializing several fields and linking it to vmlist and later vmalloc
initialization picking up these from vmlist. This is currently done
manually and if there's more than one such areas, there's no defined
way to arbitrate who gets which address.
This patch implements vm_area_register_early(), which takes vm_area
struct with flags and size initialized, assigns address to it and puts
it on the vmlist. This way, multiple early vm areas can determine
which addresses they should use. The only current user - alpha mm
init - is converted to use it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: kill unused functions
percpu_alloc() and its friends never saw much action. It was supposed
to replace the cpu-mask unaware __alloc_percpu() but it never happened
and in fact __percpu_alloc_mask() itself never really grew proper
up/down handling interface either (no exported interface for
populate/depopulate).
percpu allocation is about to go through major reimplementation and
there's no reason to carry this unused interface around. Replace it
with __alloc_percpu() and free_percpu().
Signed-off-by: Tejun Heo <tj@kernel.org>
This prepares for a real __alloc_percpu, by adding an alignment argument.
Only one place uses __alloc_percpu directly, and that's for a string.
tj: af_inet also uses __alloc_percpu(), update it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Impact: cleanup
There are two allocated per-cpu accessor macros with almost identical
spelling. The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.
tj: kill percpu_ptr() and update UP too
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
After moving mac addresses in __vlan_put_tag() skb->mac_header needs
to be updated.
Reported-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
block: fix booting from partitioned md array
block: revert part of 18ce3751cc
cciss: PCI power management reset for kexec
paride/pg.c: xs(): &&/|| confusion
fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free
block: fix bad definition of BIO_RW_SYNC
bsg: Fix sense buffer bug in SG_IO
Since I don't work for SUSE any more and the bwalle@suse.de address is
invalid, correct it in the copyright headers and documentation.
Signed-off-by: Bernhard Walle <bernhard.walle@gmx.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I have a Digi Neo 8 PCI card (114f:00b1) Serial controller: Digi
International Digi Neo 8 (rev 05)
that works with the jsm driver after using the following patch.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: Scott H Kilau <Scott_Kilau@digi.com>
Cc: Wendy Xiong <wendyx@us.ibm.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.
To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:
BUG_ON(page_zone(start_page) != page_zone(end_page));
Once I knew this is what was happening, I added some annotations:
if (unlikely(page_zone(start_page) != page_zone(end_page))) {
printk(KERN_ERR "move_freepages: Bogus zones: "
"start_page[%p] end_page[%p] zone[%p]\n",
start_page, end_page, zone);
printk(KERN_ERR "move_freepages: "
"start_zone[%p] end_zone[%p]\n",
page_zone(start_page), page_zone(end_page));
printk(KERN_ERR "move_freepages: "
"start_pfn[0x%lx] end_pfn[0x%lx]\n",
page_to_pfn(start_page), page_to_pfn(end_page));
printk(KERN_ERR "move_freepages: "
"start_nid[%d] end_nid[%d]\n",
page_to_nid(start_page), page_to_nid(end_page));
...
And here's what I got:
move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
move_freepages: start_nid[1] end_nid[0]
My memory layout on this box is:
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0x00000000 -> 0x0081ff5d
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[8] active PFN ranges
[ 0.000000] 0: 0x00000000 -> 0x00020000
[ 0.000000] 1: 0x00800000 -> 0x0081f7ff
[ 0.000000] 1: 0x0081f800 -> 0x0081fe50
[ 0.000000] 1: 0x0081fed1 -> 0x0081fed8
[ 0.000000] 1: 0x0081feda -> 0x0081fedb
[ 0.000000] 1: 0x0081fedd -> 0x0081fee5
[ 0.000000] 1: 0x0081fee7 -> 0x0081ff51
[ 0.000000] 1: 0x0081ff59 -> 0x0081ff5d
So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.
This patch:
Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
I think it makes fix-for-memmap-init not easy.
This patch moves all declaration to include/linux/mm.h
After this,
if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
-> Use static definition in include/linux/mm.h
else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
-> Use generic definition in mm/page_alloc.c
else
-> per-arch back end function will be called.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The conversion of atmel-mci to dma_request_channel missed the
initialization of the channel dma_slave information. The filter_fn passed
to dma_request_channel is responsible for initializing the channel's
private data. This implementation has the additional benefit of enabling
a generic client-channel data passing mechanism.
Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for
cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty).
Additionally, there is some inconsistency about when task_dirty_inc is
called. It is used for dirty balancing, however it even gets called for
__set_page_dirty_no_writeback.
So rather than increment it in a set_page_dirty wrapper, move it down to
exactly where the dirty page accounting stats are incremented.
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As requested by Michael, add a missing check for valid flags in
timerfd_settime(), and make it return EINVAL in case some extra bits are
set.
Michael said:
If this is to be any use to userland apps that want to check flag
support (perhaps it is too late already), then the sooner we get it
into the kernel the better: 2.6.29 would be good; earlier stables as
well would be even better.
[akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org> [2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently seq_read assumes that the offset passed to it is always the
offset it passed to user space. In the case pread this assumption is
broken and we do the wrong thing when presented with pread.
To solve this I introduce an offset cache inside of struct seq_file so we
know where our logical file position is. Then in seq_read if we try to
read from another offset we reset our data structures and attempt to go to
the offset user space wanted.
[akpm@linux-foundation.org: restore FMODE_PWRITE]
[pjt@google.com: seq_open needs its fmode opened up to take advantage of this]
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Turner <pjt@google.com>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Separate FMODE_PREAD and FMODE_PWRITE into separate flags to reflect the
reality that the read and write paths may have independent restrictions.
A git grep verifies that these flags are always cleared together so this
new behavior will only apply to interfaces that change to clear flags
individually.
This is required for "seq_file: properly cope with pread", a post-2.6.25
regression fix.
[akpm@linux-foundation.org: add comment]
Signed-off-by: Paul Turner <pjt@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have get_vm_area_caller() and __get_vm_area() but not
__get_vm_area_caller()
On powerpc, I use __get_vm_area() to separate the ranges of addresses
given to vmalloc vs. ioremap (various good reasons for that) so in order
to be able to implement the new caller tracking in /proc/vmallocinfo, I
need a "_caller" variant of it.
(akpm: needed for ongoing powerpc development, so merge it early)
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is nothing really arch specific of the push and pop functions
used by the function graph tracer. This patch moves them to generic
code.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: new timer API
Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:
introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.
(regular mod_timer() re-activates non-pending timers.)
This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.
Also while at it:
- optimize the regular mod_timer() path some more, the
timer-stat and a debug check was needlessly duplicated
in __mod_timer().
- make the exports come straight after the function, as
most other exports in timer.c already did.
- eliminate __mod_timer() as an external API, change the
users to mod_timer().
The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.
Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change to proper type on private pointer rather than anonymous void.
Keep active elements on same cache line.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In NFLOG the per-rule qthreshold should overrides per-instance only
it is set. With current code, the per-rule qthreshold is 1 if not set
and it overrides the per-instance qthreshold.
This patch modifies the default xt_NFLOG threshold from 1 to
0. Thus a value of 0 means there is no per-rule setting and the instance
parameter has to apply.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO
and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before
213d9417fe.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is based on a report and patch by Geert Uytterhoeven.
The functions crypto_alloc_tfm and create_create_tfm return a
pointer that needs to be adjusted by the caller when successful
and otherwise an error value. This means that the caller has
to check for the error and only perform the adjustment if the
pointer returned is valid.
Since all callers want to make the adjustment and we know how
to adjust it ourselves, it's much easier to just return adjusted
pointer directly.
The only caveat is that we have to return a void * instead of
struct crypto_tfm *. However, this isn't that bad because both
of these functions are for internal use only (by types code like
shash.c, not even algorithms code).
This patch also moves crypto_alloc_tfm into crypto/internal.h
(crypto_create_tfm is already there) to reflect this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.
skb_truesize_check() was added in order to try and catch this error
more systematically.
However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix this sparse warnings:
drivers/net/hamradio/hdlcdrv.c:274:34: warning: incorrect type in argument 2 (different signedness)
drivers/net/hamradio/hdlcdrv.c:279:47: warning: incorrect type in argument 2 (different signedness)
drivers/net/hamradio/hdlcdrv.c:288:39: warning: incorrect type in argument 2 (different signedness)
drivers/net/hamradio/hdlcdrv.c:300:47: warning: incorrect type in argument 2 (different signedness)
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, vm86: fix preemption bug
x86, olpc: fix model detection without OFW
x86, hpet: fix for LS21 + HPET = boot hang
x86: CPA avoid repeated lazy mmu flush
x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
x86/cpa: make sure cpa is safe to call in lazy mmu mode
x86, ptrace, mm: fix double-free on race
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling
ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages
ext4: Initialize preallocation list_head's properly
ext4: Fix lockdep warning
ext4: Fix to read empty directory blocks correctly in 64k
jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()
Revert "ext4: wait on all pending commits in ext4_sync_fs()"
jbd2: Fix return value of jbd2_journal_start_commit()
Impact: clean up
Ingo Molnar did not like the _hook naming convention used by the
select function tracer. Luis Claudio R. Goncalves suggested using
the "_probe" extension. This patch implements the change of
calling the functions and variables "_hook" and replacing them
with "_probe".
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This patch adds a call back for the tracers that have hooks to
selected functions. This allows the tracer to show better output
in the set_ftrace_filter file.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: new feature
Currently, the function tracer only gives you an ability to hook
a tracer to all functions being traced. The dynamic function trace
allows you to pick and choose which of those functions will be
traced, but all functions being traced will call all tracers that
registered with the function tracer.
This patch adds a new feature that allows a tracer to hook to specific
functions, even when all functions are being traced. It allows for
different functions to call different tracer hooks.
The way this is accomplished is by a special function that will hook
to the function tracer and will set up a hash table knowing which
tracer hook to call with which function. This is the most general
and easiest method to accomplish this. Later, an arch may choose
to supply their own method in changing the mcount call of a function
to call a different tracer. But that will be an exercise for the
future.
To register a function:
struct ftrace_hook_ops {
void (*func)(unsigned long ip,
unsigned long parent_ip,
void **data);
int (*callback)(unsigned long ip, void **data);
void (*free)(void **data);
};
int register_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
void *data);
glob is a simple glob to search for the functions to hook.
ops is a pointer to the operations (listed below)
data is the default data to be passed to the hook functions when traced
ops:
func is the hook function to call when the functions are traced
callback is a callback function that is called when setting up the hash.
That is, if the tracer needs to do something special for each
function, that is being traced, and wants to give each function
its own data. The address of the entry data is passed to this
callback, so that the callback may wish to update the entry to
whatever it would like.
free is a callback for when the entry is freed. In case the tracer
allocated any data, it is give the chance to free it.
To unregister we have three functions:
void
unregister_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
void *data)
This will unregister all hooks that match glob, point to ops, and
have its data matching data. (note, if glob is NULL, blank or '*',
all functions will be tested).
void
unregister_ftrace_function_hook_func(char *glob,
struct ftrace_hook_ops *ops)
This will unregister all functions matching glob that has an entry
pointing to ops.
void unregister_ftrace_function_hook_all(char *glob)
This simply unregisters all funcs.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Allow for other tracers to add their own commands for function
selection. This interface gives a trace the ability to name a
command for function selection. Right now it is pretty limited
in what it offers, but this is a building step for more features.
The :mod: command is converted to this interface and also serves
as a template for other implementations.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
The sctp crc32c checksum is always generated in little endian.
So, we clean up the code to treat it as little endian and remove
all the __force casts.
Suggested by Herbert Xu.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The additional per-packet information (16 bytes for time stamps, 1
byte for flags) is stored for all packets in the skb_shared_info
struct. This implementation detail is hidden from users of that
information via skb_* accessor functions. A separate struct resp.
union is used for the additional information so that it can be
stored/copied easily outside of skb_shared_info.
Compared to previous implementations (reusing the tstamp field
depending on the context, optional additional structures) this
is the simplest solution. It does not extend sk_buff itself.
TX time stamping is implemented in software if the device driver
doesn't support hardware time stamping.
The new semantic for hardware/software time stamping around
ndo_start_xmit() is based on two assumptions about existing
network device drivers which don't support hardware time
stamping and know nothing about it:
- they leave the new skb_shared_tx unmodified
- the keep the connection to the originating socket in skb->sk
alive, i.e., don't call skb_orphan()
Given that skb_shared_tx is new, the first assumption is safe.
The second is only true for some drivers. As a result, software
TX time stamping currently works with the bnx2 driver, but not
with the unmodified igb driver (the two drivers this patch series
was tested with).
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
User space can request hardware and/or software time stamping.
Reporting of the result(s) via a new control message is enabled
separately for each field in the message because some of the
fields may require additional computation and thus cause overhead.
User space can tell the different kinds of time stamps apart
and choose what suits its needs.
When a TX timestamp operation is requested, the TX skb will be cloned
and the clone will be time stamped (in hardware or software) and added
to the socket error queue of the skb, if the skb has a socket
associated with it.
The actual TX timestamp will reach userspace as a RX timestamp on the
cloned packet. If timestamping is requested and no timestamping is
done in the device driver (potentially this may use hardware
timestamping), it will be done in software after the device's
start_hard_xmit routine.
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mapping from a struct timecounter to a time returned by functions like
ktime_get_real() is implemented. This is sufficient to use this code
in a network device driver which wants to support hardware time
stamping and transformation of hardware time stamps to system time.
The interface could have been made more versatile by not depending on
a time counter, but this wasn't done to avoid writing glue code
elsewhere.
The method implemented here is the one used and analyzed under the name
"assisted PTP" in the LCI PTP paper:
http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far struct clocksource acted as the interface between time/timekeeping.c
and hardware. This patch generalizes the concept so that a similar
interface can also be used in other contexts. For that it introduces
new structures and related functions *without* touching the existing
struct clocksource.
The reasons for adding these new structures to clocksource.[ch] are
* the APIs are clearly related
* struct clocksource could be cleaned up to use the new structs
* avoids proliferation of files with similar names (timesource.h?
timecounter.h?)
As outlined in the discussion with John Stultz, this patch adds
* struct cyclecounter: stateless API to hardware which counts clock cycles
* struct timecounter: stateful utility code built on a cyclecounter which
provides a nanosecond counter
* only the function to read the nanosecond counter; deltas are used internally
and not exposed to users of timecounter
The code does no locking of the shared state. It must be called at least
as often as the cycle counter wraps around to detect these wrap arounds.
Both is the responsibility of the timecounter user.
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: fix build error
to fix:
tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here
tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Base versions handle constant folding now. For headers exposed to
userspace, we must only expose the __ prefixed versions.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use cpu_to_le32 directly as it handles constant folding now, replace direct
uses of __constant_cpu_to_{endian} as well.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
kvm_arch_sync_events is introduced to quiet down all other events may happen
contemporary with VM destroy process, like IRQ handler and work struct for
assigned device.
For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so
the state of KVM here is legal and can provide a environment to quiet down other
events.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Kconfig symbols are not available in userspace, and are not stripped by
headers-install. Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.
Signed-off-by: Avi Kivity <avi@redhat.com>
s/HELD_OVER/ENABLED/g
so that its similar to the hard and soft-irq names.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
s/\(LOCKF\?_ENABLED_[^ ]*\)S\(_READ\)\?\>/\1\2/g
So that the USED_IN and ENABLED have the same names.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Here is another version, with the incremental patch rolled up, and
added reclaim context annotation to kswapd, and allocation tracing
to slab allocators (which may only ever reach the page allocator
in rare cases, so it is good to put annotations here too).
Haven't tested this version as such, but it should be getting closer
to merge worthy ;)
--
After noticing some code in mm/filemap.c accidentally perform a __GFP_FS
allocation when it should not have been, I thought it might be a good idea to
try to catch this kind of thing with lockdep.
I coded up a little idea that seems to work. Unfortunately the system has to
actually be in __GFP_FS page reclaim, then take the lock, before it will mark
it. But at least that might still be some orders of magnitude more common
(and more debuggable) than an actual deadlock condition, so we have some
improvement I hope (the concept is no less complete than discovery of a lock's
interrupt contexts).
I guess we could even do the same thing with __GFP_IO (normal reclaim), and
even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
code paths, so let's start there and see how it goes.
It *seems* to work. I did a quick test.
=================================
[ INFO: inconsistent lock state ]
2.6.28-rc6-00007-ged31348-dirty #26
---------------------------------
inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
(testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
{in-reclaim-W} state was registered at:
[<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60
[<ffffffff80268f71>] lock_acquire+0x91/0xc0
[<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310
[<ffffffffa002002b>] brd_init+0x2b/0x216 [brd]
[<ffffffff8020903b>] _stext+0x3b/0x170
[<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
[<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 3929
hardirqs last enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310
hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310
softirqs last enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0
softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0
other info that might help us debug this:
1 lock held by modprobe/8526:
#0: (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
stack backtrace:
Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged31348-dirty #26
Call Trace:
[<ffffffff80265483>] print_usage_bug+0x193/0x1d0
[<ffffffff80266530>] mark_lock+0xaf0/0xca0
[<ffffffff80266735>] mark_held_locks+0x55/0xc0
[<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
[<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60
[<ffffffff80285005>] __alloc_pages_internal+0x475/0x580
[<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310
[<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
[<ffffffffa002006a>] brd_init+0x6a/0x216 [brd]
[<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
[<ffffffff8020903b>] _stext+0x3b/0x170
[<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10
[<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180
[<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190
[<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
[<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This modifies the timer code in a way to allow lockdep to detect
deadlocks resulting from a lock being taken in the timer function
as well as around the del_timer_sync() call.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
This patch adds basic scan capability to cfg80211/nl80211 and
changes mac80211 to use it. The BSS list that cfg80211 maintains
is made driver-accessible with a private area in each BSS struct,
but mac80211 doesn't yet use it. That's another large project.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: Only register AC97 bus if it's not done already
ALSA: hda - Add snd_hda_multi_out_dig_cleanup()
ALSA: hda - Add missing terminator in slave dig-out array
ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model
ALSA: hda - Register (new) devices at reconfig
ALSA: mtpav - Fix initial value for input hwport
ALSA: hda - add id for Intel IbexPeak integrated HDMI codec
ALSA: hda - compute checksum in HDMI audio infoframe
ALSA: hda - enable HDMI audio pin out at module loading time
ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present
ASoC: Update SDP3430 machine driver for snd_soc_card
ALSA: hda - Add quirk for Asus z37e (1043:8284)
sound: Remove OSSlib stuff from linux/soundcard.h
ASoC: WM8990: Fix kcontrol's private value use in put callback
ASoC: TLV320AIC3X: Fix kcontrol's private value use in put callback
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface
net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
netxen: fix compile waring "label ‘set_32_bit_mask’ defined but not used" on IA64 platform
bnx2: Update version to 1.9.2 and copyright.
bnx2: Fix jumbo frames error handling.
bnx2: Update 5709 firmware.
bnx2: Update 5706/5708 firmware.
3c505: do not set pcb->data.raw beyond its size
Documentation/connector/cn_test.c: don't use gfp_any()
net: don't use in_atomic() in gfp_any()
IRDA: cnt is off by 1
netxen: remove pcie workaround
sun3: print when lance_open() fails
qlge: bugfix: Add missing rx buf clean index on early exit.
qlge: bugfix: Fix RX scaling values.
qlge: bugfix: Fix TSO breakage.
qlge: bugfix: Add missing dev_kfree_skb_any() call.
qlge: bugfix: Add missing put_page() call.
qlge: bugfix: Fix fatal error recovery hang.
qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().
...
Impact: avoid corruption in system time accounting
Martin Schwidefsky told me that there was an issue with NMIs and
system accounting. The problem is that the accounting code is
not reentrant, and if an NMI goes off after an interrupt it can
corrupt the accounting.
For now, the best we can do is to treat NMIs like SMIs and they
are not accounted for.
This patch changes nmi_enter to not call __irq_enter and to do
the preempt-count and tracing calls directly.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
To add a bit in the preempt_count to be set when in NMI context, we
found that some archs did not have enough bits to spare. This is
due to the hardirq_count being a mask that can hold NR_IRQS.
Some archs allow for over 16000 IRQs, and that would require a mask
of 14 bits. The sofitrq mask is 8 bits and the preempt disable mask
is also 8 bits. The PREEMP_ACTIVE bit is bit 30, and bit 31 would
make the preempt_count (which is type int) a negative number.
A negative preempt_count is a sign of failure.
Add them up 14+8+8+1+1 you get 32 bits. No room for the NMI bit.
But the hardirq_count is to track the number of nested IRQs, not
the number of total IRQs. This originally took the paranoid approach
of setting the max nesting to NR_IRQS. But when we have archs with
over 1000 IRQs, it is not practical to think they will ever all
nest on a single CPU. Not to mention that this would most definitely
cause a stack overflow.
This patch sets a max of 10 bits to be used for IRQ nesting.
I did a 'git grep HARDIRQ' to examine all users of HARDIRQ_BITS and
HARDIRQ_MASK, and found that making it a max of 10 would not hurt
anyone. I did find that the m68k expected it to be 8 bits, so
I allow for the archs to set the number to be less than 10.
I removed the setting of HARDIRQ_BITS from the archs that set it
to more than 10. This includes ALPHA, ia64 and avr32.
This will always allow room for the NMI bit, and if we need to allow
for NMI nesting, we have 4 bits to play with.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This patch allows LSM modules to determine whether current process is in an
execve operation or not so that they can behave differently while an execve
operation is in progress.
This patch is needed by TOMOYO. Please see another patch titled "LSM adapter
functions." for backgrounds.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Some of the kerneldoc comments in the dmaengine header describe
already removed structure members. Remove them.
Also add a short description for dma_device->device_is_tx_complete.
Signed-off-by: Johannes Weiner <jw@emlix.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Based on discussions on linux-audit, as per Steve Grubb's request
http://lkml.org/lkml/2009/2/6/269, the following changes were made:
- forced audit result to be either 0 or 1.
- made template names const
- Added new stand-alone message type: AUDIT_INTEGRITY_RULE
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
With the new system call defines we get this on uml:
arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'
Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask. However sys_sigprocmask is missing
because of this.
To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through severel levels.
This was pointed out by Al Viro.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I enabled all cgroup subsystems when compiling kernel, and then:
# mount -t cgroup -o net_cls xxx /mnt
# mkdir /mnt/0
This showed up immediately:
BUG: MAX_LOCKDEP_SUBCLASSES too low!
turning off the locking correctness validator.
It's caused by the cgroup hierarchy lock:
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
if (ss->root == root)
mutex_lock_nested(&ss->hierarchy_mutex, i);
}
Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but
MAX_LOCKDEP_SUBCLASSES is 8.
This patch uses different lockdep keys for different subsystems.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers: fix TIMER_ABSTIME for process wide cpu timers
timers: split process wide cpu clocks/timers, fix
x86: clean up hpet timer reinit
timers: split process wide cpu clocks/timers, remove spurious warning
timers: split process wide cpu clocks/timers
signal: re-add dead task accumulation stats.
x86: fix hpet timer reinit for x86_64
sched: fix nohz load balancer on cpu offline
Ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.
Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().
The fix follows a proposal from Oleg Nesterov.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.
This fixes a flood of warnings reported by Mike Galbraith.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andrew had some suggestions for the latencytop file; this patch takes care
of most of these:
* Add documentation
* Turn account_scheduler_latency into an inline function
* Don't report negative values to userspace
* Make the file operations struct const
* Fix a few checkpatch.pl warnings
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Based on comments from Mike Frysinger and Randy Dunlap:
(http://lkml.org/lkml/2009/2/9/262)
- moved ima.h include before CONFIG_SHMEM test to fix compiler error
on Blackfin:
mm/shmem.c: In function 'shmem_zero_setup':
mm/shmem.c:2670: error: implicit declaration of function 'ima_shm_check'
- added 'struct linux_binprm' in ima.h to fix compiler warning on Blackfin:
In file included from mm/shmem.c:32:
include/linux/ima.h:25: warning: 'struct linux_binprm' declared inside
parameter list
include/linux/ima.h:25: warning: its scope is only this definition or
declaration, which is probably not what you want
- moved fs.h include within _LINUX_IMA_H definition
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: James Morris <jmorris@namei.org>
Using u32 in this header breaks the build of iptables.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix regression due to 5a6fe12595,
"Do not account for the address space used by hugetlbfs using VM_ACCOUNT"
which added an argument to the function hugetlb_file_setup() but not to
the macro hugetlb_file_setup().
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
bridge: Fix LRO crash with tun
IPv6: fix to set device name when new IPv6 over IPv6 tunnel device is created.
gianfar: Fix boot hangs while bringing up gianfar ethernet
netfilter: xt_sctp: sctp chunk mapping doesn't work
netfilter: ctnetlink: fix echo if not subscribed to any multicast group
netfilter: ctnetlink: allow changing NAT sequence adjustment in creation
netfilter: nf_conntrack_ipv6: don't track ICMPv6 negotiation message
netfilter: fix tuple inversion for Node information request
netxen: fix msi-x interrupt handling
de2104x: force correct order when writing to rx ring
tun: Fix unicast filter overflow
drivers/isdn: introduce missing kfree
drivers/atm: introduce missing kfree
sunhme: Don't match PCI devices in SBUS probe.
9p: fix endian issues [attempt 3]
net_dma: call dmaengine_get only if NET_DMA enabled
3c509: Fix resume from hibernation for PnP mode.
sungem: Soft lockup in sungem on Netra AC200 when switching interface up
RxRPC: Fix a potential NULL dereference
r8169: Don't update statistics counters when interface is down
...
When overcommit is disabled, the core VM accounts for pages used by anonymous
shared, private mappings and special mappings. It keeps track of VMAs that
should be accounted for with VM_ACCOUNT and VMAs that never had a reserve
with VM_NORESERVE.
Overcommit for hugetlbfs is much riskier than overcommit for base pages
due to contiguity requirements. It avoids overcommiting on both shared and
private mappings using reservation counters that are checked and updated
during mmap(). This ensures (within limits) that hugepages exist in the
future when faults occurs or it is too easy to applications to be SIGKILLed.
As hugetlbfs makes its own reservations of a different unit to the base page
size, VM_ACCOUNT should never be set. Even if the units were correct, we would
double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may
be set because an application can request no reserves be made for hugetlbfs
at the risk of getting killed later.
With commit fc8744adc8, VM_NORESERVE and
VM_ACCOUNT are getting unconditionally set for hugetlbfs-backed mappings. This
breaks the accounting for both the core VM and hugetlbfs, can trigger an
OOM storm when hugepage pools are too small lockups and corrupted counters
otherwise are used. This patch brings hugetlbfs more in line with how the
core VM treats VM_NORESERVE but prevents VM_ACCOUNT being set.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we race with commit code setting i_transaction to NULL, we could
possibly dereference it. Proper locking requires the journal pointer
(to access journal->j_list_lock), which we don't have. So we have to
change the prototype of the function so that filesystem passes us the
journal pointer. Also add a more detailed comment about why the
function jbd2_journal_begin_ordered_truncate() does what it does and
how it should be used.
Thanks to Dan Carpenter <error27@gmail.com> for pointing to the
suspitious code.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Joel Becker <joel.becker@oracle.com>
CC: linux-ext4@vger.kernel.org
CC: ocfs2-devel@oss.oracle.com
CC: mfasheh@suse.de
CC: Dan Carpenter <error27@gmail.com>
This kills of HAVE_ALLOC_SKB and HAVE_ALIGNABLE_SKB.
Nothing in-tree uses them and nothing in-tree has used them
since 2.0.x times.
Signed-off-by: David S. Miller <davem@davemloft.net>
ELF core dump is used for both user land core dump and kernel crash
dump. Depending on architecture, register might need to be accessed
differently for userland and kernel. Allow architectures to define
ELF_CORE_COPY_KERNEL_REGS() and use different operation for kernel
register dump.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Removed OSSlib stuff from linux/soundcard.h to fix the warnings for
'make headers_check'.
This patch breaks building against OSSlib with the kernel headers
instead of its own headers. It should still work with any
version of the library from the 2003 onwards which provide
their own headers for the latest interface.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This adds support for the SSB PMU.
A PMU is found on Low-Power devices.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In certain cases it is required to perform board specific actions
before activating libertas G-SPI interface. These actions may include
power up of the chip, GPIOs setup, proper pin-strapping and SPI
controller config.
This patch adds ability to call board specific setup/teardown methods
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This lets userspace request to get the currently set
regulatory domain.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since there is
nobody known to be waiting on it (arguably /fairly/ unfair locks).
Dummy it out to return 0 on other architectures.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
to prevent wrongly overwriting fixmap that still want to use.
ACPI used to rely on low mappings being all linearly mapped and
grew a habit: it never really unmapped certain kinds of tables
after use.
This can cause problems - for example the hypothetical case
when some spurious access still references it.
v2: remove prev_map and prev_size in __apci_map_table
v3: let acpi_os_unmap_memory() call early_iounmap too, so remove extral calling to
early_acpi_os_unmap_memory
v4: fix typo in one acpi_get_table_with_size calling
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When hardware detects any error with a descriptor from the invalidation
queue, it stops fetching new descriptors from the queue until software
clears the Invalidation Queue Error bit in the Fault Status register.
Following fix handles the IQE so the kernel won't be trapped in an
infinite loop.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Impact: saves sizeof(long) bytes per task_struct
By guaranteeing that sysctl_hung_task_timeout_secs have elapsed between
tasklist scans we can avoid using timestamps.
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Move the power tracer headers to trace/power.h to keep ftrace.h and power bits
more easy to maintain as separated topics.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: bug fix
IA-64 needs to put percpu data in the seperate section even on UP.
Fixes regression caused by "percpu: refactor percpu.h"
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch optimises the Ethernet header comparison to use 2-byte
and 4-byte xors instead of memcmp. In order to facilitate this,
the actual comparison is now carried out by the callers of the
shared dev_gro_receive function.
This has a significant impact when receiving 1500B packets through
10GbE.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch prepares for the move of the same_flow checks out of
dev_gro_receive. As such we need to remember the number of held
packets since doing a loop just to count them every time is silly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several devices need to insert some "pre headers" in front of the
main packet data when they transmit a packet.
Currently we allocate only 16 bytes of pad room and this ends up not
being enough for some types of hardware (NIU, usb-net, s390 qeth,
etc.)
So increase this to 32.
Note that drivers still need to check in their transmit routine
whether enough headroom exists, and if not use skb_realloc_headroom().
Tunneling, IPSEC, and other encapsulation methods can cause the
padding area to be used up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the async_*_special() functions to async_*_domain(), which
describes the purpose of these functions much better.
[Broke up long lines to silence checkpatch]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Impact: clean up
Fixed several typos in the comments.
Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up
Now that a generic in_nmi is available, this patch removes the
special code in the ring_buffer and implements the in_nmi generic
version instead.
With this change, I was also able to rename the "arch_ftrace_nmi_enter"
back to "ftrace_nmi_enter" and remove the code from the ring buffer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This code adds an in_nmi() macro that uses the current tasks preempt count
to track when it is in NMI context. Other parts of the kernel can
use this to determine if the context is in NMI context or not.
This code was inspired by the -rt patch in_nmi version that was
written by Peter Zijlstra, who borrowed that code from
Mathieu Desnoyers.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
tracing_off() is the fastest way to stop recording to the ring buffers.
This may be used in places like panic and die, just before the
ftrace_dump is called.
This patch adds the appropriate CPP conditionals to make it a stub
function when the ring buffer is not configured it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: prevent deadlock in NMI
The ring buffers are not yet totally lockless with writing to
the buffer. When a writer crosses a page, it grabs a per cpu spinlock
to protect against a reader. The spinlocks taken by a writer are not
to protect against other writers, since a writer can only write to
its own per cpu buffer. The spinlocks protect against readers that
can touch any cpu buffer. The writers are made to be reentrant
with the spinlocks disabling interrupts.
The problem arises when an NMI writes to the buffer, and that write
crosses a page boundary. If it grabs a spinlock, it can be racing
with another writer (since disabling interrupts does not protect
against NMIs) or with a reader on the same CPU. Luckily, most of the
users are not reentrant and protects against this issue. But if a
user of the ring buffer becomes reentrant (which is what the ring
buffers do allow), if the NMI also writes to the ring buffer then
we risk the chance of a deadlock.
This patch moves the ftrace_nmi_enter called by nmi_enter() to the
ring buffer code. It replaces the current ftrace_nmi_enter that is
used by arch specific code to arch_ftrace_nmi_enter and updates
the Kconfig to handle it.
When an NMI is called, it will set a per cpu variable in the ring buffer
code and will clear it when the NMI exits. If a write to the ring buffer
crosses page boundaries inside an NMI, a trylock is used on the spin
lock instead. If the spinlock fails to be acquired, then the entry
is discarded.
This bug appeared in the ftrace work in the RT tree, where event tracing
is reentrant. This workaround solved the deadlocks that appeared there.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI PM: make the PM core more careful with drivers using the new PM framework
PCI PM: Read power state from device after trying to change it on resume
PCI PM: Do not disable and enable bridges during suspend-resume
PCI: PCIe portdrv: Simplify suspend and resume
PCI PM: Fix saving of device state in pci_legacy_suspend
PCI PM: Check if the state has been saved before trying to restore it
PCI PM: Fix handling of devices without drivers
PCI: return error on failure to read PCI ROMs
PCI: properly clean up ASPM link state on device remove
Impact: fix spurious BUG_ON() triggered under load
module_refcount() isn't reliable outside stop_machine(), as demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
can give false results, for example).
Almost noone should be using __module_get, but that's another issue.
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>
--------------------
The commit 649274d993 ("net_dma:
acquire/release dma channels on ifup/ifdown") added unconditional call
of dmaengine_get() to net_dma. The API should be called only if
NET_DMA was enabled.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.
Oleg Nesterov observed:
Can't fastpath_timer_check()->thread_group_cputimer() have the
false warning too? Suppose we had the timer, then posix_cpu_timer_del()
removes this timer, but task_cputime_zero(&sig->cputime_expires) still
not true.
Remove the spurious debug warning.
Reported-by: Mike Galbraith <efault@gmx.de>
Explained-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unlike a normal socket path, the tuntap device send path does
not have any accounting. This means that the user-space sender
may be able to pin down arbitrary amounts of kernel memory by
continuing to send data to an end-point that is congested.
Even when this isn't an issue because of limited queueing at
most end points, this can also be a problem because its only
response to congestion is packet loss. That is, when those
local queues at the end-point fills up, the tuntap device will
start wasting system time because it will continue to send
data there which simply gets dropped straight away.
Of course one could argue that everybody should do congestion
control end-to-end, unfortunately there are people in this world
still hooked on UDP, and they don't appear to be going away
anywhere fast. In fact, we've always helped them by performing
accounting in our UDP code, the sole purpose of which is to
provide congestion feedback other than through packet loss.
This patch attempts to apply the same bandaid to the tuntap device.
It creates a pseudo-socket object which is used to account our
packets just as a normal socket does for UDP. Of course things
are a little complex because we're actually reinjecting traffic
back into the stack rather than out of the stack.
The stack complexities however should have been resolved by preceding
patches. So this one can simply start using skb_set_owner_w.
For now the accounting is essentially disabled by default for
backwards compatibility. In particular, we set the cap to INT_MAX.
This is so that existing applications don't get confused by the
sudden arrival EAGAIN errors.
In future we may wish (or be forced to) do this by default.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of calls to ima_path_check()/ima_file_free()
should be balanced. An extra call to fput(), indicates
the file could have been accessed without first being
measured.
Although f_count is incremented/decremented in places other
than fget/fput, like fget_light/fput_light and get_file, the
current task must already hold a file refcnt. The call to
__fput() is delayed until the refcnt becomes 0, resulting
in ima_file_free() flagging any changes.
- add hook to increment opencount for IPC shared memory(SYSV),
shmat files, and /dev/zero
- moved NULL iint test in opencount_get()
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
IMA provides hardware (TPM) based measurement and attestation for
file measurements. As the Trusted Computing (TPM) model requires,
IMA measures all files before they are accessed in any way (on the
integrity_bprm_check, integrity_path_check and integrity_file_mmap
hooks), and commits the measurements to the TPM. Once added to the
TPM, measurements can not be removed.
In addition, IMA maintains a list of these file measurements, which
can be used to validate the aggregate value stored in the TPM. The
TPM can sign these measurements, and thus the system can prove, to
itself and to a third party, the system's integrity in a way that
cannot be circumvented by malicious or compromised software.
- alloc ima_template_entry before calling ima_store_template()
- log ima_add_boot_aggregate() failure
- removed unused IMA_TEMPLATE_NAME_LEN
- replaced hard coded string length with #define name
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch replaces the generic integrity hooks, for which IMA registered
itself, with IMA integrity hooks in the appropriate places directly
in the fs directory.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Impact: cleanup
disable_ioapic_setup() in init/main.c is ugly as the function is
x86-specific. The #ifdef inline prototype there is ugly too.
Replace it with a generic arch_disable_smp_support() function - which
has a weak alias for non-x86 architectures and for non-ioapic x86 builds.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.
Interruptible waiters don't do that when aborting due to a signal. And if
an aborting waiter is concurrently woken up through the waitqueue, noone
will ever wake up the next waiter.
This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently. The aborted contender
didn't acquire the lock and therefor never did an unlock followed by
waking up the next waiter.
Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock. Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.
Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.
[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org> ["after some testing"]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avoid calling copy_from/to_user() with fb_info->lock mutex held in fbmem
ioctl().
fb_mmap() is called under mm->mmap_sem (A) held, that also acquires
fb_info->lock (B); fb_ioctl() takes fb_info->lock (B) and does
copy_from/to_user() that might acquire mm->mmap_sem (A), causing a
deadlock.
NOTE: it doesn't push down the fb_info->lock in each own driver's
fb_ioctl(), so there are still potential deadlocks elsewhere.
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The swap() macro is accidentally retuning the value of its first argument.
Change it into a doesn't-return-anything macro before someone goes and
relies upon this behaviour.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This pattern shows up frequently in the kernel:
static int once = 1;
...
if (once) {
once = 0;
printk(KERN_ERR "message\n");
}
...
So add a printk_once() helper macro that reduces this to a single line
of:
printk_once(KERN_ERR "message\n");
It works analogously to WARN_ONCE() & friends. (We use a macro not
an inline because vararg expansion in inlines looks awkward and the
macro is simple enough.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change the process wide cpu timers/clocks so that we:
1) don't mess up the kernel with too many threads,
2) don't have a per-cpu allocation for each process,
3) have no impact when not used.
In order to accomplish this we're going to split it into two parts:
- clocks; which can take all the time they want since they run
from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)
- timers; which need constant time sampling but since they're
explicity used, the user can pay the overhead.
The clock readout will go back to a full sum of the thread group, while the
timers will run of a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We're going to split the process wide cpu accounting into two parts:
- clocks; which can take all the time they want since they run
from user context.
- timers; which need constant time tracing but can affort the overhead
because they're default off -- and rare.
The clock readout will go back to a full sum of the thread group, for this
we need to re-add the exit stats that were removed in the initial itimer
rework (f06febc9: timers: fix itimer/many thread hang).
Furthermore, since that full sum can be rather slow for large thread groups
and we have the complete dead task stats, revert the do_notify_parent time
computation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Geert Uytterhoeven pointed out that we're not zeroing all the
memory when freeing a transform. This patch fixes it by calling
ksize to ensure that we zero everything in sight.
Reported-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.
The patch also emits a warnings if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.
Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
VLAN filtering allows the hypervisor to drop packets from VLANs
that we're not a part of, further reducing the number of extraneous
packets recieved. This makes use of the VLAN virtqueue command class.
The CTRL_VLAN feature bit tells us whether the backend supports VLAN
filtering.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the MAC control virtqueue class to support a MAC
filter table. The filter table is managed by the hypervisor.
We consider the table to be available if the CTRL_RX feature
bit is set. We leave it to the hypervisor to manage the table
and enable promiscuous or all-multi mode as necessary depending
on the resources available to it.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the RX_MODE control virtqueue class to enable the
set_rx_mode netdev interface. This allows us to selectively
enable/disable promiscuous and allmulti mode so we don't see
packets we don't want. For now, we automatically enable these
as needed if additional unicast or multicast addresses are
requested.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used for RX mode, MAC filter table, VLAN filtering, etc...
The control transaction consists of one or more "out" sg entries and
one or more "in" sg entries. The first out entry contains a header
defining the class and command. Additional out entries may provide
data for the command. The last in entry provides a status response
back from the command.
Virtqueues typically run asynchronous, running a callback function
when there's data in the channel. We can't readily make use of this
in the command paths where we need to use this. Instead, we kick
the virtqueue and spin. The kick causes an I/O write, triggering an
immediate trap into the hypervisor.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
libata: add no penalty retry request for EH device handling routines
libata: improve probe failure handling
libata: add @spd_limit to sata_down_spd_limit()
libata: clear dev->ering in smarter way
libata: check onlineness before using SPD in sata_down_spd_limit()
libata: move ata_dev_disable() to libata-eh.c
libata: fix EH device failure handling
sata_nv: ck804 has borked hardreset too
ide/libata: fix ata_id_is_cfa() (take 4)
libata: fix kernel-doc warnings
ahci: add a module parameter to ignore the SSS flags for async scanning
sata_mv: Fix chip type for Hightpoint RocketRaid 1740/1742
[libata] sata_sil: Fix compilation error with libata debugging enabled
Only REISERFS_IOC_* definitions are required for user space
rest should be in #ifdef __KERNEL__ as pointed by Arnd Bergmann.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
These are only for kernel internals as pointed by Arnd Bergmann:
struct nubus_board
struct nubus_dev
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
These are only for kernel internals as pointed by Arnd Bergmann:
struct kstatfs
struct venus_comm
coda_vcp()
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
The netlink connector uses its own workqueue to relay the datas sent
from userspace to the appropriate callback. If you launch the test
from Documentation/connector and change it a bit to send a high flow
of data, you will see thousands of events coming to the "cqueue"
workqueue by looking at the workqueue tracer.
This flow of events can be sent very quickly. So, to not encumber the
kevent workqueue and delay other jobs, the "cqueue" workqueue should
remain.
But this workqueue is pointless most of the time, it will always be
created (assuming you have built it of course) although only
developpers with specific needs will use it.
So avoid this "most of the time useless task", this patch proposes to
create this workqueue only when needed once. The first jobs to be
sent to connector callbacks will be sent to kevent while the "cqueue"
thread creation will be scheduled to kevent too.
The following jobs will continue to be scheduled to keventd until the
cqueue workqueue is created, and then the rest of the jobs will
continue to perform as usual, through this dedicated workqueue.
Each time I tested this patch, only the first event was sent to
keventd, the rest has been sent to cqueue which have been created
quickly.
Also, this patch fixes some trailing whitespaces on the connector files.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add kernel-doc notation for @lock:
include/linux/sched.h:457: No description found for parameter 'lock'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
3Gbps is often much more prone to transmission failures. It's usually
okay to let EH handle speed down after transmission failures but some
WD My Book drives completely shutdown after certain transmission
failures and after it only power cycling can revive them. Combined
with the fact that external drives often end up with cable assembly
which is longer than usual and more likely to have intervening gender,
this makes these drives very likely to shutdown under certain
configurations virtually rendering them unusable.
This patch implements HOARKGE_1_5_GBPS and applies it to WD My Book
such that 1.5Gbps is forced once the device is identified.
Please take a look at the following bz for related reports.
http://bugzilla.kernel.org/show_bug.cgi?id=9913
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
dev->ering used to be cleared together with the rest of ata_device in
ata_dev_init() which is called whenever a probing event occurs.
dev->ering is about to be used to track probing failures so it needs
to remain persistent over multiple porbing events. This patch
achieves this by doing the following.
* Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
clear between BEGIN and END. ering is moved after END. The split
of persistent area is to allow hotter items remain at the head.
* ering is explicitly cleared on ata_dev_disable() and when device
attach succeeds. So, ering is persistent throug a device's life
time (unless explicitly cleared of course) and also through periods
inbetween disablement of an attached device and successful detection
of the next one.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead the word 83; it also checks the ATA/PI
version support in the word 80 (which the CompactFlash specifications have as
reserved), this having no slightest chance to work on the modern CF cards that
don't have 0x848A in the word 0...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI hotplug: Change link order of pciehp & acpiphp
PCI hotplug: fakephp: Allocate PCI resources before adding the device
PCI MSI: Fix undefined shift by 32
PCI PM: Do not wait for buses in B2 or B3 during resume
PCI PM: Power up devices before restoring their state
PCI PM: Fix hibernation breakage on EeePC 701
PCI: irq and pci_ids patch for Intel Tigerpoint DeviceIDs
PCI PM: Fix suspend error paths and testing facility breakage
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: fix per cpu kmem_cache_cpu array memory leak
kmalloc: return NULL instead of link failure
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: add text file detailing queue/ sysfs files
bio.h: If they MUST be inlined, then use __always_inline
Fix misleading comment in bio.h
block: fix inconsistent parenthesisation of QUEUE_FLAG_DEFAULT
block: fix oops in blk_queue_io_stat()
Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
using a lot of memory.
Each 'struct module' contains an [NR_CPUS] array of full cache lines.
This patch uses existing infrastructure (percpu_modalloc() &
percpu_modfree()) to allocate percpu space for the refcount storage.
Instead of wasting NR_CPUS*128 bytes (on i386), we now use
nr_cpu_ids*sizeof(local_t) bytes.
On a typical distro, where NR_CPUS=8, shiping 2000 modules, we reduce
size of module files by about 2 Mbytes. (1Kb per module)
Instead of having all refcounters in the same memory node - with TLB misses
because of vmalloc() - this new implementation permits to have better
NUMA properties, since each CPU will use storage on its preferred node,
thanks to percpu storage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds internal kernel support for:
- reading/extending a pcr value
- looking up the tpm_chip for a given chip number
Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Reported by Andrew Walrond <andrew@walrond.org>
Changeset c19e654ddb
("gre: Add netlink interface") added an include
of linux/ip.h to linux/if_tunnel.h
We can't really let that get exposed to userspace
because this conflicts with types defined in netinet/ip.h
which userland is almost certainly going to have included
either explicitly or implicitly.
So guard this include with a __KERNEL__ ifdef.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows machines to use an OTG transceiver driver instead of
supplying a custom is_usb_online callback to check USB power.
Also, in the case that the OTG transceiver handles charger control when
connected to USB, a regulator named "ac_draw" can be supplied instead of
the custom set_charge callback to control the charger when connected to
AC.
The check for (transceiver->state == OTG_STATE_B_PERIPHERAL) in
otg_is_usb_online is probably too simple, I'm just using this with a
peripheral only device and gpio_vbus + bq24022. I'm not sure which other
OTG states can supply power.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
The 'pcf50633_mbc_set_status' function is unused, so remove it.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
The battery charger state machine switches into charging mode when
the battery voltage falls below 96% of a battery float voltage. But
the voltage drop in Li-ion batteries is marginal(1~2 %) till about
80% of its capacity - which means, after a BATFULL, charging won't
be restarted until 80%.
This work_struct function restarts charging at regular intervals to
make sure the battery doesn't discharge too much.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/reiserfs_fs.h:687: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:995: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:997: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1467: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1760: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1764: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1766: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1769: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1771: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1805: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1948: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1949: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1950: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1951: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1962: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1963: extern's make no sense in userspace
usr/include/linux/reiserfs_fs.h:1964: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/socket.h:29: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/nubus.h:297: extern's make no sense in userspace
usr/include/linux/nubus.h:299: extern's make no sense in userspace
usr/include/linux/nubus.h:303: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/in6.h:47: extern's make no sense in userspace
usr/include/linux/in6.h:49: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/coda_psdev.h:90: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
bvec_kmap_irq() and bvec_kunmap_irq() comments say they MUST be inlined,
so mark them as __always_inline.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The comment says "remember to add offset!", but the function already adds
it.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This adds another inet device option to enable gratuitous ARP
when device is brought up or address change. This is handy for
clusters or virtualization.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some platforms (for example pcm037) do not have an EEPROM fitted,
instead storing their mac address somewhere else. The bootloader
fetches this and configures the ethernet adapter before the kernel is
started.
This patch allows a platform to indicate to the driver via the
SMSC911X_SAVE_MAC_ADDRESS flag that the mac address has already been
configured via such a mechanism, and should be saved before resetting
the chip.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
On LAN9115/LAN9117/LAN9215/LAN9217, external phys are supported. These
are usually indicated by a hardware strap which sets an "external PHY
detected" bit in the HW_CFG register.
In some cases it is desirable to override this hardware strap and force
use of either the internal phy or an external PHY. This patch adds
SMSC911X_FORCE_INTERNAL_PHY and SMSC911X_FORCE_EXTERNAL_PHY flags so a
platform can indicate this preference via its platform_data.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimer: prevent negative expiry value after clock_was_set()
hrtimers: allow the hot-unplugging of all cpus
hrtimers: increase clock min delta threshold while interrupt hanging
The definition of spawn_softlockup_task in sched.h became
unnecessary once it was converted to the early_initcall()
interface.
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix CPU hotplug hang on Power6 testbox
On architectures that support offlining all cpus (at least powerpc/pseries),
hot-unpluging the tick_do_timer_cpu can result in a system hang.
This comes from the fact that if the cpu going down happens to be the
cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
cpu is dead (via the CPU_DEAD notification), we're left without ticks,
jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
That's particularly the case for the cpu looping in __cpu_die() waiting
for the dying cpu to be dead.
This patch addresses this by having the tick_do_timer_cpu handover happen
earlier during the CPU_DYING notification. For this, a new clockevent
notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
in hrtimer_cpu_notify().
Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix the following 'make headers_check' warning:
usr/include/linux/rtnetlink.h:328: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/nubus.h:232: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/virtio_net.h:28: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/virtio_console.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/virtio_blk.h:21: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/videodev.h:53: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/video_encoder.h:5: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/video_decoder.h:7: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/taskstats.h:44: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/synclink.h:209: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/sound.h:33: extern's make no sense in userspace
usr/include/linux/sound.h:34: extern's make no sense in userspace
usr/include/linux/sound.h:35: extern's make no sense in userspace
usr/include/linux/sound.h:36: extern's make no sense in userspace
usr/include/linux/sound.h:37: extern's make no sense in userspace
usr/include/linux/sound.h:39: extern's make no sense in userspace
usr/include/linux/sound.h:40: extern's make no sense in userspace
usr/include/linux/sound.h:41: extern's make no sense in userspace
usr/include/linux/sound.h:42: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/signalfd.h:19: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/random.h:39: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/ppp_defs.h:50: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/pkt_sched.h:32: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
linux/pkt_cls.h:122: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/phonet.h:50: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/nfs_idmap.h:55: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/neighbour.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/msdos_fs.h💯 found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/minix_fs.h:34: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/irda.h:127: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/ipx.h:13: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/ipv6_route.h:42: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/ipv6.h:26: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
include/linux/ip6_tunnel.h:21: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/inet_diag.h:16: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/igmp.h:31: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_tr.h:37: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_strip.h:22: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_ppp.h:96: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_link.h:9: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_hippi.h:82: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_fc.h:37: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_addrlabel.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/if_addr.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/icmpv6.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/hiddev.h:40: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/hid.h:69: extern's make no sense in userspace
usr/include/linux/hid.h:76: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/gfs2_ondisk.h:109: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/genetlink.h:12: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/errqueue.h:6: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/elf.h:379: extern's make no sense in userspace
usr/include/linux/elf.h:387: extern's make no sense in userspace
usr/include/linux/elf.h:401: extern's make no sense in userspace
usr/include/linux/elf.h:402: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/elf-fdpic.h:62: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/efs_fs_sb.h:49: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/edd.h:70: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/dn.h:75: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/dlm_plock.h:25: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/cgroupstats.h:31: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/cdrom.h:155: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/capability.h:73: extern's make no sense in userspace
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/blktrace_api.h:96: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/bfs_fs.h:24: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/auto_fs4.h:132: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/atmbr2684.h:88: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/atalk.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/aio_abi.h:58: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/usb/gadgetfs.h:21: include of <linux/types.h> is preferred over <asm/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/usb/cdc.h:50: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_ematch/tc_em_text.h:11: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_ematch/tc_em_nbyte.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_ematch/tc_em_meta.h:18: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_ematch/tc_em_cmp.h:8: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_act/tc_pedit.h:19: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_act/tc_mirred.h:16: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/tc_act/tc_gact.h:19: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/spi/spidev.h:83: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/raid/md_p.h:85: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/nfsd/syscall.h:12: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/nfsd/syscall.h:104: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Remove bogus BUG() check in ext4_bmap()
ext4: Fix building with EXT4FS_DEBUG
ext4: Initialize the new group descriptor when resizing the filesystem
ext4: Fix ext4_free_blocks() w/o a journal when files have indirect blocks
jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"
ext3: Add sanity check to make_indexed_dir
ext4: Add sanity check to make_indexed_dir
ext4: only use i_size_high for regular files
ext4: fix wrong use of do_div
fix the following 'make headers_check' warnings:
usr/include/linux/nfsd/nfsfh.h:17: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/nfsd/nfsfh.h:28: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/nfsd/export.h:13: include of <linux/types.h> is preferred over <asm/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/netfilter/xt_conntrack.h:40: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/dvb/video.h:29: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/dvb/video.h:102: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/dvb/net.h:27: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/dvb/net.h:31: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/dvb/frontend.h:29: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/dvb/frontend.h:76: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/dvb/dmx.h:27: include of <linux/types.h> is preferred over <asm/types.h>
usr/include/linux/dvb/dmx.h:90: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/dvb/audio.h:133: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warning:
usr/include/linux/can/bcm.h:29: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
This allows us to turn off disk stat accounting completely, for the cases
where the 0.5-1% reduction in system time is important.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
bsg.h in current form is perfectly suitable for user-mode
consumption. It is needed together with scsi/sg.h for applications
that want to interface with the bsg driver.
Currently the few projects that use it would copy it over into
the projects. But that is not acceptable for projects that need
to provide source and devel packages for distros.
This should also be submitted to stable 2.6.28 and 2.6.27 since bsg had
a stable API since these Kernels and distro users will need the header
for these kernels a swell
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
CC: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The existing functions for checking bio->bi_rw are badly named. So lets
mirror what we do for bio->bi_flags testing, use a properly named
function so that it's immediately obvious what is being tested.
Maintain compatability names for the old macros, eventually we'll get
rid of these.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
If we get an I/O error on a read request there is no point in doing a
verify pass on the integrity buffer. Adjust the completion path
accordingly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances. That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.
Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:
LOMEM MAX_WATCHES (per user)
512MB ~178000
1GB ~356000
2GB ~712000
A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us. No more max_user_instances
limits then.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
css_tryget() and cgroup_clear_css_refs() contain polling loops; these
loops should have cpu_relax calls in them to reduce cross-cache traffic.
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define kprobes related data structures even if CONFIG_KPROBES is not set.
This fixes compilation errors which occur if CONFIG_KPROBES is not set, in
kprobe using modules.
[akpm@linux-foundation.org: fix build for non-kprobes-supporting architectures]
Reviewed-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unfortunately simplicity isn't always the best. The fraginfo
interface turned out to be suboptimal. The problem was quite
obvious. For every packet, we have to copy the headers from
the frags structure into skb->head, even though for 99% of the
packets this part is immediately thrown away after the merge.
LRO didn't have this problem because it directly read the headers
from the frags structure.
This patch attempts to address this by creating an interface
that allows GRO to access the headers in the first frag without
having to copy it. Because all drivers that use frags place the
headers in the first frag this optimisation should be enough.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently VLAN still has a bit of common code handling the aftermath
of GRO that's shared with the common path. This patch moves them
into shared helpers to reduce code duplication.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add vendor ID for AMBIT and use it to set the ath5k LED gpio.
base.c:
Changes-licensed-under: 3-Clause-BSD
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@gmail.com>
Acked-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The separate Association Comeback Time IE was removed from IEEE 802.11w
and the Timeout Interval IE (from IEEE 802.11r) is used instead. The
editing on this is still somewhat incomplete in IEEE 802.11w/D7.0, but
still, the use of Timeout Interval IE is the expected mechanism.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A new nl80211 command, NL80211_CMD_SET_MGMT_EXTRA_IE, can be used to
add arbitrary IE data into the end of management frames. The interface
allows extra IEs to be configured for each management frame subtype, but
only some of them (ProbeReq, ProbeResp, Auth, (Re)AssocReq, Deauth,
Disassoc) are currently accepted in mac80211 implementation.
This makes it easier to implement IEEE 802.11 extensions like WPS and
FT that add IE(s) into some management frames. In addition, this can
be useful for testing and experimentation purposes.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On the AR913x SOCs we have to provide EEPROM contents via platform_data,
because accessing the flash via MMIO is not safe. Additionally different
boards may store the radio calibration data at different locations.
Changes-licensed-under: ISC
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: Imre Kaloz <kaloz@openwrt.org>
Tested-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Couple of '_ATTR's were missing and SEC_CHAN_OFFSET to CHANNEL_TYPE
rename was missed in couple of places.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add initial support for libertas devices using a GSPI interface. This has
been tested with the 8686.
GSPI is intended to be used on embedded systems. Board-specific parameters are
required (see libertas_spi.h).
Thanks to everyone who took a look at the earlier versions of the patch.
Signed-off-by: Colin McCabe <colin@cozybit.com>
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When MFP is enabled, the AP does not allow a STA to associate if an
existing security association exists without first going through SA
Query process. When this happens, the association request is denied
with a new status code ("temporarily rejected") ans Association
Comeback IE is used to notify when the association may be tried again
(i.e., when the SA Query procedure has timed out).
Use the comeback time to update the mac80211 client MLME timer for
next association attempt to minimize waiting time if association is
temporarily rejected.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Process SA Query Requests for client mode in mac80211. AP side
processing of SA Query Response frames is in user space (hostapd).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new WEXT IW_AUTH_* parameter for setting MFP
disabled/optional/required.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added new SIOCSIWENCODEEXT algorithm for configuring BIP (AES-CMAC)
keys (IGTK).
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add a new IW_AUTH parameter for setting cipher suite for
multicast/broadcast management frames. This is for full-mac drivers
that take care of RSN IE generation for (re)association request frames.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add mechanism for managing BIP keys (IGTK) and integrate BIP into the
TX/RX paths.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Implement Broadcast/Multicast Integrity Protocol for management frame
protection. This patch adds the needed definitions for the new
information element (MMIE) and implementation for the new "encryption"
type (though, BIP is actually not encrypting data, it provides only
integrity protection). These routines will be used by a follow-on patch
that enables BIP for multicast/broadcast robust management frames.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Extend CCMP to support encryption and decryption of unicast management
frames.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add flags for setting STA entries and struct ieee80211_if_sta to
indicate whether management frame protection (MFP) is used.
Signed-off-by: Jouni Malinen <j@w1.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add vendor ID for Foxconn and use it to set the ath5k LED gpio and
polarity for Acer branded laptops.
base.c:
Changes-licensed-under: 3-Clause-BSD
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Tested-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds detection code for the LP-PHY and SPROM
extraction code for version 8, which is needed by the LP-PHY and
newer N-PHY.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added mappings for FHSS, DSSS and OFDM channels - with macros to point
HR DSSS and ERP to the DSSS mappings. Currently just static inline
functions.
Use the new functions in the older fullmac drivers. This eliminates a
number of const static buffers and removes a couple of range checks that
are now redundant.
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Richard Farina <sidhayn@gmail.com>
Acked-by: Jeroen Vreeken <pe1rxq@amsat.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The dma ops unification enables X86 and IA64 to share intel_dma_ops so
we can make dma mapping functions static. This also remove unused
intel_map_single().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
SuperH CMT clockevent driver.
Both 16-bit and 32-bit CMT versions are supported, but only 32-bit
is tested. This driver contains support for both clockevents and
clocksources, but no unregistration is supported at this point.
Works fine as clock source and/or event in periodic or oneshot mode.
Tested on sh7722 and sh7723, but should work with any cpu/architecture.
This version is lacking clocksource and early platform driver support
for now - this to minimize the amount of dependencies.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds support for PCI-Express controllers as found on the
newer MPC83xx chips.
The work is loosely based on the Tony Li's patch[1], but unlike the
original patch, this patch implements sliding window for the Type 1
transactions using outbound window translations, so we don't have to
ioremap the whole PCI-E configuration space.
[1] http://ozlabs.org/pipermail/linuxppc-dev/2008-January/049028.html
Signed-off-by: Tony Li <tony.li@freescale.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Commit d7b1956fed ("DMI: Introduce
dmi_first_match to make the interface more flexible") introduced compile
errors like the following when !CONFIG_DMI
drivers/ata/sata_sil.c: In function 'sil_broken_system_poweroff':
drivers/ata/sata_sil.c:713: error: implicit declaration of function 'dmi_first_match'
drivers/ata/sata_sil.c:713: warning: initialization makes pointer from integer without a cast
We just need a dummy version of dmi_first_match() to fix this all up.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The idea is that drivers which implement multiqueue RX
pre-seed the SKB by recording the RX queue selected by
the hardware.
If such a seed is found on TX, we'll use that to select
the outgoing TX queue.
This helps get more consistent load balancing on router
and firewall loads.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (36 commits)
USB: Driver for Freescale QUICC Engine USB Host Controller
USB: option: add QUANTA HSDPA Data Card device ids
USB: storage: Add another unusual_dev for off-by-one bug
USB: unusual_dev: usb-storage needs to ignore a device
USB: GADGET: fix !x & y
USB: new id for ti_usb_3410_5052 driver
USB: cdc-acm: Add another conexant modem to the quirks
USB: 'option' driver - onda device MT503HS has wrong id
USB: Remove ZTE modem from unusual_devices
USB: storage: support of Dane-Elec MediaTouch USB device
USB: usbmon: Implement compat_ioctl
USB: add kernel-doc for wusb_dev in struct usb_device
USB: ftdi_sio driver support of bar code scanner from Diebold
USB: ftdi_sio: added Alti-2 VID and Neptune 3 PID
USB: cp2101 device
USB: usblp.c: add USBLP_QUIRK_BIDIR to Brother HL-1440
USB: remove vernier labpro from ldusb
USB: CDC-ACM quirk for MTK GPS
USB: cdc-acm: support some gps data loggers
USB: composite: Fix bug: low byte of w_index is the usb interface number not the whole 2 bytes of w_index
...
Reported by Randy Dunlap from a warning on the v2.6.29 merge window.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
gcc 3.4.6 doesn't like MODULE_DEVICE_TABLE(dmi, x) expansion enough to
error out. Shut it up in a most simple way.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SLAB kmalloc with a constant value isn't consistent with the other
implementations because it bails out with __you_cannot_kmalloc_that_much
rather than returning NULL and properly allowing the caller to fall back
to vmalloc or take other action. This doesn't happen with a non-constant
value or with SLOB or SLUB.
Starting with 2.6.28, I've been seeing build failures on s390x. This is
due to init_section_page_cgroup trying to allocate 2.5MB when the max size
for a kmalloc on s390x is 2MB.
It's failing because the value is constant. The workarounds at the call
size are ugly and the caller shouldn't have to change behavior depending
on what the backend of the API is.
So, this patch eliminates the link failure and returns NULL like the other
implementations.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* 'hibern_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
SATA PIIX: Blacklist system that spins off disks during ACPI power off
SATA Sil: Blacklist system that spins off disks during ACPI power off
SATA AHCI: Blacklist system that spins off disks during ACPI power off
SATA: Blacklisting of systems that spin off disks during ACPI power off
DMI: Introduce dmi_first_match to make the interface more flexible
Hibernation: Introduce system_entering_hibernation
Introduce new libata flags ATA_FLAG_NO_POWEROFF_SPINDOWN and
ATA_FLAG_NO_HIBERNATE_SPINDOWN that, if set, will prevent disks from
being spun off during system power off and hibernation, respectively
(to handle the hibernation case we need the new system state
SYSTEM_HIBERNATE_ENTER that can be checked against by libata, in
analogy with SYSTEM_POWER_OFF).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some notebooks from HP have the problem that their BIOSes attempt to
spin down hard drives before entering ACPI system states S4 and S5.
This leads to a yo-yo effect during system power-off shutdown and the
last phase of hibernation when the disk is first spun down by the
kernel and then almost immediately turned on and off by the BIOS.
This, in turn, may result in shortening the disk's life times.
To prevent this from happening we can blacklist the affected systems
using DMI information. However, only the on-board controlles should
be blacklisted and their PCI slot numbers can be used for this
purpose. Unfortunately the existing interface for checking DMI
information of the system is not very convenient for this purpose,
because to use it, we would have to define special callback functions
or create a separate struct dmi_system_id table for each blacklisted
system.
To overcome this difficulty introduce a new function
dmi_first_match() returning a pointer to the first entry in an array
of struct dmi_system_id elements that matches the system DMI
information. Then, we can use this pointer to access the entry's
.driver_data field containing the additional information, such as
the PCI slot number, allowing us to do the desired blacklisting.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduce boolean function system_entering_hibernation() returning
'true' during the last phase of hibernation, in which devices are
being put into low power states and the sleep state (for example,
ACPI S4) is finally entered.
Some device drivers need such a function to check if the system is
in the final phase of hibernation. In particular, some SATA drivers
are going to use it for blacklisting systems in which the disks
should not be spun down during the last phase of hibernation (the
BIOS will do that anyway).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduced by 8adb711f36 ("debugfs:
introduce stub for debugfs_create_size_t() when DEBUG_FS=n") and due to
a simple missing "static inline".
Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Warn on deprecated binding model use
eeprom: More consistent symbol names
eeprom: Move 93cx6 eeprom driver to /drivers/misc/eeprom
spi: Move at25 (for SPI eeproms) to /drivers/misc/eeprom
i2c: Move old eeprom driver to /drivers/misc/eeprom
i2c: Move at24 to drivers/misc/eeprom
i2c: Quilt tree has moved
i2c: Delete many unused adapter IDs
i2c: Delete 10 unused driver IDs
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (92 commits)
gianfar: Revive VLAN support
vlan: Export symbols as non GPL symbols.
bnx2x: tx_has_work should not wait for FW
netxen: reduce memory footprint
netxen: fix vlan tso/checksum offload
net: Fix linux/if_frad.h's suitability for userspace.
net: Move config NET_NS to from net/Kconfig to init/Kconfig
isdn: Fix missing ifdef in isdn_ppp
networking: document "nc" in addition to "netcat" in netconsole.txt
e1000e: workaround hw errata
af_key: initialize xfrm encap_oa
virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs
lcs: fix compilation for !CONFIG_IP_MULTICAST
rtl8187: Add termination packet to prevent stall
iwlwifi: fix rs_get_rate WARN_ON()
p54usb: fix packet loss with first generation devices
sctp: Fix another socket race during accept/peeloff
sctp: Properly timestamp outgoing data chunks for rtx purposes
sctp: Correctly start rtx timer on new packet transmissions.
sctp: Fix crc32c calculations on big-endian arhes.
...
The userspace interfaces are protected by CONFIG_* ifdefs
and that of course can't work.
Reported by Jaswinder Singh Rajput.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let the kernel developers know that i2c_attach_client() and
i2c_detach_client() are deprecated and should no longer be used.
Drivers using these should be converted to the standard device
driver binding model (probe and remove methods.)
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ben Dooks <ben-linux@fluff.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
klist.c: bit 0 in pointer can't be used as flag
debugfs: introduce stub for debugfs_create_size_t() when DEBUG_FS=n
sysfs: fix problems with binary files
PNP: fix broken pnp lowercasing for acpi module aliases
driver core: Convert '/' to '!' in dev_set_name()
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI hotplug: fix lock imbalance in pciehp
PCI PM: Restore standard config registers of all devices early
PCI/MSI: bugfix/utilize for msi_capability_init()
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
i.MX31: framebuffer driver
i.MX31: Image Processing Unit DMA and IRQ drivers
dmaengine: add async_tx_clear_ack() macro
dmaengine: dma_issue_pending_all == nop when CONFIG_DMA_ENGINE=n
dmaengine: kill some dubious WARN_ONCEs
fsldma: print correct IRQ on mpc83xx
fsldma: check for NO_IRQ in fsl_dma_chan_remove()
dmatest: Use custom map/unmap for destination buffer
fsldma: use a valid 'device' for dma_pool_create
dmaengine: fix dependency chaining
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
debugobjects: add and use INIT_WORK_ON_STACK
rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR
relay: fix lock imbalance in relay_late_setup_files
oprofile: fix uninitialized use of struct op_entry
rcu: move Kconfig menu
softlock: fix false panic which can occur if softlockup_thresh is reduced
rcu: add __cpuinit to rcu_init_percpu_data()
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix inconsistent lock state on resume in hres_timers_resume
time-sched.c: tick_nohz_update_jiffies should be static
locking, hpet: annotate false positive warning
kernel/fork.c: unused variable 'ret'
itimers: remove the per-cpu-ish-ness
It supports VX855 and future chips whose IDE controller uses PCI ID 0x0571.
Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Reported by Stephen Rothwell.
Due to missing 'extern' in the com20020_netdev_ops declaration,
each file that includes linux/com20020.h gets another copy
defined in it's resulting object file.
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: fix to preempt trace triggering lockdep check_flag failure
In local_bh_disable, the use of add_preempt_count causes the
preempt tracer to start recording the time preemption is off.
But because it already modified the preempt_count to show
softirqs disabled, and before it called the lockdep code to
handle this, it causes a state that lockdep can not handle.
The preempt tracer will reset the ring buffer on start of a trace,
and the ring buffer reset code does a spin_lock_irqsave. This
calls into lockdep and lockdep will fail when it detects the
invalid state of having softirqs disabled but the internal
current->softirqs_enabled is still set.
The fix is to manually add the SOFTIRQ_OFFSET to preempt count
and call the preempt tracer code outside the lockdep critical
area.
Thanks to Peter Zijlstra for suggesting this solution.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This last patch makes the appropriate changes to use and propagate the
network namespace where needed in IPv4 multicast routing code.
This consists mainly in replacing all the remaining init_net occurences
with current netns pointer retrieved from sockets, net devices or
mfc_caches depending on the routines' contexts.
Some routines receive a new 'struct net' parameter to propagate the current
netns:
* vif_add/vif_delete
* ipmr_new_tunnel
* mroute_clean_tables
* ipmr_cache_find
* ipmr_cache_report
* ipmr_cache_unresolved
* ipmr_mfc_add/ipmr_mfc_delete
* ipmr_get_route
* rt_fill_info (in route.c)
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch stores into struct mfc_cache the network namespace each
mfc_cache belongs to. The new member is mfc_net.
mfc_net is assigned at cache allocation and doesn't change during
the rest of the cache entry life.
A new net parameter is added to ipmr_cache_alloc/ipmr_cache_alloc_unres.
This will help to retrieve the current netns around the IPv4 multicast
routing code.
At the moment, all mfc_cache are allocated in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller suggested, related to a kstat_irqs related build breakage:
> Either linux/kernel_stat.h provides the kstat_incr_irqs_this_cpu
> interface or linux/irq.h does, not both.
So move them to kernel_stat.h.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Fix debugobjects warning
debugobject enabled kernels spit out a warning in hpet code due to a
workqueue which is initialized on stack.
Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it
in hpet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Each namespace contains ppp channels and units separately
with appropriate locks
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the host to inform us that the link is down by adding
a VIRTIO_NET_F_STATUS which indicates that device status is
available in virtio_net config.
This is currently useful for simulating link down conditions
(e.g. using proposed qemu 'set_link' monitor command) but
would also be needed if we were to support device assignment
via virtio.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking)
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch takes care of initialising and type-checking sysctls
related to feature negotiation. Type checking is important since some
of the sysctls now directly impact the feature-negotiation process.
The sysctls are initialised with the known default values for each
feature. For the type-checking the value constraints from RFC 4340
are used:
* Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes),
tested and confirmed that it works up to 4294967295 - for Gbps speed;
* Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer);
* CCIDs are between 0 .. 255;
* request_retries, retries1, retries2 also between 0..255 for good measure;
* tx_qlen is checked to be non-negative;
* sync_ratelimit remains as before.
Notes:
------
1. Die s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c.
2. As pointed out by Arnaldo, the pattern of type-checking repeats itself in
other places, sometimes with exactly the same kind of definitions (e.g.
"static int zero;"). It may be a good idea (kernel janitors?) to consolidate
type checking. For the sake of keeping the changeset small and in order not
to affect other subsystems, I have not strived to generalise here.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds full support for local/remote Sequence Window feature, from which the
* sequence-number-validity (W) and
* acknowledgment-number-validity (W') windows
derive as specified in RFC 4340, 7.5.3.
Specifically, the following is contained in this patch:
* integrated new socket fields into dccp_sk;
* updated the update_gsr/gss routines with regard to these fields;
* updated handler code: the Sequence Window feature is located at the TX side,
so the local feature is meant if the handler-rx flag is false;
* the initialisation of `rcv_wnd' in reqsk is removed, since
- rcv_wnd is not used by the code anywhere;
- sequence number checks are not done in the LISTEN state (cf. 7.5.3);
- dccp_check_req checks the Ack number validity more rigorously;
* the `struct dccp_minisock' became empty and is now removed.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
This initialises feature negotiation from two tables, which are in
turn are initialised from sysctls.
As a novel feature, specifics of the implementation (e.g. that short
seqnos and ECN are not yet available) are advertised for robustness.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of making up a name for the device ids, put them directly in the
device id table. Also move the vendor id to pci_ids.h.
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also remove unneeded last_rx update from Synclink drivers.
Synclink part mostly by Stephen Hemminger.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pre-existing network_device_stats inside network_device rather than own
private structure.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improve usbnet's devdbg to always type-check diagnostic arguments,
like dev_dbg (device.h). This makes no change to the resulting size of
usbnet modules.
This patch also removes an #ifdef DEBUG directive from rndis_wlan so
it's devdbg statements are always type-checked at compile time.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit a1ed5b0cff
(klist: don't iterate over deleted entries) introduces use of the
low bit in a pointer to indicate if the knode is dead or not,
assuming that this bit is always free.
This is not true for all architectures, CRIS for example may align data
on byte borders.
The result is a bunch of warnings on bootup, devices not being
added correctly etc, reported by Hinko Kocevar <hinko.kocevar@cetrtapot.si>:
------------[ cut here ]------------
WARNING: at lib/klist.c:62 ()
Modules linked in:
Stack from c1fe1cf0:
c01cc7f4 c1fe1d11 c000eb4e c000e4de 00000000 00000000 c1f4f78f c1f50c2d
c01d008c c1fdd1a0 c1fdd1a0 c1fe1d38 c0192954 c1fe0000 00000000 c1fe1dc0
00000002 7fffffff c1fe1da8 c0192d50 c1fe1dc0 00000002 7fffffff c1ff9fcc
Call Trace: [<c000eb4e>] [<c000e4de>] [<c0192954>] [<c0192d50>] [<c001d49e>] [<c000b688>] [<c0192a3c>]
[<c000b63e>] [<c000b63e>] [<c001a542>] [<c00b55b0>] [<c00411c0>] [<c00b559c>] [<c01918e6>] [<c0191988>]
[<c01919d0>] [<c00cd9c8>] [<c00cdd6a>] [<c0034178>] [<c000409a>] [<c0015576>] [<c0029130>] [<c0029078>]
[<c0029170>] [<c0012336>] [<c00b4076>] [<c00b4770>] [<c006d6e4>] [<c006d974>] [<c006dca0>] [<c0028d6c>]
[<c0028e12>] [<c0006424>] <4>---[ end trace 4eaa2a86a8e2da22 ]---
------------[ cut here ]------------
Repeat ad nauseam.
Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote:
> Perhaps using a pointerhackalign trick on this structure where
> #define pointerhackalign(x) __attribute__ ((aligned (x)))
> and declare
> struct klist_node {
> ...
> } pointerhackalign(2);
>
> Because __attribute__ ((aligned (x))) could only increase alignment
> it will safe to do that and serve as documentation purpose :)
That works, but we need to do it not for the struct klist_node,
but for the struct we insert into the void * in klist_node,
which is struct klist.
Reported-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si
Cc: Bastien ROUCARIES <roucaries.bastien@gmail.com>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Toralf Förster <toralf.foerster@gmx.de> reported a build failure in
the WiMAX stack when CONFIG_DEBUG_FS=n
http://linuxwimax.org/pipermail/wimax/2009-January/000449.html
This is due to debugfs_create_size_t() missing an stub that returns
-ENODEV when the DEBUGFS subsystem is not configured in (like the rest
of the debugfs API).
This patch adds said stub.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dump the branch trace on an oops (based on ftrace_dump_on_oops).
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To complete the DMA_CTRL_ACK handling API add a async_tx_clear_ack() macro.
Signed-off-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The device list will always be empty in this configuration, so no need
to walk the list.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There is a problem in our handling of suspend-resume of PCI devices that
many of them have their standard config registers restored with
interrupts enabled and they are put into the full power state with
interrupts enabled as well. This may lead to the following scenario:
* an interrupt vector is shared between two or more devices
* one device is resumed earlier and generates an interrupt
* the interrupt handler of another device tries to handle it and
attempts to access the device the config space of which hasn't been
restored yet and/or which still is in a low power state
* the system crashes as a result
To prevent this from happening we should restore the standard
configuration registers of all devices with interrupts disabled and we
should put them into the D0 power state right after that.
Unfortunately, this cannot be done using the existing
pci_set_power_state(), because it can sleep. Also, to do it we have to
make sure that the config spaces of all devices were actually saved
during suspend.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We implement dqget() and dqput() that need neither dqonoff_mutex nor dqptr_sem.
Then move dqget() and dqput() calls so that they are not called from under
dqptr_sem. This is important because filesystem callbacks aren't called from
under dqptr_sem which used to cause *lots* of problems with lock ranking
(and with OCFS2 they became close to unsolvable).
The patch also removes two functions which were introduced solely because OCFS2
needed them to cope with the old locking scheme. As time showed, they were not
enough for OCFS2 anyway and it would be unnecessary work to adapt them to the
new locking scheme in which they aren't needed. As a result OCFS2 needs the
following patch to compile properly with quotas. Sorry to any bisecters which
hit this in advance.
Signed-off-by: Jan Kara <jack@suse.cz>
Otherwise it can be very confusing to find a "EXT3-fs: " failure in
the middle of EXT4-fs failures, and it makes it harder to track the
source of the failure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Provide a shared interrupt debug facility under CONFIG_DEBUG_SHIRQ:
it uses the existing irqpoll facilities to iterate through all
registered interrupt handlers and call those which can handle shared
IRQ lines.
This can be handy for suspend/resume debugging: if we call this function
early during resume we can trigger crashes in those drivers which have
incorrect assumptions about when exactly their ISRs will be called
during suspend/resume.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_fsl: Return non-zero on error in probe()
drivers/ata/pata_ali.c: s/isa_bridge/ali_isa_bridge/ to fix alpha build
libata: New driver for OCTEON SOC Compact Flash interface (v7).
libata: Add another column to the ata_timing table.
sata_via: Add VT8261 support
pata_atiixp: update port enabledness test handling
[libata] get-identity ioctl: Fix use of invalid memory pointer
The forthcoming OCTEON SOC Compact Flash driver needs an additional
timing value that was not available in the ata_timing table. I add a
new column for dmack_hold time. The values were obtained from the
Compact Flash specification Rev 4.1.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
for SAS drivers.
Caught by Ke Wei (and team?) at Marvell.
Also, move the ata_scsi_ioctl export to libata-scsi.c, as that seems to be the
general trend.
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Decoupling allows:
* hung tasks check to happen at very low priority
* hung tasks check and softlockup to be enabled/disabled independently
at compile and/or run-time
* individual panic settings to be enabled disabled independently
at compile and/or run-time
* softlockup threshold to be reduced without increasing hung tasks
poll frequency (hung task check is expensive relative to softlock watchdog)
* hung task check to be zero over-head when disabled at run-time
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephen Rothwell reported this linux-next build failure with !CONFIG_SCHEDSTATS:
| In file included from kernel/sched.c:1703:
| kernel/sched_fair.c: In function 'adaptive_gran':
| kernel/sched_fair.c:1324: error: 'struct sched_entity' has no member named 'avg_wakeup'
The start_runtime and avg_wakeup metrics are now not just for statistics,
but also for scheduling - so they always need to be available. (Also
move out the nr_migrations fields - for future perfcounters usage.)
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (95 commits)
b44: GFP_DMA skb should not escape from driver
korina: do not use IRQF_SHARED with IRQF_DISABLED
korina: do not stop queue here
korina: fix handling tx_chain_tail
korina: do tx at the right position
korina: do schedule napi after testing for it
korina: rework korina_rx() for use with napi
korina: disable napi on close and restart
korina: reset resource buffer size to 1536
korina: fix usage of driver_data
bnx2x: First slow path interrupt race
bnx2x: MTU Filter
bnx2x: Indirection table initialization index
bnx2x: Missing brackets
bnx2x: Fixing the doorbell size
bnx2x: Endianness issues
bnx2x: VLAN tagged packets without VLAN offload
bnx2x: Protecting the link change indication
bnx2x: Flow control updated before reporting the link
bnx2x: Missing mask when calculating flow control
...
Impact: fix 15 make headers_check warnings:
include of <linux/types.h> is preferred over <asm/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the standard magic.h for btrfs and squashfs.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix __request_region() parameter kernel-doc notation and parameter name:
Warning(linux-2.6.28-git10//kernel/resource.c:627): No description found for parameter 'flags'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix jbd header file kernel-doc notation:
Warning(linux-2.6.28-git13//include/linux/jbd.h:823): No description found for parameter 'j_average_commit_time'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce a new avg_wakeup statistic.
avg_wakeup is a measure of how frequently a task wakes up other tasks, it
represents the average time between wakeups, with a limit of avg_runtime
for when it doesn't wake up anybody.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds an init_dummy_netdev() function that gets a network device
structure (allocation and lifetime entirely under caller's control) and
initialize the minimum amount of fields so it can be used to schedule
NAPI polls without registering a full blown interface. This is to be
used by drivers that need to tie several hardware interfaces to a single
NAPI poll scheduler due to HW limitations.
It also updates the ibm_newemac driver to use that, this fixing the
oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add()
Symbol is exported GPL only a I don't think we want binary drivers doing
that sort of acrobatics (if we want them at all).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (29 commits)
powerpc/83xx: Move mcu_mpc8349emitx driver out of drivers/i2c/chips/
powerpc/83xx: Make serial ports work on MPC8315E-RDB w/ FSL U-Boots
powerpc/e500mc: Doorbells need to be taken w/exceptions disabled
powerpc: Enable PS3 options and QPACE in ppc64_defconfig
powerpc/powermac: Fix occasional SMP boot failure
powerpc/cacheinfo: Rename cache_dir per-cpu variable
hvc_console: Use kzalloc() instead of kmalloc() + memset()
hvc_console: Do not set low_latency when using interrupts
hvc_console: Call free_irq() only if request_irq() was successful
hvc_console: Change an mb() to smp_mb() and add some comments
powerpc: Cleanup from l64 to ll64 change: drivers/net
powerpc: Cleanup from l64 to ll64 change: drivers/char
powerpc: Cleanup from l64 to ll64 change: arch code
powerpc: Change u64/s64 to a long long integer type
powerpc/kexec: Check crash_base for relocatable kernel
powerpc: Make dummy section a valid note header
Xilinx: SPI: updated driver for device tree
drivers/of: Add the of_find_i2c_device_by_node function.
powerpc/xsysace: add compatible string for non-ipcore instance
powerpc/mpc52xx: remove dead code from GPIO driver
...
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[CVE-2009-0029] s390 specific system call wrappers
[CVE-2009-0029] System call wrappers part 33
[CVE-2009-0029] System call wrappers part 32
[CVE-2009-0029] System call wrappers part 31
[CVE-2009-0029] System call wrappers part 30
[CVE-2009-0029] System call wrappers part 29
[CVE-2009-0029] System call wrappers part 28
[CVE-2009-0029] System call wrappers part 27
[CVE-2009-0029] System call wrappers part 26
[CVE-2009-0029] System call wrappers part 25
[CVE-2009-0029] System call wrappers part 24
[CVE-2009-0029] System call wrappers part 23
[CVE-2009-0029] System call wrappers part 22
[CVE-2009-0029] System call wrappers part 21
[CVE-2009-0029] System call wrappers part 20
[CVE-2009-0029] System call wrappers part 19
[CVE-2009-0029] System call wrappers part 18
[CVE-2009-0029] System call wrappers part 17
[CVE-2009-0029] System call wrappers part 16
[CVE-2009-0029] System call wrappers part 15
...
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
IDE: fix sparse signed-ness errors with host->host_busy
ide: fix suspend regression
tx4938ide: Fix build error due to read_sff_dma_status moving
ide: remove unused CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ
sl82c105: remove dead code
via82cxxx: fix cable warning message
ide: can't use SSD/non-rotational queue flag for all CFA devices
it821x.c: use dev->revision instead of pci_read_config_byte
it821x: Add ultra_mask quirk for Vortex86SX
ide: fix accidental LOCKDEP breakage caused by local_irq_set() removal
The host_busy field in struct ide_host defaults to a
signed-long, where most arch's test_and_set_bit_*
macros use an unsigned long.
Change to using an unsigned long, which on ARM removes
the following sparse errors:
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
On Vortex86SX with IDE controller revision 0x11 ultra DMA must be
disabled. This patch was tested by DMP and seems to work.
It is a cleaned up version of their older Kernel patch:
http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz
Tested-by: Shawn Lin <shawn@dmp.com.tw>
Signed-off-by: Brandon Philips <bphilips@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Change mutex contention behaviour such that it will sometimes busy wait on
acquisition - moving its behaviour closer to that of spinlocks.
This concept got ported to mainline from the -rt tree, where it was originally
implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.
Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
gave a 345% boost for VFS scalability on my testbox:
# ./test-mutex-shm V 16 10 | grep "^avg ops"
avg ops/sec: 296604
# ./test-mutex-shm V 16 10 | grep "^avg ops"
avg ops/sec: 85870
The key criteria for the busy wait is that the lock owner has to be running on
a (different) cpu. The idea is that as long as the owner is running, there is a
fair chance it'll release the lock soon, and thus we'll be better off spinning
instead of blocking/scheduling.
Since regular mutexes (as opposed to rtmutexes) do not atomically track the
owner, we add the owner in a non-atomic fashion and deal with the races in
the slowpath.
Furthermore, to ease the testing of the performance impact of this new code,
there is means to disable this behaviour runtime (without having to reboot
the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
by issuing the following command:
# echo NO_OWNER_SPIN > /debug/sched_features
This command re-enables spinning again (this is also the default):
# echo OWNER_SPIN > /debug/sched_features
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem is that dropping the spinlock right before schedule is a voluntary
preemption point and can cause a schedule, right after which we schedule again.
Fix this inefficiency by keeping preemption disabled until we schedule, do this
by explicity disabling preemption and providing a schedule() variant that
assumes preemption is already disabled.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This assertion is incorrect for lockless pagecache. By definition if we
have an unpinned page that we are trying to take a speculative reference
to, it may become the tail of a compound page at any time (if it is
freed, then reallocated as a compound page).
It was still a valid assertion for the vmscan.c LRU isolation case, but
it doesn't seem incredibly helpful... if somebody wants it, they can
put it back directly where it applies in the vmscan code.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This enables the use of syscall wrappers to do proper sign extension
for 64-bit programs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
By selecting HAVE_SYSCALL_WRAPPERS architectures can activate
system call wrappers in order to sign extend system call arguments.
All architectures where the ABI defines that the caller of a function
has to perform sign extension probably need this.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Convert all system calls to return a long. This should be a NOP since all
converted types should have the same size anyway.
With the exception of sys_exit_group which returned void. But that doesn't
matter since the system call doesn't return.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Define FTRACE_ADDR. In IA64, a function pointer isn't a 'unsigned long' but a
'struct {unsigned long ip, unsigned long gp}'.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At run-time, if softlockup_thresh is changed to a much lower value,
touch_timestamp is likely to be much older than the new softlock_thresh.
This will cause a false softlockup to be detected. If softlockup_panic
is enabled, the system will panic.
The fix is to touch all watchdogs before changing softlockup_thresh.
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
reorder struct xt_match to remove 8 bytes of padding and make its size
128 bytes.
This saves a small amount of data space in each of the xt netfilter
modules and fits xt_match in one 128 byte cache line.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sparc64: Fix cpumask related build failure
smp_call_function_single(): be slightly less stupid, fix
smp_call_function_single(): be slightly less stupid
rcu: fix bug in rcutorture system-shutdown code
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
ucc_geth: use correct UCCE macros
net_dma: acquire/release dma channels on ifup/ifdown
cxgb3: Keep LRO off if disabled when interface is down
sfc: SFT9001: Fix condition for LNPGA power-off
dccp ccid-3: Fix RFC reference
smsc911x: register irq with device name, not driver name
smsc911x: fix smsc911x_reg_read compiler warning
forcedeth: napi schedule lock fix
net: fix section mismatch warnings in dccp/ccids/lib/tfrc.c
forcedeth: remove mgmt unit for mcp79 chipset
qlge: Remove dynamic alloc of rx ring control blocks.
qlge: Fix schedule while atomic issue.
qlge: Remove support for device ID 8000.
qlge: Get rid of split addresses in hardware control blocks.
qlge: Get rid of volatile usage for shadow register.
forcedeth: version bump and copyright
forcedeth: xmit lock fix
netdev: missing validate_address hooks
netdev: add missing set_mac_address hook
netdev: gianfar: add MII ioctl handler
...
* 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6:
Fix small typo
misdn: indentation and braces disagree - add braces
misdn: one handmade ARRAY_SIZE converted
drivers/isdn/hardware/mISDN: move a dereference below a NULL test
indentation & braces disagree - add braces
Make parameter debug writable
BUGFIX: used NULL pointer at ioctl(sk,IMGETDEVINFO,&devinfo) when devinfo.id not registered
warning: ignoring return value of 'device_register', declared with attribute
warn_unused_result
warning: ignoring return value of 'device_create_file', declared with
attribute warn_unused_result
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Alexander Beregalov reported that this warning is caused by the HPET code:
> hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
> hpet0: 3 comparators, 64-bit 14.318180 MHz counter
> ODEBUG: object is on stack, but not annotated
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:251 __debug_object_init+0x2a4/0x352()
> Bisected down to 26afe5f2fb
> (x86: HPET_MSI Initialise per-cpu HPET timers)
The commit is fine - but the on-stack workqueue entry needs annotation.
Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar wrote:
> All non-x86 architectures fail to build:
>
> In file included from /home/mingo/tip/include/linux/random.h:11,
> from /home/mingo/tip/include/linux/stackprotector.h:6,
> from /home/mingo/tip/init/main.c:17:
> /home/mingo/tip/include/linux/irqnr.h:26:63: error: asm/irq_vectors.h: No such file or directory
Do not include asm/irq_vectors.h in generic code - it's not available
on all architectures.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Reduce memory usage.
This is the second half of the changes to make the irq_desc_ptrs be
variable sized based on nr_cpu_ids. This is done by adding a new
"max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to
return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids.
This necessitated moving the define of MAX_IO_APICS to a separate
file (asm/apicnum.h) so it could be included without the baggage
of the other asm/apicdef.h declarations.
Signed-off-by: Mike Travis <travis@sgi.com>
Impact: fix bug where new irq_desc uses old cpumask pointers which are freed.
As Yinghai pointed out, init_copy_one_irq_desc() copies the old desc to
the new desc overwriting the cpumask pointers. Since the old_desc and
the cpumask pointers are freed, then memory corruption will occur if
these old pointers are used.
Move the allocation of these pointers to after the copy.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Impact: reduce stack usage, use new cpumask API.
This actually uses topology_core_cpumask() and
topology_thread_cpumask(), removing the only users of
topology_core_siblings() and topology_thread_siblings()
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: linux-net-drivers@solarflare.com
Impact: reduce memory usage, use new cpumask API.
Replace the affinity and pending_masks with cpumask_var_t's. This adds
to the significant size reduction done with the SPARSE_IRQS changes.
The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
in the include file so they can be inlined (and optimized out for the
!CONFIG_CPUMASKS_OFFSTACK case.) [Naming chosen to be consistent with
the other init*irq functions, as well as the backwards arg declaration
of "from, to" instead of the more common "to, from" standard.]
Includes a slight change to the declaration of struct irq_desc to embed
the pending_mask within ifdef(CONFIG_SMP) to be consistent with other
references, and some small changes to Xen.
Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: virtualization@lists.osdl.org
Cc: xen-devel@lists.xensource.com
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
The recent dmaengine rework removed the capability to remove dma device
driver modules while net_dma is active. Rather than notify
dmaengine-clients that channels are trying to be removed, we now rely on
clients to notify dmaengine when they no longer have a need for
channels. Teach net_dma to release channels by taking dmaengine
references at netdevice open and dropping references at netdevice close.
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The swiotlb_arch_range_needs_mapping() hook should take a physical
address rather than a virtual address in order to support highmem pages.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up sparseirq fallout on random.c
Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS
so we could some #ifdef later if all arch support genirq
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If you do
smp_call_function_single(expression-with-side-effects, ...)
then expression-with-side-effects never gets evaluated on UP builds.
As always, implementing it in C is the correct thing to do.
While we're there, uninline it for size and possible header dependency
reasons.
And create a new kernel/up.c, as a place in which to put
uniprocessor-specific code and storage. It should mirror kernel/smp.c.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Changes from V1:
- Removed support for suspend_enable & suspend_disable functions.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
What the PCF05633 calls as a 'GPIO' is much more than the GPIO in the linux
sense and there are only 4 of them - which means, the gpiolib is not used
here.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch adds basic support for the PCF50633 ADC. The subtractive mode
is not supported yet.
Since we don't have adc subsystem, it currently lives in drivers/mfd.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Acked-by: Jonathan Cameron <jonathan.cameron@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch implements the core of the PCF50633 driver. This core driver has
generic register read/write functions and does interrupt management for its
sub devices.
Signed-off-by: Balaji Rao <balajirrao@openmoko.org>
Cc: Andy Green <andy@openmoko.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch adds a per host flag that allows drivers to opt in into
having its busses scanned in parallel.
Drivers that do not set this flag get their ports scanned in
the "original" sequence.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>