Commit Graph

33311 Commits

Author SHA1 Message Date
Rusty Russell
0d21b0e347 module: add new state MODULE_STATE_UNFORMED.
You should never look at such a module, so it's excised from all paths
which traverse the modules list.

We add the state at the end, to avoid gratuitous ABI break (ksplice).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-12 13:27:05 +10:30
Stanislaw Gruszka
d07d7507bf net, wireless: overwrite default_ethtool_ops
Since:

commit 2c60db0370
Author: Eric Dumazet <edumazet@google.com>
Date:   Sun Sep 16 09:17:26 2012 +0000

    net: provide a default dev->ethtool_ops

wireless core does not correctly assign ethtool_ops.

After alloc_netdev*() call, some cfg80211 drivers provide they own
ethtool_ops, but some do not. For them, wireless core provide generic
cfg80211_ethtool_ops, which is assigned in NETDEV_REGISTER notify call:

        if (!dev->ethtool_ops)
                dev->ethtool_ops = &cfg80211_ethtool_ops;

But after Eric's commit, dev->ethtool_ops is no longer NULL (on cfg80211
drivers without custom ethtool_ops), but points to &default_ethtool_ops.

In order to fix the problem, provide function which will overwrite
default_ethtool_ops and use it by wireless core.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-11 15:55:48 -08:00
Michel Lespinasse
3cb7a56344 lib/rbtree.c: avoid the use of non-static __always_inline
lib/rbtree.c declared __rb_erase_color() as __always_inline void, and
then exported it with EXPORT_SYMBOL.

This was because __rb_erase_color() must be exported for augmented
rbtree users, but it must also be inlined into rb_erase() so that the
dummy callback can get optimized out of that call site.

(Actually with a modern compiler, none of the dummy callback functions
should even be generated as separate text functions).

The above usage is legal C, but it was unusual enough for some compilers
to warn about it.  This change makes things more explicit, with a static
__always_inline ____rb_erase_color function for use in rb_erase(), and a
separate non-inline __rb_erase_color function for use in
rb_erase_augmented call sites.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:56 -08:00
Mel Gorman
8fb74b9fb2 mm: compaction: partially revert capture of suitable high-order page
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket.  It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e ("mm: compaction: capture a suitable high-order page
immediately when it is made available").

The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed.  For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading
to a poor interaction with swap-over-NFS support causing the packets to
be dropped.  However, I identified a few more problems with the patch
including the fact that it can increase contention on zone->lock in some
cases which could result in async direct compaction being aborted early.

In retrospect the capture patch took the wrong approach.  What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way.  While the patch was
showing to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested.  This patch
partially reverts commit 1fb3f8ca0e ("mm: compaction: capture a
suitable high-order page immediately when it is made available").

Reported-and-tested-by: Eric Wong <normalperson@yhbt.net>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:56 -08:00
Mike Frysinger
c0a3a20b6c linux/audit.h: move ptrace.h include to kernel header
While the kernel internals want pt_regs (and so it includes
linux/ptrace.h), the user version of audit.h does not need it.  So move
the include out of the uapi version.

This avoids issues where people want the audit defines and userland
ptrace api.  Including both the kernel ptrace and the userland ptrace
headers can easily lead to failure.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:56 -08:00
Kees Cook
7b9205bd77 audit: create explicit AUDIT_SECCOMP event type
The seccomp path was using AUDIT_ANOM_ABEND from when seccomp mode 1
could only kill a process.  While we still want to make sure an audit
record is forced on a kill, this should use a separate record type since
seccomp mode 2 introduces other behaviors.

In the case of "handled" behaviors (process wasn't killed), only emit a
record if the process is under inspection.  This change also fixes
userspace examination of seccomp audit events, since it was considered
malformed due to missing fields of the AUDIT_ANOM_ABEND event type.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Julien Tinnes <jln@google.com>
Acked-by: Will Drewry <wad@chromium.org>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:55 -08:00
Jiri Kosina
1b963c81b1 lockdep, rwsem: provide down_write_nest_lock()
down_write_nest_lock() provides a means to annotate locking scenario
where an outer lock is guaranteed to serialize the order nested locks
are being acquired.

This is analogoue to already existing mutex_lock_nest_lock() and
spin_lock_nest_lock().

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:55 -08:00
David Decotigny
896f97ea95 lib: cpu_rmap: avoid flushing all workqueues
In some cases, free_irq_cpu_rmap() is called while holding a lock (eg
rtnl).  This can lead to deadlocks, because it invokes
flush_scheduled_work() which ends up waiting for whole system workqueue
to flush, but some pending works might try to acquire the lock we are
already holding.

This commit uses reference-counting to replace
irq_run_affinity_notifiers().  It also removes
irq_run_affinity_notifiers() altogether.

[akpm@linux-foundation.org: eliminate free_cpu_rmap, rename cpu_rmap_reclaim() to cpu_rmap_release(), propagate kref_put() retval from cpu_rmap_put()]
Signed-off-by: David Decotigny <decot@googlers.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:54 -08:00
Nathan Hintz
990debe2ca bcma: update pci configuration for bcm4706/bcm4716
Update the PCI configuration for BCM4706 and BCM4716 per the 2011
Broadcom SDK.

Signed-off-by: Nathan Hintz <nlhintz@hotmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-01-11 14:50:00 -05:00
Nathan Hintz
e2aa19fadd bcma: return the mips irq number in bcma_core_irq
The irq signal numbers that are send by the cpu are increased by 2 from
the number programmed into the mips core by bcma.
Return the irq number on which the irqs are send in bcma_core_irq() now.

Signed-off-by: Nathan Hintz <nlhintz@hotmail.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-01-11 14:49:59 -05:00
Alexander Duyck
024e9679a2 net: Add support for XPS without sysfs being defined
This patch makes it so that we can support transmit packet steering without
sysfs needing to be enabled.  The reason for making this change is to make
it so that a driver can make use of the XPS even while the sysfs portion of
the interface is not present.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10 22:47:04 -08:00
Alexander Duyck
537c00de1c net: Add functions netif_reset_xps_queue and netif_set_xps_queue
This patch adds two functions, netif_reset_xps_queue and
netif_set_xps_queue.  The main idea behind these two functions is to
provide a mechanism through which drivers can update their defaults in
regards to XPS.

Currently no such mechanism exists and as a result we cannot use XPS for
things such as ATR which would require a basic configuration to start in
which the Tx queues are mapped to CPUs via a 1:1 mapping.  With this change
I am making it possible for drivers such as ixgbe to be able to use the XPS
feature by controlling the default configuration.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10 22:47:03 -08:00
Alexander Duyck
416186fbf8 net: Split core bits of netdev_pick_tx into __netdev_pick_tx
This change splits the core bits of netdev_pick_tx into a separate function.
The main idea behind this is to make this code accessible to select queue
functions when they decide to process the standard path instead of their
own custom path in their select queue routine.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10 22:47:03 -08:00
Greg Kroah-Hartman
54b956b903 Remove __dev* markings from init.h
Now that all in-kernel users of __dev* are gone, let's remove them from
init.h to keep them from popping up again and again.

Thanks to Bill Pemberton for doing all of the hard work to make removal
of this possible.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-10 10:57:01 -08:00
Rafał Miłecki
dd4544f054 bgmac: driver for GBit MAC core on BCMA bus
BCMA is a Broadcom specific bus with devices AKA cores. All recent BCMA
based SoCs have gigabit ethernet provided by the GBit MAC core. This
patch adds driver for such a cores registering itself as a netdev. It
has been tested on a BCM4706 and BCM4718 chipsets.

In the kernel tree there is already b44 driver which has some common
things with bgmac, however there are many differences that has led to
the decision or writing a new driver:
1) GBit MAC cores appear on BCMA bus (not SSB as in case of b44)
2) There is 64bit DMA engine which differs from 32bit one
3) There is no CAM (Content Addressable Memory) in GBit MAC
4) We have 4 TX queues on GBit MAC devices (instead of 1)
5) Many registers have different addresses/values
6) RX header flags are also different

The driver in it's state is functional how, however there is of course
place for improvements:
1) Supporting more net_device_ops
2) SUpporting more ethtool_ops
3) Unaligned addressing in DMA
4) Writing separated PHY driver

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-09 23:37:03 -08:00
Marc Dionne
08c097fc3b cred: Remove tgcred pointer from struct cred
Commit 3a50597de8 ("KEYS: Make the session and process keyrings
per-thread") removed the definition of the thread_group_cred structure,
but left a now unused pointer in struct cred.

Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-09 08:26:53 -08:00
Cong Wang
b7394d2429 netpoll: prepare for ipv6
This patch adjusts some struct and functions, to prepare
for supporting IPv6.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-08 17:56:09 -08:00
Eric Dumazet
fda55eca5a net: introduce skb_transport_header_was_set()
We have skb_mac_header_was_set() helper to tell if mac_header
was set on a skb. We would like the same for transport_header.

__netif_receive_skb() doesn't reset the transport header if already
set by GRO layer.

Note that network stacks usually reset the transport header anyway,
after pulling the network header, so this change only allows
a followup patch to have more precise qdisc pkt_len computation
for GSO packets at ingress side.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-08 17:51:54 -08:00
Sascha Hauer
e909c682d0 [media] coda: Fix build due to iram.h rename
commit c045e3f13 (ARM: imx: include iram.h rather than mach/iram.h) changed the
location of iram.h, which causes the following build error when building the coda
driver:

drivers/media/platform/coda.c:27:23: error: mach/iram.h: No such file or directory
drivers/media/platform/coda.c: In function 'coda_probe':
drivers/media/platform/coda.c:2000: error: implicit declaration of function 'iram_alloc'
drivers/media/platform/coda.c:2001: warning: assignment makes pointer from integer without a cast
drivers/media/platform/coda.c: In function 'coda_remove':
drivers/media/platform/coda.c:2024: error: implicit declaration of function 'iram_free'

Since the content of iram.h is not imx specific, move it to
include/linux/platform_data/imx-iram.h instead. This is an intermediate solution
until the i.MX iram allocator is converted to the generic SRAM allocator.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-01-08 09:44:06 +01:00
Hauke Mehrtens
8f9dc85348 bcma: mips: remove assigned_irqs from structure
This member is not needed any more.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-01-07 15:18:29 -05:00
Linus Torvalds
d287b8750e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull namei.h missing include fix from Al Viro.

The new use of ESTALE in namei.h can cause compile failures on ARM with
certain configurations due to lack of errno.h.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  namei.h: include errno.h
2013-01-07 09:41:20 -08:00
Bartlomiej Zolnierkiewicz
a458431e17 mm: fix zone_watermark_ok_safe() accounting of isolated pages
Commit 702d1a6e07 ("memory-hotplug: fix kswapd looping forever
problem") added an isolated pageblocks counter (nr_pageblock_isolate in
struct zone) and used it to adjust free pages counter in
zone_watermark_ok_safe() to prevent kswapd looping forever problem.

Then later, commit 2139cbe627 ("cma: fix counting of isolated pages")
fixed accounting of isolated pages in global free pages counter.  It
made the previous zone_watermark_ok_safe() fix unnecessary and
potentially harmful (cause now isolated pages may be accounted twice
making free pages counter incorrect).

This patch removes the special isolated pageblocks counter altogether
which fixes zone_watermark_ok_safe() free pages check.

Reported-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-04 16:11:46 -08:00
Stanislav Kinsbursky
3a665531a3 selftests: IPC message queue copy feature test
This test can be used to check wheither kernel supports IPC message queue
copy and restore features (required by CRIU project).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-04 16:11:45 -08:00
Stanislav Kinsbursky
f9dd87f473 ipc: message queue receive cleanup
Move all message related manipulation into one function msg_fill().
Actually, two functions because of the compat one.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-04 16:11:45 -08:00
Stanislav Kinsbursky
03f5956680 ipc: add sysctl to specify desired next object id
Add 3 new variables and sysctls to tune them (by one "next_id" variable
for messages, semaphores and shared memory respectively).  This variable
can be used to set desired id for next allocated IPC object.  By default
it's equal to -1 and old behaviour is preserved.  If this variable is
non-negative, then desired idr will be extracted from it and used as a
start value to search for free IDR slot.

Notes:

1) this patch doesn't guarantee that the new object will have desired
   id.  So it's up to user space how to handle new object with wrong id.

2) After a sucessful id allocation attempt, "next_id" will be set back
   to -1 (if it was non-negative).

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-04 16:11:45 -08:00
Jiri Pirko
85464ef271 net: kill dev->master
Nobody uses this now. Remove it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04 13:31:51 -08:00
Jiri Pirko
8b98a70c28 net: remove no longer used netdev_set_bond_master() and netdev_set_master()
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04 13:31:50 -08:00
Jiri Pirko
9ff162a8b9 net: introduce upper device lists
This lists are supposed to serve for storing pointers to all upper devices.
Eventually it will replace dev->master pointer which is used for
bonding, bridge, team but it cannot be used for vlan, macvlan where
there might be multiple upper present. In case the upper link is
replacement for dev->master, it is marked with "master" flag.

New upper device list resolves this limitation. Also, the information
stored in lists is used for preventing looping setups like
"bond->somethingelse->samebond"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04 13:31:49 -08:00
Jiri Pirko
fbdeca2d77 net: add address assign type "SET"
This is the way to indicate that mac address of a device has been set by
dev_set_mac_address()

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03 22:37:36 -08:00
Jiri Pirko
e41b2d7fe7 net: set dev->addr_assign_type correctly
Not a bitfield, but a plain value.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03 22:37:35 -08:00
Greg Kroah-Hartman
e389623a68 include: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit from some include files that
were previously missed.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:16 -08:00
Greg Kroah-Hartman
0f58a01ddd Drivers: bcma: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, and __devexit
from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: "Rafał Miłecki" <zajec5@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:15 -08:00
Greg Kroah-Hartman
f568f6ca81 pstore: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit from the pstore filesystem.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:14 -08:00
Johannes Berg
ec61cd63dd mac80211: support HT notify channel width action
Support the HT notify channel width action frame
to update the rate scaling about the bandwidth
the peer can receive in.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03 13:01:44 +01:00
Johannes Berg
598a5938e0 wireless: use __packed in ieee80211.h
Use __packed instead of __attribute__((packed)).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03 13:01:43 +01:00
Vladimir Kondratiev
0f6dfcee2e wireless: more 'capability info' bits
define bits for 'capability info', as in recent spec edition
IEEE802.11-2012

Also, add mask for 2-bit field 'bss type', as it is in 802.11ad

Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-03 13:01:15 +01:00
Linus Torvalds
080a62e2ce PCI updates for v3.8:
PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
   PCI: Remove spurious error for sriov_numvfs store and simplify flow
   PCI: Add PCIe Link Capability link speed and width names
   PCI/PM: Do not suspend port if any subordinate device needs PME polling
   PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ5HnVAAoJEPGMOI97Hn6zprgP/1we4RhVaXLRnmLyc9OpS97Z
 9KZUU/rsMQ/RYstdNFV30JOypMFL1BlK7jpLR14gjSgCulKK9etvjBTwiV26Rfor
 n/LWru4CWUtGUH/2c4IwuN0FKxfU7W4GxuVfKi3uACh7yJRwKgxZhFKLLb4OZ/T0
 A1CiIktdpZhH5A8+WdoSkZSsfQPuUA6UVKQleEQh/qJl9qgxwEDLsdj5fIZLsFUB
 Fo3bbusq2X+pHU0uuBIzrheSUeSmxXzeZcte8JxTEEwB/Gdsn24lJ39MK5PHAaOE
 gSVC7HDi+vNCICZhi7H93musPczL1TqeyMZQWSa/rj7KV836kG+Phz61SmsXTxyR
 VpfnEZOx7GreErpBuLKrOVslXJl1TBc/ZiiLd5SBUlO4ZClAssPcevtUexCR3xr6
 eHoSYMtwblW/vgJ3rn/PD8SgksZVJsd6+JAlVbC53XAdeJuEheCdEU7HnQZ3ZQRF
 6wpWOBfIxdSQM4AukncNjUSTQpVjNFoEXNcPBCamazDz9NgRIcrnBAd/94+AVD0t
 WQpoU0HDP6h00pK8Ls3Fsv23qbfPDPP9i6zhSGlv5Q9Sz5T8b178j8h7tkUgtPy/
 vxAtwUgFwz5cxE053lrht0JEQUikv99VcUJKrQc17g6GIMenh4duXrxF1I1EERXD
 fcVZNas3SrnLSKfsVwws
 =Ehs+
 -----END PGP SIGNATURE-----

Merge tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Some fixes for v3.8.  They include a fix for the new SR-IOV sysfs
  management support, an expanded quirk for Ricoh SD card readers, a
  Stratus DMI quirk fix, and a PME polling fix."

* tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
  PCI/PM: Do not suspend port if any subordinate device needs PME polling
  PCI: Add PCIe Link Capability link speed and width names
  PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
  PCI: Remove spurious error for sriov_numvfs store and simplify flow
2013-01-02 17:44:29 -08:00
David Howells
3d33fcc11b UAPI: Remove empty Kbuild files
Empty files can get deleted by the patch program, so remove empty Kbuild
files and their links from the parent Kbuilds.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02 17:36:10 -08:00
Mel Gorman
42288fe366 mm: mempolicy: Convert shared_policy mutex to spinlock
Sasha was fuzzing with trinity and reported the following problem:

  BUG: sleeping function called from invalid context at kernel/mutex.c:269
  in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
  2 locks held by trinity-main/6361:
   #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0
   #1:  (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0
  Pid: 6361, comm: trinity-main Tainted: G        W
  3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74
  Call Trace:
    __might_sleep+0x1c3/0x1e0
    mutex_lock_nested+0x29/0x50
    mpol_shared_policy_lookup+0x2e/0x90
    shmem_get_policy+0x2e/0x30
    get_vma_policy+0x5a/0xa0
    mpol_misplaced+0x41/0x1d0
    handle_pte_fault+0x465/0x6a0

This was triggered by a different version of automatic NUMA balancing
but in theory the current version is vunerable to the same problem.

do_numa_page
  -> numa_migrate_prep
    -> mpol_misplaced
      -> get_vma_policy
        -> shmem_get_policy

It's very unlikely this will happen as shared pages are not marked
pte_numa -- see the page_mapcount() check in change_pte_range() -- but
it is possible.

To address this, this patch restores sp->lock as originally implemented
by Kosaki Motohiro.  In the path where get_vma_policy() is called, it
should not be calling sp_alloc() so it is not necessary to treat the PTL
specially.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02 17:32:13 -08:00
Linus Torvalds
5439ca6b8f Various bug fixes for ext4. Perhaps the most serious bug fixed is one
which could cause file system corruptions when performing file punch
 operations.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJQ374OAAoJENNvdpvBGATwEGAP/jKUwjQhBZiF0k9dg1kQ5eTz
 bdli4fy1vxrEMIOym8IZa4nBQJVCkArwRgjc28gCBD6k9u6X3GPa26vUydsoPfP6
 odPdc9c9HtsbYQGuaq1SohID5HfjxHewTcUmCs4X4SpGcSurUcT7eQYWqSuIxFHR
 0nKk8NO4EcWh2uqIoGPrc8QpSdor0DXXYYjZmHCeVLH1n6PyoMsnrFMfO9KqMLUL
 vNR54CX9n1GRTfAfJNkNzcwfs8IfNkDUyv5hFpDh15tLltogU0TqnlAl3vSeZGSx
 vVfhwHmQTK/bJyC3YaoRZqq9CQJVk2f/OTBpJDFY/USaapuitJd6vqbmh7NiRNAN
 LaKmFt99MPfwyjEhIA7+J0LCTraAxc536q43oWWK5dAJhWI7DW0lbHARVeQTixNy
 KJ1Lp0pmmz1mX8/lugOnK1SPBF525kTaoiz2bWqg4oQgn7mBzUlgj+EV22/6Rq83
 TpKOKstl4BiZi8t5AhmFiwqtknCDiT5vUKQNy2kuM/oXtPJID/lM/TJbR5viYD3l
 AH3Ef7xj61CynFZ0oBeraGwtXc2BHJpJdWz+8uj0/VhFfC+uNUYapSLFwyiAVZKO
 xxaItT3ylfKpa0AWK6HBc2SLuL72SCHAPks06YKFtSyHtr5C8SCcafxU2DSOSi7K
 VrhkcH6STa77Br7a1ORt
 =9R/D
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
 "Various bug fixes for ext4.  Perhaps the most serious bug fixed is one
  which could cause file system corruptions when performing file punch
  operations."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: avoid hang when mounting non-journal filesystems with orphan list
  ext4: lock i_mutex when truncating orphan inodes
  ext4: do not try to write superblock on ro remount w/o journal
  ext4: include journal blocks in df overhead calcs
  ext4: remove unaligned AIO warning printk
  ext4: fix an incorrect comment about i_mutex
  ext4: fix deadlock in journal_unmap_buffer()
  ext4: split off ext4_journalled_invalidatepage()
  jbd2: fix assertion failure in jbd2_journal_flush()
  ext4: check dioread_nolock on remount
  ext4: fix extent tree corruption caused by hole punch
2013-01-02 09:57:34 -08:00
Hugh Dickins
a7a88b2373 mempolicy: remove arg from mpol_parse_str, mpol_to_str
Remove the unused argument (formerly no_context) from mpol_parse_str()
and from mpol_to_str().

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02 09:27:10 -08:00
Eric Dumazet
2681128f0c veth: reduce stat overhead
veth stats are a bit bloated. There is no need to account transmit
and receive stats, since they are absolutely symmetric.

Also use a per device atomic64_t for the dropped counter, as it
should never be used in fast path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30 02:31:58 -08:00
Jiri Pirko
4bf84c35c6 net: add change_carrier netdev op
This allows a driver to register change_carrier callback which will be
called whenever user will like to change carrier state. This is useful
for devices like dummy, gre, team and so on.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28 15:24:18 -08:00
Linus Torvalds
ddf75ae34e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace fixes from Eric Biederman:
 "This tree includes two bug fixes for problems Oleg spotted on his
  review of the recent pid namespace work.  A small fix to not enable
  bottom halves with irqs disabled, and a trivial build fix for f2fs
  with user namespaces enabled."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  f2fs: Don't assign e_id in f2fs_acl_from_disk
  proc: Allow proc_free_inum to be called from any context
  pidns: Stop pid allocation when init dies
  pidns: Outlaw thread creation after unshare(CLONE_NEWPID)
2012-12-27 10:42:46 -08:00
Linus Torvalds
7fd83b47ce Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

1) GRE tunnel drivers don't set the transport header properly, they also
   blindly deref the inner protocol ipv4 and needs some checks.  Fixes
   from Isaku Yamahata.

2) Fix sleeps while atomic in netdevice rename code, from Eric Dumazet.

3) Fix double-spinlock in solos-pci driver, from Dan Carpenter.

4) More ARP bug fixes.  Fix lockdep splat in arp_solicit() and then the
   bug accidentally added by that fix.  From Eric Dumazet and Cong Wang.

5) Remove some __dev* annotations that slipped back in, as well as all
   HOTPLUG references.  From Greg KH

6) RDS protocol uses wrong interfaces to access scatter-gather elements,
   causing a regression.  From Mike Marciniszyn.

7) Fix build error in cpts driver, from Richard Cochran.

8) Fix arithmetic in packet scheduler, from Stefan Hasko.

9) Similarly, fix association during calculation of random backoff in
   batman-adv.  From Akinobu Mita.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
  ipv6/ip6_gre: set transport header correctly
  ipv4/ip_gre: set transport header correctly to gre header
  IB/rds: suppress incompatible protocol when version is known
  IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len
  net/vxlan: Use the underlying device index when joining/leaving multicast groups
  tcp: should drop incoming frames without ACK flag set
  netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled
  cpts: fix a run time warn_on.
  cpts: fix build error by removing useless code.
  batman-adv: fix random jitter calculation
  arp: fix a regression in arp_solicit()
  net: sched: integer overflow fix
  CONFIG_HOTPLUG removal from networking core
  Drivers: network: more __dev* removal
  bridge: call br_netpoll_disable in br_add_if
  ipv4: arp: fix a lockdep splat in arp_solicit()
  tuntap: dont use a private kmem_cache
  net: devnet_rename_seq should be a seqcount
  ip_gre: fix possible use after free
  ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally
  ...
2012-12-27 10:40:30 -08:00
Christoffer Dall
ad4b3fb7ff mm: Fix PageHead when !CONFIG_PAGEFLAGS_EXTENDED
Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and
(PageHead) is true, for tail pages.  If this is indeed the intended
behavior, which I doubt because it breaks cache cleaning on some ARM
systems, then the nomenclature is highly problematic.

This patch makes sure PageHead is only true for head pages and PageTail
is only true for tail pages, and neither is true for non-compound pages.

[ This buglet seems ancient - seems to have been introduced back in Apr
  2008 in commit 6a1e7f777f: "pageflags: convert to the use of new
  macros".  And the reason nobody noticed is because the PageHead()
  tests are almost all about just sanity-checking, and only used on
  pages that are actual page heads.  The fact that the old code returned
  true for tail pages too was thus not really noticeable.   - Linus ]

Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Acked-by:  Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Steve Capper <Steve.Capper@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: stable@kernel.org  # 2.6.26+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-26 14:57:34 -08:00
Andy Lutomirski
812089e01b PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
Otherwise it fails like this on cards like the Transcend 16GB SDHC card:

    mmc0: new SDHC card at address b368
    mmcblk0: mmc0:b368 SDC   15.0 GiB
    mmcblk0: error -110 sending status command, retrying
    mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0xb0

Tested on my Lenovo x200 laptop.

[bhelgaas: changelog]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Chris Ball <cjb@laptop.org>
CC: Manoj Iyer <manoj.iyer@canonical.com>
CC: stable@vger.kernel.org
2012-12-26 10:43:06 -07:00
Eric W. Biederman
c876ad7682 pidns: Stop pid allocation when init dies
Oleg pointed out that in a pid namespace the sequence.
- pid 1 becomes a zombie
- setns(thepidns), fork,...
- reaping pid 1.
- The injected processes exiting.

Can lead to processes attempting access their child reaper and
instead following a stale pointer.

That waitpid for init can return before all of the processes in
the pid namespace have exited is also unfortunate.

Avoid these problems by disabling the allocation of new pids in a pid
namespace when init dies, instead of when the last process in a pid
namespace is reaped.

Pointed-out-by:  Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-25 16:10:05 -08:00
Stephen Warren
08b60f8438 namei.h: include errno.h
This solves:

In file included from fs/ext3/symlink.c:20:0:
include/linux/namei.h: In function 'retry_estale':
include/linux/namei.h:114:19: error: 'ESTALE' undeclared (first use in this function)

Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-25 18:45:06 -05:00
Jan Kara
53e872681f ext4: fix deadlock in journal_unmap_buffer()
We cannot wait for transaction commit in journal_unmap_buffer()
because we hold page lock which ranks below transaction start.  We
solve the issue by bailing out of journal_unmap_buffer() and
jbd2_journal_invalidatepage() with -EBUSY.  Caller is then responsible
for waiting for transaction commit to finish and try invalidation
again. Since the issue can happen only for page stradding i_size, it
is simple enough to manually call jbd2_journal_invalidatepage() for
such page from ext4_setattr(), check the return value and wait if
necessary.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-12-25 13:29:52 -05:00
Linus Torvalds
4fe19a136a Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
 "This includes some fixes and code improvements (like
  clk_prepare_enable and clk_disable_unprepare), conversion from the
  omap_wdt and twl4030_wdt drivers to the watchdog framework, addition
  of the SB8x0 chipset support and the DA9055 Watchdog driver and some
  OF support for the davinci_wdt driver."

* git://www.linux-watchdog.org/linux-watchdog: (22 commits)
  watchdog: mei: avoid oops in watchdog unregister code path
  watchdog: Orion: Fix possible null-deference in orion_wdt_probe
  watchdog: sp5100_tco: Add SB8x0 chipset support
  watchdog: davinci_wdt: add OF support
  watchdog: da9052: Fix invalid free of devm_ allocated data
  watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
  watchdog: remove depends on CONFIG_EXPERIMENTAL
  watchdog: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
  watchdog: DA9055 Watchdog driver
  watchdog: omap_wdt: eliminate goto
  watchdog: omap_wdt: delete redundant platform_set_drvdata() calls
  watchdog: omap_wdt: convert to devm_ functions
  watchdog: omap_wdt: convert to new watchdog core
  watchdog: WatchDog Timer Driver Core: fix comment
  watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepare
  watchdog: imx2_wdt: Select the driver via ARCH_MXC
  watchdog: cpu5wdt.c: add missing del_timer call
  watchdog: hpwdt.c: Increase version string
  watchdog: Convert twl4030_wdt to watchdog core
  davinci_wdt: preparation for switch to common clock framework
  ...
2012-12-21 17:10:29 -08:00
Linus Torvalds
b49249d103 Miscellaneous device-mapper fixes, cleanups and performance improvements.
Of particular note:
 - Disable broken WRITE SAME support in all targets except linear and striped.
   Use it when kcopyd is zeroing blocks.
 - Remove several mempools from targets by moving the data into the bio's new
   front_pad area(which dm calls 'per_bio_data').
 - Fix a race in thin provisioning if discards are misused.
 - Prevent userspace from interfering with the ioctl parameters and
   use kmalloc for the data buffer if it's small instead of vmalloc.
 - Throttle some annoying error messages when I/O fails.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ1M5OAAoJEK2W1qbAHj1nrQcP/itnnAw8RNsSHBrFMrL9wVnB
 5dmZ1BXPZmEbG+ViU4wzVmRUPHuSHTwhIqH7UFPjyCgbWaz1jaXfpyIxBsxlJi4E
 zuGjv46akANMwH0o/aJDRuEIrCnMtjLrMiY2Oq00lJFvATurwYAKSIgmnwRVdAYy
 gDehJhaymNtHVjhymu33xEn/hqqkQtUbMDj9o+IZppmAw1aQyNuYnwQu3HvcETuz
 /JBcs8isXKIQMJdMLFdGg7lZjLO241UvSwCAeGycKkupHLaYfycumPywgdiNFVUg
 L6pQP9RtAQ+H2VBQ1OIVMJxqiXxQ0xHhyxUYIe3reTar+RXoMA0yK+FiJTwSY1cE
 Xk0s8x2DXwUyu3Vx7UmvgUXnMgd4TIPITYBYiOAanEF/8Xt0voZn8mzNyyzsyFXy
 0u1vMRK+ZK7+QPio9LRh7bgHNK1g5ZyShvwqTMDmtlp+uskaP4iHDDGtVUjFA+Wf
 r9Ms0CXPbXIN6laUIT/4L3LJZtyRWB6e8wuCrUWIWWRbjrMPaPnB+/NlckGJ0CHa
 P/5r1rmLdneTEZ8Vx/2g3fBJ+H2uNQKhYujjnE0HqtHP+tvjt7ernibyU2QhNBeE
 Zy0PXRatY0Xn7UFpn44uJ2qxkWaO5Dloaa4HkWdlWFdR3f/u5MzVjy5mDXLUxkGq
 wj2Z3YkjYjy948MViBhD
 =yzhS
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull dm update from Alasdair G Kergon:
 "Miscellaneous device-mapper fixes, cleanups and performance
  improvements.

  Of particular note:
   - Disable broken WRITE SAME support in all targets except linear and
     striped.  Use it when kcopyd is zeroing blocks.
   - Remove several mempools from targets by moving the data into the
     bio's new front_pad area(which dm calls 'per_bio_data').
   - Fix a race in thin provisioning if discards are misused.
   - Prevent userspace from interfering with the ioctl parameters and
     use kmalloc for the data buffer if it's small instead of vmalloc.
   - Throttle some annoying error messages when I/O fails."

* tag 'dm-3.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (36 commits)
  dm stripe: add WRITE SAME support
  dm: remove map_info
  dm snapshot: do not use map_context
  dm thin: dont use map_context
  dm raid1: dont use map_context
  dm flakey: dont use map_context
  dm raid1: rename read_record to bio_record
  dm: move target request nr to dm_target_io
  dm snapshot: use per_bio_data
  dm verity: use per_bio_data
  dm raid1: use per_bio_data
  dm: introduce per_bio_data
  dm kcopyd: add WRITE SAME support to dm_kcopyd_zero
  dm linear: add WRITE SAME support
  dm: add WRITE SAME support
  dm: prepare to support WRITE SAME
  dm ioctl: use kmalloc if possible
  dm ioctl: remove PF_MEMALLOC
  dm persistent data: improve improve space map block alloc failure message
  dm thin: use DMERR_LIMIT for errors
  ...
2012-12-21 17:08:06 -08:00
Linus Torvalds
184e251661 Second batch of InfiniBand/RDMA changes for 3.8:
- cxgb4 changes to fix lookup engine hash collisions
  - mlx4 changes to make flow steering usable
  - fix to IPoIB to avoid pinning dst reference for too long
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJQ1NdhAAoJEENa44ZhAt0hjb0P/i0tL4Ux+PvqG/Phh2gaZaQR
 evoi3bw6tCYFzWEJPfur2AJ63svKjyrSrfpvgEZjhthDdYORjIv2Dw2Je1qkOSrf
 5tJtSp3r9D0y35SE/DxNnzlgua9heBqphPlOGpjcKdN83KP7XIyVG2SGyJiEeLza
 owefPx/48jZr8hsw7LB2DlZmNUbVWK00o+pXa/VsUQX/dlIU5hyihAzBjtkwJyT2
 xtiyu9oqXGuv1JW/a16ooPGDaETDLJ1G50NndadUZYWFWj36VrAwW6hOAK3oOikf
 Qa4z3gJVzpSdaC1kiuxERj7GxlRpVUJY0IgHEoMTVrexOz1IsFEP8KEfGLkAYwzB
 jjuXh+Z2+QU5OOO3un0nINRGxKZUSD8Scoa222GBwGnWuCCq68APx2UGTkVkhWon
 FyjhF13WJRbElg2oXzI1cg9lJNv2pf10hXhiy2qdO6tDElVXVQk+KRiDdcNtxS0H
 FYYh3og/DjFwp18j+FVLA9r5AiPuVV5DjNnlwBNjTTMc9RwlOrX/6oCK2kZN24rZ
 l+Nr0gv+h6MAjTPBPYLUP2bsY6wYt5n566mfLea/lir9YeI+Q3PL2sDdGZ7C6xHH
 S4pRW95leP4pEFpHqYg8z3QKPswYqzokgbUSHxg2TVYrV01RC75axXCh0Q90q3RE
 oLP8GDYod2okxhAU2AyN
 =lDXt
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull more infiniband changes from Roland Dreier:
 "Second batch of InfiniBand/RDMA changes for 3.8:
   - cxgb4 changes to fix lookup engine hash collisions
   - mlx4 changes to make flow steering usable
   - fix to IPoIB to avoid pinning dst reference for too long"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cxgb4: Fix bug for active and passive LE hash collision path
  RDMA/cxgb4: Fix LE hash collision bug for passive open connection
  RDMA/cxgb4: Fix LE hash collision bug for active open connection
  mlx4_core: Allow choosing flow steering mode
  mlx4_core: Adjustments to Flow Steering activation logic for SR-IOV
  mlx4_core: Fix error flow in the flow steering wrapper
  mlx4_core: Add QPN enforcement for flow steering rules set by VFs
  cxgb4: Add LE hash collision bug fix path in LLD driver
  cxgb4: Add T4 filter support
  IPoIB: Call skb_dst_drop() once skb is enqueued for sending
2012-12-21 16:40:26 -08:00
Eric Dumazet
30e6c9fa93 net: devnet_rename_seq should be a seqcount
Using a seqlock for devnet_rename_seq is not a good idea,
as device_rename() can sleep.

As we hold RTNL, we dont need a protection for writers,
and only need a seqcount so that readers can catch a change done
by a writer.

Bug added in commit c91f6df2db (sockopt: Change getsockopt() of
SO_BINDTODEVICE to return an interface name)

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-21 13:14:01 -08:00
Mikulas Patocka
7de3ee57da dm: remove map_info
This patch removes map_info from bio-based device mapper targets.
map_info is still used for request-based targets.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-12-21 20:23:41 +00:00
Mikulas Patocka
ddbd658f64 dm: move target request nr to dm_target_io
This patch moves target_request_nr from map_info to dm_target_io and
makes it accessible with dm_bio_get_target_request_nr.

This patch is a preparation for the next patch that removes map_info.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-12-21 20:23:39 +00:00
Mikulas Patocka
c0820cf5ad dm: introduce per_bio_data
Introduce a field per_bio_data_size in struct dm_target.

Targets can set this field in the constructor. If a target sets this
field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
are allocated for each bio submitted to the target. These data can be
used for any purpose by the target and help us improve performance by
removing some per-target mempools.

Per-bio data is accessed with dm_per_bio_data. The
argument data_size must be the same as the value per_bio_data_size in
dm_target.

If the target has a pointer to per_bio_data, it can get a pointer to
the bio with dm_bio_from_per_bio_data() function (data_size must be the
same as the value passed to dm_per_bio_data).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-12-21 20:23:38 +00:00
Mike Snitzer
d54eaa5a0f dm: prepare to support WRITE SAME
Allow targets to opt in to WRITE SAME support by setting
'num_write_same_requests' in the dm_target structure.

A dm device will only advertise WRITE SAME support if all its
targets and all its underlying devices support it.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-12-21 20:23:36 +00:00
Linus Torvalds
96680d2b91 Merge branch 'for-next' of git://git.infradead.org/users/eparis/notify
Pull filesystem notification updates from Eric Paris:
 "This pull mostly is about locking changes in the fsnotify system.  By
  switching the group lock from a spin_lock() to a mutex() we can now
  hold the lock across things like iput().  This fixes a problem
  involving unmounting a fs and having inodes be busy, first pointed out
  by FAT, but reproducible with tmpfs.

  This also restores signal driven I/O for inotify, which has been
  broken since about 2.6.32."

Ugh.  I *hate* the timing of this.  It was rebased after the merge
window opened, and then left to sit with the pull request coming the day
before the merge window closes.  That's just crap.  But apparently the
patches themselves have been around for over a year, just gathering
dust, so now it's suddenly critical.

Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
Rothwell's fixes from -next.

* 'for-next' of git://git.infradead.org/users/eparis/notify:
  inotify: automatically restart syscalls
  inotify: dont skip removal of watch descriptor if creation of ignored event failed
  fanotify: dont merge permission events
  fsnotify: make fasync generic for both inotify and fanotify
  fsnotify: change locking order
  fsnotify: dont put marks on temporary list when clearing marks by group
  fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
  fsnotify: pass group to fsnotify_destroy_mark()
  fsnotify: use a mutex instead of a spinlock to protect a groups mark list
  fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
  fsnotify: take groups mark_lock before mark lock
  fsnotify: use reference counting for groups
  fsnotify: introduce fsnotify_get_group()
  inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()
2012-12-20 20:11:52 -08:00
Linus Torvalds
4c9a44aebe Merge branch 'akpm' (Andrew's patch-bomb)
Merge the rest of Andrew's patches for -rc1:
 "A bunch of fixes and misc missed-out-on things.

  That'll do for -rc1.  I still have a batch of IPC patches which still
  have a possible bug report which I'm chasing down."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits)
  keys: use keyring_alloc() to create module signing keyring
  keys: fix unreachable code
  sendfile: allows bypassing of notifier events
  SGI-XP: handle non-fatal traps
  fat: fix incorrect function comment
  Documentation: ABI: remove testing/sysfs-devices-node
  proc: fix inconsistent lock state
  linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors
  memcg: don't register hotcpu notifier from ->css_alloc()
  checkpatch: warn on uapi #includes that #include <uapi/...
  revert "rtc: recycle id when unloading a rtc driver"
  mm: clean up transparent hugepage sysfs error messages
  hfsplus: add error message for the case of failure of sync fs in delayed_sync_fs() method
  hfsplus: rework processing of hfs_btree_write() returned error
  hfsplus: rework processing errors in hfsplus_free_extents()
  hfsplus: avoid crash on failed block map free
  kcmp: include linux/ptrace.h
  drivers/rtc/rtc-imxdi.c: must include <linux/spinlock.h>
  mm: cma: WARN if freed memory is still in use
  exec: do not leave bprm->interp on stack
  ...
2012-12-20 20:00:43 -08:00
Linus Torvalds
1f0377ff08 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS update from Al Viro:
 "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted
  misc stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits)
  vfs: make lremovexattr retry once on ESTALE error
  vfs: make removexattr retry once on ESTALE
  vfs: make llistxattr retry once on ESTALE error
  vfs: make listxattr retry once on ESTALE error
  vfs: make lgetxattr retry once on ESTALE
  vfs: make getxattr retry once on an ESTALE error
  vfs: allow lsetxattr() to retry once on ESTALE errors
  vfs: allow setxattr to retry once on ESTALE errors
  vfs: allow utimensat() calls to retry once on an ESTALE error
  vfs: fix user_statfs to retry once on ESTALE errors
  vfs: make fchownat retry once on ESTALE errors
  vfs: make fchmodat retry once on ESTALE errors
  vfs: have chroot retry once on ESTALE error
  vfs: have chdir retry lookup and call once on ESTALE error
  vfs: have faccessat retry once on an ESTALE error
  vfs: have do_sys_truncate retry once on an ESTALE error
  vfs: fix renameat to retry on ESTALE errors
  vfs: make do_unlinkat retry once on ESTALE errors
  vfs: make do_rmdir retry once on ESTALE errors
  vfs: add a flags argument to user_path_parent
  ...
2012-12-20 18:14:31 -08:00
Linus Torvalds
54d46ea993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20 18:05:28 -08:00
Guenter Roeck
c4e18497d8 linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors
Commit 263a523d18 ("linux/kernel.h: Fix warning seen with W=1 due to
change in DIV_ROUND_CLOSEST") fixes a warning seen with W=1 due to
change in DIV_ROUND_CLOSEST.

Unfortunately, the C compiler converts divide operations with unsigned
divisors to unsigned, even if the dividend is signed and negative (for
example, -10 / 5U = 858993457).  The C standard says "If one operand has
unsigned int type, the other operand is converted to unsigned int", so
the compiler is not to blame.  As a result, DIV_ROUND_CLOSEST(0, 2U) and
similar operations now return bad values, since the automatic conversion
of expressions such as "0 - 2U/2" to unsigned was not taken into
account.

Fix by checking for the divisor variable type when deciding which
operation to perform.  This fixes DIV_ROUND_CLOSEST(0, 2U), but still
returns bad values for negative dividends divided by unsigned divisors.
Mark the latter case as unsupported.

One observed effect of this problem is that the s2c_hwmon driver reports
a value of 4198403 instead of 0 if the ADC reads 0.

Other impact is unpredictable.  Problem is seen if the divisor is an
unsigned variable or constant and the dividend is less than (divisor/2).

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Juergen Beisert <jbe@pengutronix.de>
Tested-by: Juergen Beisert <jbe@pengutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: <stable@vger.kernel.org>	[3.7.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:20 -08:00
Kees Cook
b66c598401 exec: do not leave bprm->interp on stack
If a series of scripts are executed, each triggering module loading via
unprintable bytes in the script header, kernel stack contents can leak
into the command line.

Normally execution of binfmt_script and binfmt_misc happens recursively.
However, when modules are enabled, and unprintable bytes exist in the
bprm->buf, execution will restart after attempting to load matching
binfmt modules.  Unfortunately, the logic in binfmt_script and
binfmt_misc does not expect to get restarted.  They leave bprm->interp
pointing to their local stack.  This means on restart bprm->interp is
left pointing into unused stack memory which can then be copied into the
userspace argv areas.

After additional study, it seems that both recursion and restart remains
the desirable way to handle exec with scripts, misc, and modules.  As
such, we need to protect the changes to interp.

This changes the logic to require allocation for any changes to the
bprm->interp.  To avoid adding a new kmalloc to every exec, the default
value is left as-is.  Only when passing through binfmt_script or
binfmt_misc does an allocation take place.

For a proof of concept, see DoTest.sh from:

   http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:19 -08:00
Jeff Layton
1ac12b4b6d vfs: turn is_dir argument to kern_path_create into a lookup_flags arg
Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags
passed in here are currently ignored.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 18:50:02 -05:00
Jeff Layton
b9d6ba94b8 vfs: add a retry_estale helper function to handle retries on ESTALE
This function is expected to be called from path-based syscalls to help
them decide whether to try the lookup and call again in the event that
they got an -ESTALE return back on an earier try.

Currently, we only retry the call once on an ESTALE error, but in the
event that we decide that that's not enough in the future, we should be
able to change the logic in this helper without too much effort.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 18:50:00 -05:00
Al Viro
21e89c0c48 Merge branch 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into for-linus 2012-12-20 18:49:14 -05:00
Alessio Igor Bogani
471667391a vfs: Remove useless function prototypes
Commit 8e22cc88d6 removes the (un)lock_super
function definitions but forgets to remove their prototypes.

Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 18:47:08 -05:00
Marco Stornelli
7898575fc8 mm: drop vmtruncate
Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 18:46:29 -05:00
Marco Stornelli
d30357f2f0 vfs: drop vmtruncate
Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 18:46:29 -05:00
David Howells
1f372dff1d FS-Cache: Mark cancellation of in-progress operation
Mark as cancelled an operation that is in progress rather than pending at the
time it is cancelled, and call fscache_complete_op() to cancel an operation so
that blocked ops can be started.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 22:34:00 +00:00
David Howells
36a02de5d7 FS-Cache: Convert the object event ID #defines into an enum
Convert the fscache_object event IDs from #defines into an enum.  Also add an
extra label to the enum to carry the event count and redefine the event mask
in terms of that.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 22:08:05 +00:00
David Howells
a02de96085 VFS: Make more complete truncate operation available to CacheFiles
Make a more complete truncate operation available to CacheFiles (including
security checks and suchlike) so that it can use this to clear invalidated
cache files.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 22:05:41 +00:00
Linus Torvalds
982197277c Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux
Pull nfsd update from Bruce Fields:
 "Included this time:

   - more nfsd containerization work from Stanislav Kinsbursky: we're
     not quite there yet, but should be by 3.9.

   - NFSv4.1 progress: implementation of basic backchannel security
     negotiation and the mandatory BACKCHANNEL_CTL operation.  See

       http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

     for remaining TODO's

   - Fixes for some bugs that could be triggered by unusual compounds.
     Our xdr code wasn't designed with v4 compounds in mind, and it
     shows.  A more thorough rewrite is still a todo.

   - If you've ever seen "RPC: multiple fragments per record not
     supported" logged while using some sort of odd userland NFS client,
     that should now be fixed.

   - Further work from Jeff Layton on our mechanism for storing
     information about NFSv4 clients across reboots.

   - Further work from Bryan Schumaker on his fault-injection mechanism
     (which allows us to discard selective NFSv4 state, to excercise
     rarely-taken recovery code paths in the client.)

   - The usual mix of miscellaneous bugs and cleanup.

  Thanks to everyone who tested or contributed this cycle."

* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
  nfsd4: don't leave freed stateid hashed
  nfsd4: free_stateid can use the current stateid
  nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
  nfsd: warn on odd reply state in nfsd_vfs_read
  nfsd4: fix oops on unusual readlike compound
  nfsd4: disable zero-copy on non-final read ops
  svcrpc: fix some printks
  NFSD: Correct the size calculation in fault_inject_write
  NFSD: Pass correct buffer size to rpc_ntop
  nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
  nfsd: simplify service shutdown
  nfsd: replace boolean nfsd_up flag by users counter
  nfsd: simplify NFSv4 state init and shutdown
  nfsd: introduce helpers for generic resources init and shutdown
  nfsd: make NFSd service structure allocated per net
  nfsd: make NFSd service boot time per-net
  nfsd: per-net NFSd up flag introduced
  nfsd: move per-net startup code to separated function
  nfsd: pass net to __write_ports() and down
  nfsd: pass net to nfsd_set_nrthreads()
  ...
2012-12-20 14:04:11 -08:00
David Howells
ef778e7ae6 FS-Cache: Provide proper invalidation
Provide a proper invalidation method rather than relying on the netfs retiring
the cookie it has and getting a new one.  The problem with this is that isn't
easy for the netfs to make sure that it has completed/cancelled all its
outstanding storage and retrieval operations on the cookie it is retiring.

Instead, have the cache provide an invalidation method that will cancel or wait
for all currently outstanding operations before invalidating the cache, and
will cause new operations to queue up behind that.  Whilst invalidation is in
progress, some requests will be rejected until the cache can stack a barrier on
the operation queue to cause new operations to be deferred behind it.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 22:04:07 +00:00
Linus Torvalds
40889e8d9f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph update from Sage Weil:
 "There are a few different groups of commits here.  The largest is
  Alex's ongoing work to enable the coming RBD features (cloning,
  striping).  There is some cleanup in libceph that goes along with it.

  Cyril and David have fixed some problems with NFS reexport (leaking
  dentries and page locks), and there is a batch of patches from Yan
  fixing problems with the fs client when running against a clustered
  MDS.  There are a few bug fixes mixed in for good measure, many of
  which will be going to the stable trees once they're upstream.

  My apologies for the late pull.  There is still a gremlin in the rbd
  map/unmap code and I was hoping to include the fix for that as well,
  but we haven't been able to confirm the fix is correct yet; I'll send
  that in a separate pull once it's nailed down."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
  rbd: get rid of rbd_{get,put}_dev()
  libceph: register request before unregister linger
  libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
  libceph: init event->node in ceph_osdc_create_event()
  libceph: init osd->o_node in create_osd()
  libceph: report connection fault with warning
  libceph: socket can close in any connection state
  rbd: don't use ENOTSUPP
  rbd: remove linger unconditionally
  rbd: get rid of RBD_MAX_SEG_NAME_LEN
  libceph: avoid using freed osd in __kick_osd_requests()
  ceph: don't reference req after put
  rbd: do not allow remove of mounted-on image
  libceph: Unlock unprocessed pages in start_read() error path
  ceph: call handle_cap_grant() for cap import message
  ceph: Fix __ceph_do_pending_vmtruncate
  ceph: Don't add dirty inode to dirty list if caps is in migration
  ceph: Fix infinite loop in __wake_requests
  ceph: Don't update i_max_size when handling non-auth cap
  bdi_register: add __printf verification, fix arg mismatch
  ...
2012-12-20 14:00:13 -08:00
David Howells
9f10523f89 FS-Cache: Fix operation state management and accounting
Fix the state management of internal fscache operations and the accounting of
what operations are in what states.

This is done by:

 (1) Give struct fscache_operation a enum variable that directly represents the
     state it's currently in, rather than spreading this knowledge over a bunch
     of flags, who's processing the operation at the moment and whether it is
     queued or not.

     This makes it easier to write assertions to check the state at various
     points and to prevent invalid state transitions.

 (2) Add an 'operation complete' state and supply a function to indicate the
     completion of an operation (fscache_op_complete()) and make things call
     it.  The final call to fscache_put_operation() can then check that an op
     in the appropriate state (complete or cancelled).

 (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
     govern the state of an object:

	(a) The ->n_ops is now the number of extant operations on the object
	    and is now decremented by fscache_put_operation() only.

	(b) The ->n_in_progress is simply the number of objects that have been
	    taken off of the object's pending queue for the purposes of being
	    run.  This is decremented by fscache_op_complete() only.

	(c) The ->n_exclusive is the number of exclusive ops that have been
	    submitted and queued or are in progress.  It is decremented by
	    fscache_op_complete() and by fscache_cancel_op().

     fscache_put_operation() and fscache_operation_gc() now no longer try to
     clean up ->n_exclusive and ->n_in_progress.  That was leading to double
     decrements against fscache_cancel_op().

     fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
     double decrements against fscache_put_operation().

     fscache_submit_exclusive_op() now decides whether it has to queue an op
     based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
     will persist in being true even after all preceding operations have been
     cancelled or completed.  Furthermore, if an object is active and there are
     runnable ops against it, there must be at least one op running.

 (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
     provide a function to record completion of the pages as they complete.

     When n_pages reaches 0, the operation is deemed to be complete and
     fscache_op_complete() is called.

     Add calls to fscache_retrieval_complete() anywhere we've finished with a
     page we've been given to read or allocate for.  This includes places where
     we just return pages to the netfs for reading from the server and where
     accessing the cache fails and we discard the proposed netfs page.

The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs.  This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.


FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
 ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
 [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
 [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
 [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
 [<ffffffff810d8d47>] evict+0xa1/0x15c
 [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
 [<ffffffff810c56b7>] prune_super+0xd5/0x140
 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
 [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
 [<ffffffff8103e009>] ? process_timeout+0xb/0xb
 [<ffffffff8109dba3>] kswapd+0x270/0x289
 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
 [<ffffffff8104bf7a>] kthread+0x7f/0x87
 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
 [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 21:58:26 +00:00
David Howells
ef46ed888e FS-Cache: Make cookie relinquishment wait for outstanding reads
Make fscache_relinquish_cookie() log a warning and wait if there are any
outstanding reads left on the cookie it was given.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 21:58:25 +00:00
Linus Torvalds
a13eea6bd9 Introduce a new file system, Flash-Friendly File System (F2FS), to Linux 3.8.
Highlights:
 - Add initial f2fs source codes
 - Fix an endian conversion bug
 - Fix build failures on random configs
 - Fix the power-off-recovery routine
 - Minor cleanup, coding style, and typos patches
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxuJcAAoJEEAUqH6CSFDSq80QAI3i7NgUkx4h225MnbJdEKRb
 YX1MfSPmgE0q/15XS2qQu/s9NGJmXLV1IR9EtRSBlCQjwWhbx9Q9URktGkWslFnx
 6mBLy8EvVKDMVdwoUS8ZY6IjfKbmSnoIHTZrGaT9+9d7k8nlOQLaj3qQF4wBuw1+
 +qhJQV642v8qw7JiVVFgxcBSLpAS9cbdOA0vxfWncMwmRLaEO45W5+rob8ZN8WaS
 BUiYIiue8vlB0VDIYfpl/sSPJC/Bn1XsLKZoS2WJl8CKioE1ptLjT3acUBbabUxp
 hNLl8Ae0PylDMFpH8hrBXhleznrVqEMOTos/Z80/UyBny2sCxJFnaQ60TayUo2l2
 hYk5Wbyj8K7IBJEke23Fepild2PnGz22zf2v+tLxxVgPH5j7/l2XHfy9gPvDbd1P
 4ENiJUC3LE49Mi4TvEIFqhbrcJfD9C+v3bxpWGe8CevrpYZaB8tv/6nQXJCC/Ixp
 tMWqLKlHyXGmk5DZpiSFaj0/GbTPT0UGqZVRzzSXQpKqxJU6eTnXDa6aLUEYH8fH
 grOCriaJrd8SgL3l7RokQSQEyRHuNjMm1tlUQWOObE+y0nJjWb9Amwn1yUtJuNzx
 Np4nnlMhxwJ48P3LeeheSCuOUbxJtOzOR8MVXm7deYiGQbYaqB1/+9TbjOZBSX4O
 fpbCXrmqe1pUBukftZsL
 =iMoX
 -----END PGP SIGNATURE-----

Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull new F2FS filesystem from Jaegeuk Kim:
 "Introduce a new file system, Flash-Friendly File System (F2FS), to
  Linux 3.8.

  Highlights:
   - Add initial f2fs source codes
   - Fix an endian conversion bug
   - Fix build failures on random configs
   - Fix the power-off-recovery routine
   - Minor cleanup, coding style, and typos patches"

From the Kconfig help text:

  F2FS is based on Log-structured File System (LFS), which supports
  versatile "flash-friendly" features. The design has been focused on
  addressing the fundamental issues in LFS, which are snowball effect
  of wandering tree and high cleaning overhead.

  Since flash-based storages show different characteristics according to
  the internal geometry or flash memory management schemes aka FTL, F2FS
  and tools support various parameters not only for configuring on-disk
  layout, but also for selecting allocation and cleaning algorithms.

and there's an article by Neil Brown about it on lwn.net:

  http://lwn.net/Articles/518988/

* tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
  f2fs: fix tracking parent inode number
  f2fs: cleanup the f2fs_bio_alloc routine
  f2fs: introduce accessor to retrieve number of dentry slots
  f2fs: remove redundant call to f2fs_put_page in delete entry
  f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
  f2fs: rewrite f2fs_bio_alloc to make it simpler
  f2fs: fix a typo in f2fs documentation
  f2fs: remove unused variable
  f2fs: move error condition for mkdir at proper place
  f2fs: remove unneeded initialization
  f2fs: check read only condition before beginning write out
  f2fs: remove unneeded memset from init_once
  f2fs: show error in case of invalid mount arguments
  f2fs: fix the compiler warning for uninitialized use of variable
  f2fs: resolve build failures
  f2fs: adjust kernel coding style
  f2fs: fix endian conversion bugs reported by sparse
  f2fs: remove unneeded version.h header file from f2fs.h
  f2fs: update the f2fs document
  f2fs: update Kconfig and Makefile
  ...
2012-12-20 13:54:52 -08:00
David Howells
c4d6d8dbf3 CacheFiles: Fix the marking of cached pages
Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.

There are, however, two problems with this:

 (1) It can lead to the PG_fscache mark being applied _after_ the page is set
     PG_uptodate and unlocked (by the call to fscache_end_io()).

 (2) CacheFiles's ref on the page is dropped immediately following
     fscache_end_io() - and so may not still be held when the mark is applied.
     This can lead to the page being passed back to the allocator before the
     mark is applied.

Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page.  This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.

The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:

BUG: Bad page state in process tar  pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping:          (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G        W   3.1.0-rc4-fsdevel+ #1064
Call Trace:
 [<ffffffff8109583c>] ? dump_page+0xb9/0xbe
 [<ffffffff81095916>] bad_page+0xd5/0xea
 [<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
 [<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
 [<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
 [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
 [<ffffffff81098d7b>] ra_submit+0x1c/0x20
 [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
 [<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
 [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
 [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
 [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
 [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
 [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
 [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
 [<ffffffff810c2a79>] sys_read+0x45/0x6c
 [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b

As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.

Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't.  This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.

Reported-by: M. Stevens <m@tippett.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2012-12-20 21:54:30 +00:00
Jeff Layton
39e3c9553f vfs: remove DCACHE_NEED_LOOKUP
The code that relied on that flag was ripped out of btrfs quite some
time ago, and never added back. Josef indicated that he was going to
take a different approach to the problem in btrfs, and that we
could just eliminate this flag.

Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20 13:57:36 -05:00
Linus Torvalds
787314c35f IOMMU Updates for Linux v3.8
A few new features this merge-window. The most important one is
 probably, that dma-debug now warns if a dma-handle is not checked with
 dma_mapping_error by the device driver. This requires minor changes to
 some architectures which make use of dma-debug. Most of these changes
 have the respective Acks by the Arch-Maintainers.
 Besides that there are updates to the AMD IOMMU driver for refactor the
 IOMMU-Groups support and to make sure it does not trigger a hardware
 erratum.
 The OMAP changes (for which I pulled in a branch from Tony Lindgren's
 tree) have a conflict in linux-next with the arm-soc tree. The conflict
 is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in
 the arm-soc tree. It is safe to delete the file too so solve the
 conflict. Similar changes are done in the arm-soc tree in the common
 clock framework migration. A missing hunk from the patch in the IOMMU
 tree will be submitted as a seperate patch when the merge-window is
 closed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQzbQQAAoJECvwRC2XARrjXCIP/2RxBzbVOiaPOorl+ZWbsZ41
 lzWiXsCHJkh4BK4/qGsVeKhiNd9LcbQUlhywnBbhWxym3spzmjGtvU2Hcg8QiO/M
 R83r9S4e8Z6DnF9Gcats1Ns9BufgpyhLXg3XoXPxtyHOgRS59fvYi6xXOxyX30Dy
 uhbj+WL6UD0zvOMNztEnM1p6UhX+XlpvzKDTR5+G5xKdVPkcgeiaKSwqz739caTn
 QE2NpqIh+8Mwuu1nIapk8h07xhUYU5eGMXa38u1LvDwSHsrsCMLC+lXIjtInn7Gw
 Bv+XcCHgtOaoPQwwk/xd2HVwJQxO9HNb5YX51EIjwP0C5S/3yW9Ji1RgqFb6Ewqq
 jIkF6ckwUheLWsBGkw5UknI/f7RX3MDiTWkziYLIniYKKewm+ymGfgIqPt2TzLIO
 tMZZiIssKvy7wOXQ5JjpYJg5Xmrau6opNwdEguC8pWkJT7qsn+3SeLjMt0Lh9IoY
 +37DOgOLb3O3/vnZJ3i0KMRZBfVeaRj5HaGmlxFCYUZCNQymIPTih9Jtqm+WuVcu
 YaGQCTtynsQ0JVh8YEekLzSfgd3OODP68fyCg1CQNixEgvUi2hd/toX2/Z1wkkSA
 JC9bZarcoPkSWqaTAA2HvmaaxvRR+0UbhFPopFTQarVV0MVLZWBxoyuKy/nMrmMd
 UgTzrDYy74UKdrSTwIXg
 =pPHZ
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "A few new features this merge-window.  The most important one is
  probably, that dma-debug now warns if a dma-handle is not checked with
  dma_mapping_error by the device driver.  This requires minor changes
  to some architectures which make use of dma-debug.  Most of these
  changes have the respective Acks by the Arch-Maintainers.

  Besides that there are updates to the AMD IOMMU driver for refactor
  the IOMMU-Groups support and to make sure it does not trigger a
  hardware erratum.

  The OMAP changes (for which I pulled in a branch from Tony Lindgren's
  tree) have a conflict in linux-next with the arm-soc tree.  The
  conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
  deleted in the arm-soc tree.  It is safe to delete the file too so
  solve the conflict.  Similar changes are done in the arm-soc tree in
  the common clock framework migration.  A missing hunk from the patch
  in the IOMMU tree will be submitted as a seperate patch when the
  merge-window is closed."

* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
  ARM: dma-mapping: support debug_dma_mapping_error
  ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
  iommu/omap: Adapt to runtime pm
  iommu/omap: Migrate to hwmod framework
  iommu/omap: Keep mmu enabled when requested
  iommu/omap: Remove redundant clock handling on ISR
  iommu/amd: Remove obsolete comment
  iommu/amd: Don't use 512GB pages
  iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
  iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
  iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
  tile: dma_debug: add debug_dma_mapping_error support
  sh: dma_debug: add debug_dma_mapping_error support
  powerpc: dma_debug: add debug_dma_mapping_error support
  mips: dma_debug: add debug_dma_mapping_error support
  microblaze: dma-mapping: support debug_dma_mapping_error
  ia64: dma_debug: add debug_dma_mapping_error support
  c6x: dma_debug: add debug_dma_mapping_error support
  ARM64: dma_debug: add debug_dma_mapping_error support
  intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
  ...
2012-12-20 10:07:25 -08:00
Linus Torvalds
b7dfde956d Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
 
 There's a slightly non-trivial merge in virtio-net, as we cleaned up the
 virtio add_buf interface while DaveM accepted the mq virtio-net patches.
 
 You can see my solution in my pending-rebases branch, if that helps, but I
 know you love merging:
 
 https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7
 Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB
 /mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP
 noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj
 iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz
 hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O
 OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd
 NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq
 Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK
 xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B
 08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm
 WJv9uZPs+QbIMNky2Lcb
 =myDR
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio update from Rusty Russell:
 "Some nice cleanups, and even a patch my wife did as a "live" demo for
  Latinoware 2012.

  There's a slightly non-trivial merge in virtio-net, as we cleaned up
  the virtio add_buf interface while DaveM accepted the mq virtio-net
  patches."

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
  virtio_console: Add support for remoteproc serial
  virtio_console: Merge struct buffer_token into struct port_buffer
  virtio: add drv_to_virtio to make code clearly
  virtio: use dev_to_virtio wrapper in virtio
  virtio-mmio: Fix irq parsing in command line parameter
  virtio_console: Free buffers from out-queue upon close
  virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
  virtio_console: Use kmalloc instead of kzalloc
  virtio_console: Free buffer if splice fails
  virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
  virtio: console: don't rely on virtqueue_add_buf() returning capacity.
  virtio_net: don't rely on virtqueue_add_buf() returning capacity.
  virtio-net: remove unused skb_vnet_hdr->num_sg field
  virtio-net: correct capacity math on ring full
  virtio: move queue_index and num_free fields into core struct virtqueue.
  ...
2012-12-20 08:37:05 -08:00
Linus Torvalds
1ffab3d413 ARM: arm-soc fixes for 3.8
This is a batch of fixes for arm-soc platforms, most of it is for OMAP
 but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
 broken platforms and drivers, etc. A bit all over the map really.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0BBZAAoJEIwa5zzehBx3WwgP/jS31XauTUrGLEOCUzarINB/
 7ZVGkkVv9cp4AqW1lcBAyQak424ff2hxfhJWxRthBJT/fQ2OFcdZFWLkFEG2kO0y
 PZ5WCxI1Q4ZNz8iW9qynIRCyhvzhaTHwA7wsGqmGRl1u2VMfoeBiAPNoxTAnpUEm
 05L7EBDVSK++KgvkuoQ2KeWOII9IKNaHH4Yg5y8/guCsTbsWjzo4cjS5MGmo3s7r
 6ArPr2h2WvSbayL67aPheBwg6K0ScY/JJQhL/8HgFbnnnL+mDcZ9iH5yKl/b25BX
 FnQkjb37p0GUDNhXQOoElifghDF8rIAD6o5WDgTL2h5uun4WImYbMS/CktnLAQeH
 wNVvrpmpxi11xf9D3SCRCM4jVAy4u1DVEL/0FElWCx1hn4iixm9hGvQ+YPq6/pMl
 LOP87mzmFn+FhPtj7HIDp5B1ECw1xqcP064FHUYhMEntEHfN/Xh6gmDooesDrbUf
 VuvjrSRBIeTI98xrNALepqWn5w/veZtIymZdEz9vlcK8gQ+tHNt7oI/RbDb08cap
 gyMboWciAm2F6FE9QNlrjQHTbCEIrE5BoryRW87ArTbYhKvnZ1vpW01YmJHhA6LE
 v6glfw6FWYAU8wlF4xbyzN1Gy3PB4kUMCGDqZzh4wU1JGXklJm3CbIf73sOwRmg7
 lmnw+RYq6z/RMmPjjrb4
 =4g/i
 -----END PGP SIGNATURE-----

Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "This is a batch of fixes for arm-soc platforms, most of it is for OMAP
  but there are others too (i.MX, Tegra, ep93xx).  Fixes warnings, some
  broken platforms and drivers, etc.  A bit all over the map really."

There was some concern about commit 68136b10 ("RM: sunxi: Change device
tree naming scheme for sunxi"), but Tony says:
 "Looks like that's trivial to fix as needed, no need to rebuild the
  branch to fix that AFAIK.

  The fix can be done once Olof is available online again.

  Linus, I suggest that you go ahead and pull this if there are no other
  issues with this branch."

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits)
  ARM: sunxi: Change device tree naming scheme for sunxi
  ARM: ux500: fix missing include
  ARM: u300: delete custom pin hog code
  ARM: davinci: fix build break due to missing include
  ARM: exynos: Fix warning due to missing 'inline' in stub
  ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
  ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks
  ARM: dts: mx27: Fix the AIPI bus for FEC
  ARM: OMAP2+: common: remove use of vram
  ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings
  ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent lists
  ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider
  ARM: OMAP4: Fix EMU clock domain always on
  ARM: OMAP4460: Workaround ABE DPLL failing to turn-on
  ARM: OMAP4: Enhance support for DPLLs with 4X multiplier
  ARM: OMAP4: Add function table for non-M4X dplls
  ARM: OMAP4: Update timer clock aliases
  ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h
  ARM: dts: Add build target for omap4-panda-a4
  ARM: dts: OMAP2420: Correct H4 board memory size
  ...
2012-12-20 07:21:54 -08:00
Maarten Lankhorst
ada65c7405 dma-buf: remove fallback for !CONFIG_DMA_SHARED_BUFFER
Documentation says that code requiring dma-buf should add it to
select, so inline fallbacks are not going to be used. A link error
will make it obvious what went wrong, instead of silently doing
nothing at runtime.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2012-12-20 12:05:06 +05:30
Linus Torvalds
9eb127cc04 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Really fix tuntap SKB use after free bug, from Eric Dumazet.

 2) Adjust SKB data pointer to point past the transport header before
    calling icmpv6_notify() so that the headers are in the state which
    that function expects.  From Duan Jiong.

 3) Fix ambiguities in the new tuntap multi-queue APIs.  From Jason
    Wang.

 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.

 5) Don't destroy mutex after freeing up device private in mac802154,
    fix also from Konstantin Khlebnikov.

 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.

 7) SCTP HMAC kconfig rework, from Neil Horman.

 8) Fix SCTP jprobes function signature, otherwise things explode, from
    Daniel Borkmann.

 9) Fix typo in ipv6-offload Makefile variable reference, from Simon
    Arlott.

10) Don't fail USBNET open just because remote wakeup isn't supported,
    from Oliver Neukum.

11) be2net driver bug fixes from Sathya Perla.

12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
    Woodhouse.

13) Fix MTU changing regression in 8139cp driver, from John Greene.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
  solos-pci: ensure all TX packets are aligned to 4 bytes
  solos-pci: add firmware upgrade support for new models
  solos-pci: remove superfluous debug output
  solos-pci: add GPIO support for newer versions on Geos board
  8139cp: Prevent dev_close/cp_interrupt race on MTU change
  net: qmi_wwan: add ZTE MF880
  drivers/net: Use of_match_ptr() macro in smsc911x.c
  drivers/net: Use of_match_ptr() macro in smc91x.c
  ipv6: addrconf.c: remove unnecessary "if"
  bridge: Correctly encode addresses when dumping mdb entries
  bridge: Do not unregister all PF_BRIDGE rtnl operations
  use generic usbnet_manage_power()
  usbnet: generic manage_power()
  usbnet: handle PM failure gracefully
  ksz884x: fix receive polling race condition
  qlcnic: update driver version
  qlcnic: fix unused variable warnings
  net: fec: forbid FEC_PTP on SoCs that do not support
  be2net: fix wrong frag_idx reported by RX CQ
  be2net: fix be_close() to ensure all events are ack'ed
  ...
2012-12-19 20:29:15 -08:00
Linus Torvalds
e32795503d Device tree v3.8 bug fix branch. Fixes an undefined struct device build
error and a missing symbol export.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0l2KAAoJEEFnBt12D9kB+tUQAKMjdtBO4MV9LaSham/yj+bf
 f7aGoslEFHloXOvP0/Hg8+bP+Z0El5p7ncCZIROtN2pfhYad1CjbZHhwhJeeRMuJ
 vQT+uy9BZ09pWoSvkLCWdUQyEGYWMxaOQtDMVHEdURU7nRBOPlntrCijoS68pcK8
 Tw9XsX69Qurk/Z8q8DXf8hmNF49Cyv8ax1rywqjTTT0yzR+UzQGldXUDqEg9bg/t
 Qcf03xwUSojdEBQrLo8aaFm32EUguqB02WqS1KMQBaz6FnbqmvVdVyIgVea7kgNi
 mws1um6/3B/yDYB3ESKNXeZiF1bW03ccdITOeRe+mBDEz0JjnZQmQb30xaT6kjT/
 z4VV1DVFAKFQ4M0HlKmv7xvcBcNdzuhrhiFteGNqDkU+zGwgl4GkiSP/c6l0CRVt
 Ij8jHM+YtLmeI1ajx2V9OhM4xzK1Upo6+zi5zgGxLqflvBnFysBVuLjV8/KdeJ6Z
 FW5J0iOMbIdRKdBZugj+c47qMuBjXx+BZwUxoaAbgHnFLctkF3cWhc3sLpqNi6pe
 6F8GkfEIW8nc6RYLb+Rh4e29mZgAMub+GS3bqKA0bcpOjgm6zwNV27ws7lLxwz2u
 d5Xf6uB2nfZGSZJtRa8mvqwEFkMoFPwaAD3XX2DmpSPZ0jTX5X1bRMa4u3UA7uJF
 VIdMKi+ZzuVhLpaMtnXc
 =srgY
 -----END PGP SIGNATURE-----
mergetag object bc1008cf7d
 type commit
 tag gpio-for-linus
 tagger Grant Likely <grant.likely@secretlab.ca> 1355962627 +0000
 
 GPIO device driver bug fixes:
 - gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
 - gpio/ich: Add missing spinlock init
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0lmWAAoJEEFnBt12D9kBhxEP/0RQ87g52xXFxYpfIRRw6YIY
 YlXR0noPEJqYnOUGHI0pi+P6mFDvv95etT3khEsuwVkSbb0c0fEl8m6w8O05aTd9
 DHdQ4ilvUE77+xO2uZsRN6VxBgnApaj8qMdLWi09yFnSm43wegWTi38IpnY2W4PE
 /qgTWT+pBraW+l4TQ/dA4hvhsTorH1McnbaNWQG1oNHbrnQ4Q43UII8qR02exz6Y
 CTgo5qpgvzbz334VQ/VYPgd2NOE1wZ6BihMZr8wNVv25Ip1tWhEh4BbCjpMmGBYC
 FnHfmKB3zeDdHbsnV2TZc2vC48qnMblGSkMjKsKZn0IOfH2YSym94L0VaXZhCLkw
 hjtZxrDnbkHzpkLc7inc8IxbTG1wyhtDQhIf9HBzO33QPtdXw2v6GVtLJ0Ca9ikB
 1T3XJZPZW+JwaEUdsw4UP7ZkJ7cwRG+fxB3iN57QDKj6Hjy+i/hA3GOjbF1VAOsb
 xTrLcfnYvQ81BXdMP2rzPo2c/XdwroW6SNNxYq8xkzlVZuRT0kZSObfGSM4zFxKp
 idfxwHz6ctXB1oni3XBaHmuyRXMZ9zERuyoNARPIJVw0ylSb93eq7dx+j02JOBtc
 RtrBb1oVtOLiVdn6mIDA4LSELdmcbZdXHW4k84CqyNPWSRG4Sbhx3539r1m90vHw
 EBnI5XpggxPt/U9qqpxH
 =JGvn
 -----END PGP SIGNATURE-----
mergetag object d3601e56cf
 type commit
 tag spi-for-linus
 tagger Grant Likely <grant.likely@secretlab.ca> 1355962904 +0000
 
 SPI device driver bug fixes branch for the v3.8 merge window. Most of
 this is bug fixes to the core code and the sh-hspi and s3c64xx device
 drivers.
 
 There is also a patch here to add DT support to the Atmel driver. This
 one should have been in the first round, but I missed it. It's a low
 risk change contained within a single driver and the Atmel maintainer
 has requested it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0lxNAAoJEEFnBt12D9kBCTMP/0WvS/7iiilM00b5pmclGkRc
 Ct3832KnZXn5dDYoSsP1AS2PRJAKbrMzvKMZu+ggjOAlwza87rKJJpt+KdU3r15b
 +KElnaStCC88ohxEOwQV4+CJpF3dS1AmrGMQvo9RK8hddpR1dH+FgMdmex104X/A
 Ysr0WgqQMY3iGVmnlPxU4EYlR1scnqOpaNADNbMY7ibuJ7LOYk1RYtqmX5yxdnJq
 yABILFa10qV9N/cqPBu2Cuhw8Xtp/u70dsQPkALG//pyZnFyhw6rMJ6bowOC2IHM
 p3R4IrVqJq1a3nL1IC1XX5/k21b0PBNxfOGqpeU4QoDK4/cIKeiPny1dtoI1UM+2
 mpTU99VvitZ0ywolenKnbPdU61UwZ6Sd+ptsZM0Y/wIMiXDWSBDTZqVJRAbINqFb
 JwwjtdaFDPgtIb6Yg1WuhLmhcOPfXFIRys7JmnS0VNQCpvvEIdFsR9l5jjMcmluR
 W7V69z9FZj5lf7WjNstbAxbCvrw1i65OD6Nr1UBAA17bqNU6a9BCWdNJGwLozraR
 k2ZomvfmIMhTP82z5by5UY39M9B5uGOHDxxXZOOvxzB/fxlYgLhiAGPfMY1FMrPH
 48gDqQtdOpakL1B/gVELpBLMoKjwEx9jJa/8tJb5wy8EgNSw06BDSN69OclLgmV/
 uVnSZWrp62odDLz0qjSR
 =trRO
 -----END PGP SIGNATURE-----

Merge tags 'dt-for-linus', 'gpio-for-linus' and 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull devicetree, gpio and spi bugfixes from Grant Likely:
 "Device tree v3.8 bug fix:
   - Fixes an undefined struct device build error and a missing symbol
     export.

  GPIO device driver bug fixes:
   - gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
   - gpio/ich: Add missing spinlock init

  SPI device driver bug fixes:
   - Most of this is bug fixes to the core code and the sh-hspi and
     s3c64xx device drivers.

   - There is also a patch here to add DT support to the Atmel driver.
     This one should have been in the first round, but I missed it.
     It's a low risk change contained within a single driver and the
     Atmel maintainer has requested it."

* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS
  of: Fix export of of_find_matching_node_and_match()

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG
  gpio/ich: Add missing spinlock init

* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  spi/sh-hspi: fix return value check in hspi_probe().
  spi: fix tegra SPI binding examples
  spi/atmel: add DT support
  of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes.
  spi: Change FIFO flush operation and spi channel off
  spi: Keep chipselect assertion during one message
2012-12-19 20:26:16 -08:00
Al Viro
c40702c49f new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
note that they are relying on access_ok() already checked by caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro
9026843952 generic compat_sys_sigaltstack()
Again, conditional on CONFIG_GENERIC_SIGALTSTACK

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro
6bf9adfc90 introduce generic sys_sigaltstack(), switch x86 and um to it
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not
select it are completely unaffected

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro
9b064fc3f9 new helper: compat_user_stack_pointer()
Compat counterpart of current_user_stack_pointer(); for most of the biarch
architectures those two are identical, but e.g. arm64 and arm use different
registers for stack pointer...

Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer
do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro
5c49574ffd new helper: restore_altstack()
to be used by rt_sigreturn instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro
1ca97bb541 new helper: current_user_stack_pointer()
Cross-architecture equivalent of rdusp(); default is
user_stack_pointer(current_pt_regs()) - that works for almost all
platforms that have usp saved in pt_regs.  The only exception from
that is ia64 - we want memory stack, not the backing store for
register one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:39 -05:00
Al Viro
ae903caae2 Bury the conditionals from kernel_thread/kernel_execve series
All architectures have
	CONFIG_GENERIC_KERNEL_THREAD
	CONFIG_GENERIC_KERNEL_EXECVE
	__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:38 -05:00
Al Viro
4683661388 COMPAT_SYSCALL_DEFINE: infrastructure
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:06:58 -05:00
Fabio Porcedda
cf13a84d17 watchdog: WatchDog Timer Driver Core: fix comment
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-12-19 22:24:55 +01:00
Linus Torvalds
ca2a88f56a MTD pull for 3.8
- Various cleanups especially in NAND tests
  - Add support for NAND flash on BCMA bus
  - DT support for sh_flctl and denali NAND drivers
  - Kill obsolete/superceded drivers (fortunet, nomadik_nand)
  - Fix JFFS2 locking bug in ENOMEM failure path
  - New SPI flash chips, as usual
  - Support writing in 'reliable mode' for DiskOnChip G4
  - Debugfs support in nandsim
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iEYEABECAAYFAlDSAa4ACgkQdwG7hYl686MMcACeNYa//ghPtccb5L+IRXsqaFDL
 Yi4AoLWOaOjN8qM4KUF/bfMEkwNGAePz
 =DaAQ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd

Pull MTD updates from David Woodhouse:
 - Various cleanups especially in NAND tests
 - Add support for NAND flash on BCMA bus
 - DT support for sh_flctl and denali NAND drivers
 - Kill obsolete/superceded drivers (fortunet, nomadik_nand)
 - Fix JFFS2 locking bug in ENOMEM failure path
 - New SPI flash chips, as usual
 - Support writing in 'reliable mode' for DiskOnChip G4
 - Debugfs support in nandsim

* tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd: (96 commits)
  mtd: nand: typo in nand_id_has_period() comments
  mtd: nand/gpio: use io{read,write}*_rep accessors
  mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited.
  mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems
  mtd: nand/docg4: fix and improve read of factory bbt
  mtd: nand/docg4: reserve bb marker area in ecclayout
  mtd: nand/docg4: add support for writing in reliable mode
  mtd: mxc_nand: reorder part_probes to let cmdline override other sources
  mtd: mxc_nand: fix unbalanced clk_disable() in error path
  mtd: nandsim: Introduce debugfs infrastructure
  mtd: physmap_of: error checking to prevent a NULL pointer dereference
  mtg: docg3: potential divide by zero in doc_write_oob()
  mtd: bcm47xxnflash: writing support
  mtd: tests/read: initialize buffer for whole next page
  mtd: at91: atmel_nand: return bit flips for the PMECC read_page()
  mtd: fix recovery after failed write-buffer operation in cfi_cmdset_0002.c
  mtd: nand: onfi need to be probed in 8 bits mode
  mtd: nand: add NAND_BUSWIDTH_AUTO to autodetect bus width
  mtd: nand: print flash size during detection
  mted: nand_wait_ready timeout fix
  ...
2012-12-19 12:47:41 -08:00
Oliver Neukum
2dd7c8cf29 usbnet: generic manage_power()
Centralise common code for manage_power() in usbnet
by making a generic simple implementation

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-19 12:46:40 -08:00
Oliver Neukum
a1c088e01b usbnet: handle PM failure gracefully
If a device fails to do remote wakeup, this is no reason
to abort an open totally. This patch just continues without
runtime PM.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-19 12:46:40 -08:00
Jack Morgenstein
3c439b5586 mlx4_core: Allow choosing flow steering mode
Device managed flow steering will be enabled only under administrator
directive provided through setting the existing module parameter
log_num_mgm_entry_size to -1 (if the device actually supports flow
steering).  If flow steering isn't requested or not available, the
driver will use the value of log_num_mgm_entry_size and B0 steering.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19 11:47:22 -08:00