Commit Graph

34277 Commits

Author SHA1 Message Date
James Bottomley
871dd9286e block: fix max discard sectors limit
linux-v3.8-rc1 and later support for plug for blkdev_issue_discard with
commit 0cfbcafcae
(block: add plug for blkdev_issue_discard )

For example,
1) DISCARD rq-1 with size size 4GB
2) DISCARD rq-2 with size size 1GB

If these 2 discard requests get merged, final request size will be 5GB.

In this case, request's __data_len field may overflow as it can store
max 4GB(unsigned int).

This issue was observed while doing mkfs.f2fs on 5GB SD card:
https://lkml.org/lkml/2013/4/1/292

Info: sector size = 512
Info: total sectors = 11370496 (in 512bytes)
Info: zone aligned segment0 blkaddr: 512
[  257.789764] blk_update_request: bio idx 0 >= vcnt 0

mkfs process gets stuck in D state and I see the following in the dmesg:

[  257.789733] __end_that: dev mmcblk0: type=1, flags=122c8081
[  257.789764]   sector 4194304, nr/cnr 2981888/4294959104
[  257.789764]   bio df3840c0, biotail df3848c0, buffer   (null), len
1526726656
[  257.789764] blk_update_request: bio idx 0 >= vcnt 0
[  257.794921] request botched: dev mmcblk0: type=1, flags=122c8081
[  257.794921]   sector 4194304, nr/cnr 2981888/4294959104
[  257.794921]   bio df3840c0, biotail df3848c0, buffer   (null), len
1526726656

This patch fixes this issue.

Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-04-24 08:52:50 -06:00
Jens Axboe
64f8de4da7 Merge branch 'writeback-workqueue' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq into for-3.10/core
Tejun writes:

-----

This is the pull request for the earlier patchset[1] with the same
name.  It's only three patches (the first one was committed to
workqueue tree) but the merge strategy is a bit involved due to the
dependencies.

* Because the conversion needs features from wq/for-3.10,
  block/for-3.10/core is based on rc3, and wq/for-3.10 has conflicts
  with rc3, I pulled mainline (rc5) into wq/for-3.10 to prevent those
  workqueue conflicts from flaring up in block tree.

* Resolving the issue that Jan and Dave raised about debugging
  requires arch-wide changes.  The patchset is being worked on[2] but
  it'll have to go through -mm after these changes show up in -next,
  and not included in this pull request.

The three commits are located in the following git branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git writeback-workqueue

Pulling it into block/for-3.10/core produces a conflict in
drivers/md/raid5.c between the following two commits.

  e3620a3ad5 ("MD RAID5: Avoid accessing gendisk or queue structs when not available")
  2f6db2a707 ("raid5: use bio_reset()")

The conflict is trivial - one removes an "if ()" conditional while the
other removes "rbi->bi_next = NULL" right above it.  We just need to
remove both.  The merged branch is available at

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git block-test-merge

so that you can use it for verification.  The test merge commit has
proper merge description.

While these changes are a bit of pain to route, they make code simpler
and even have, while minute, measureable performance gain[3] even on a
workload which isn't particularly favorable to showing the benefits of
this conversion.

----

Fixed up the conflict.

Conflicts:
	drivers/md/raid5.c

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-04-02 10:04:39 +02:00
Tejun Heo
839a8e8660 writeback: replace custom worker pool implementation with unbound workqueue
Writeback implements its own worker pool - each bdi can be associated
with a worker thread which is created and destroyed dynamically.  The
worker thread for the default bdi is always present and serves as the
"forker" thread which forks off worker threads for other bdis.

there's no reason for writeback to implement its own worker pool when
using unbound workqueue instead is much simpler and more efficient.
This patch replaces custom worker pool implementation in writeback
with an unbound workqueue.

The conversion isn't too complicated but the followings are worth
mentioning.

* bdi_writeback->last_active, task and wakeup_timer are removed.
  delayed_work ->dwork is added instead.  Explicit timer handling is
  no longer necessary.  Everything works by either queueing / modding
  / flushing / canceling the delayed_work item.

* bdi_writeback_thread() becomes bdi_writeback_workfn() which runs off
  bdi_writeback->dwork.  On each execution, it processes
  bdi->work_list and reschedules itself if there are more things to
  do.

  The function also handles low-mem condition, which used to be
  handled by the forker thread.  If the function is running off a
  rescuer thread, it only writes out limited number of pages so that
  the rescuer can serve other bdis too.  This preserves the flusher
  creation failure behavior of the forker thread.

* INIT_LIST_HEAD(&bdi->bdi_list) is used to tell
  bdi_writeback_workfn() about on-going bdi unregistration so that it
  always drains work_list even if it's running off the rescuer.  Note
  that the original code was broken in this regard.  Under memory
  pressure, a bdi could finish unregistration with non-empty
  work_list.

* The default bdi is no longer special.  It now is treated the same as
  any other bdi and bdi_cap_flush_forker() is removed.

* BDI_pending is no longer used.  Removed.

* Some tracepoints become non-applicable.  The following TPs are
  removed - writeback_nothread, writeback_wake_thread,
  writeback_wake_forker_thread, writeback_thread_start,
  writeback_thread_stop.

Everything, including devices coming and going away and rescuer
operation under simulated memory pressure, seems to work fine in my
test setup.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
2013-04-01 19:08:06 -07:00
Tejun Heo
181387da2d writeback: remove unused bdi_pending_list
There's no user left.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Fengguang Wu <fengguang.wu@intel.com>
2013-04-01 19:08:06 -07:00
Tejun Heo
229641a6f1 Linux 3.9-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRWLTrAAoJEHm+PkMAQRiGe8oH/iMy48mecVWvxVZn74Tx3Cef
 xmW/PnAIj28EhSPqK49N/Ow6AfQToFKf7AP0ge20KAf5teTq95AY+tH74DAANt8F
 BjKXXTZiR5xwBvRkq7CR5wDcCvEcBAAz8fgTEd6SEDB2d2VXFf5eKdKUqt1avTCh
 Z6Hup5kuwX+ddtwY2DCBXtp2n6fL0Rm5yLzY1A3OOBye1E7VyLTF7M5BR603Q44P
 4kRLxn8+R7jy3hTuZIhAeoS8TKUoBwVk7DmKxEzrhTHZVOmvwE9lEHybRnIyOpd/
 k1JnbRbiPsLsCVFOn10SQkGDAIk00lro3tuWP2C1ljERiD/OOh5Ui9nXYAhMkbI=
 =q15K
 -----END PGP SIGNATURE-----

Merge tag 'v3.9-rc5' into wq/for-3.10

Writeback conversion to workqueue will be based on top of wq/for-3.10
branch to take advantage of custom attrs and NUMA support for unbound
workqueues.  Mainline currently contains two commits which result in
non-trivial merge conflicts with wq/for-3.10 and because
block/for-3.10/core is based on v3.9-rc3 which contains one of the
conflicting commits, we need a pre-merge-window merge anyway.  Let's
pull v3.9-rc5 into wq/for-3.10 so that the block tree doesn't suffer
from workqueue merge conflicts.

The two conflicts and their resolutions:

* e68035fb65 ("workqueue: convert to idr_alloc()") in mainline changes
  worker_pool_assign_id() to use idr_alloc() instead of the old idr
  interface.  worker_pool_assign_id() goes through multiple locking
  changes in wq/for-3.10 causing the following conflict.

  static int worker_pool_assign_id(struct worker_pool *pool)
  {
	  int ret;

  <<<<<<< HEAD
	  lockdep_assert_held(&wq_pool_mutex);

	  do {
		  if (!idr_pre_get(&worker_pool_idr, GFP_KERNEL))
			  return -ENOMEM;
		  ret = idr_get_new(&worker_pool_idr, pool, &pool->id);
	  } while (ret == -EAGAIN);
  =======
	  mutex_lock(&worker_pool_idr_mutex);
	  ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
	  if (ret >= 0)
		  pool->id = ret;
	  mutex_unlock(&worker_pool_idr_mutex);
  >>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89

	  return ret < 0 ? ret : 0;
  }

  We want locking from the former and idr_alloc() usage from the
  latter, which can be combined to the following.

  static int worker_pool_assign_id(struct worker_pool *pool)
  {
	  int ret;

	  lockdep_assert_held(&wq_pool_mutex);

	  ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
	  if (ret >= 0) {
		  pool->id = ret;
		  return 0;
	  }
	  return ret;
   }

* eb2834285c ("workqueue: fix possible pool stall bug in
  wq_unbind_fn()") updated wq_unbind_fn() such that it has single
  larger for_each_std_worker_pool() loop instead of two separate loops
  with a schedule() call inbetween.  wq/for-3.10 renamed
  pool->assoc_mutex to pool->manager_mutex causing the following
  conflict (earlier function body and comments omitted for brevity).

  static void wq_unbind_fn(struct work_struct *work)
  {
  ...
		  spin_unlock_irq(&pool->lock);
  <<<<<<< HEAD
		  mutex_unlock(&pool->manager_mutex);
	  }
  =======
		  mutex_unlock(&pool->assoc_mutex);
  >>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89

		  schedule();

  <<<<<<< HEAD
	  for_each_cpu_worker_pool(pool, cpu)
  =======
  >>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89
		  atomic_set(&pool->nr_running, 0);

		  spin_lock_irq(&pool->lock);
		  wake_up_worker(pool);
		  spin_unlock_irq(&pool->lock);
	  }
  }

  The resolution is mostly trivial.  We want the control flow of the
  latter with the rename of the former.

  static void wq_unbind_fn(struct work_struct *work)
  {
  ...
		  spin_unlock_irq(&pool->lock);
		  mutex_unlock(&pool->manager_mutex);

		  schedule();

		  atomic_set(&pool->nr_running, 0);

		  spin_lock_irq(&pool->lock);
		  wake_up_worker(pool);
		  spin_unlock_irq(&pool->lock);
	  }
  }

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-04-01 18:45:36 -07:00
Tejun Heo
d55262c4d1 workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity
Unbound workqueues are now NUMA aware.  Let's add some control knobs
and update sysfs interface accordingly.

* Add kernel param workqueue.numa_disable which disables NUMA affinity
  globally.

* Replace sysfs file "pool_id" with "pool_ids" which contain
  node:pool_id pairs.  This change is userland-visible but "pool_id"
  hasn't seen a release yet, so this is okay.

* Add a new sysf files "numa" which can toggle NUMA affinity on
  individual workqueues.  This is implemented as attrs->no_numa whichn
  is special in that it isn't part of a pool's attributes.  It only
  affects how apply_workqueue_attrs() picks which pools to use.

After "pool_ids" change, first_pwq() doesn't have any user left.
Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2013-04-01 11:23:38 -07:00
Paul Walmsley
dbf520a9d7 Revert "lockdep: check that no locks held at freeze time"
This reverts commit 6aa9707099.

Commit 6aa9707099 ("lockdep: check that no locks held at freeze time")
causes problems with NFS root filesystems.  The failures were noticed on
OMAP2 and 3 boards during kernel init:

  [ BUG: swapper/0/1 still has locks held! ]
  3.9.0-rc3-00344-ga937536 #1 Not tainted
  -------------------------------------
  1 lock held by swapper/0/1:
   #0:  (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574

  stack backtrace:
    rpc_wait_bit_killable
    __wait_on_bit
    out_of_line_wait_on_bit
    __rpc_execute
    rpc_run_task
    rpc_call_sync
    nfs_proc_get_root
    nfs_get_root
    nfs_fs_mount_common
    nfs_try_mount
    nfs_fs_mount
    mount_fs
    vfs_kern_mount
    do_mount
    sys_mount
    do_mount_root
    mount_root
    prepare_namespace
    kernel_init_freeable
    kernel_init

Although the rootfs mounts, the system is unstable.  Here's a transcript
from a PM test:

  http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt

Here's what the test log should look like:

  http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt

Mailing list discussion is here:

  http://lkml.org/lkml/2013/3/4/221

Deal with this for v3.9 by reverting the problem commit, until folks can
figure out the right long-term course of action.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <maciej.rutecki@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-31 11:38:33 -07:00
Michel Lespinasse
09a9f1d278 Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs"
This reverts commit 1869305009 ("mm: introduce VM_POPULATE flag to
better deal with racy userspace programs").

VM_POPULATE only has any effect when userspace plays racy games with
vmas by trying to unmap and remap memory regions that mmap or mlock are
operating on.

Also, the only effect of VM_POPULATE when userspace plays such games is
that it avoids populating new memory regions that get remapped into the
address range that was being operated on by the original mmap or mlock
calls.

Let's remove VM_POPULATE as there isn't any strong argument to mandate a
new vm_flag.

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-28 17:45:51 -07:00
Linus Torvalds
0776ce03b1 USB fixes for 3.9-rc4
Here are some USB fixes to resolve issues reported recently, as well as a new
 device id for the ftdi_sio driver.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlFUaA0ACgkQMUfUDdst+yk0AACfe9iitBiGERSO4NsyIvypoJ1q
 vOgAoKek8fiPmTKrZl18n79oX28qU9x2
 =Oee3
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
 "Here are some USB fixes to resolve issues reported recently, as well
  as a new device id for the ftdi_sio driver."

* tag 'usb-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: ftdi_sio: Add support for Mitsubishi FX-USB-AW/-BD
  usb: Fix compile error by selecting USB_OTG_UTILS
  USB: serial: fix hang when opening port
  USB: EHCI: fix bug in iTD/siTD DMA pool allocation
  xhci: Don't warn on empty ring for suspended devices.
  usb: xhci: Fix TRB transfer length macro used for Event TRB.
  usb/acpi: binding xhci root hub usb port with ACPI
  usb: add find_raw_port_number callback to struct hc_driver()
  usb: xhci: fix build warning
2013-03-28 15:54:25 -07:00
Linus Torvalds
1b6a4db220 char/misc driver fixes for 3.9-rc4
Here are some small char/misc driver fixes that resolve issues recently
 reported against the 3.9-rc kernels.  All have been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlFUdOAACgkQMUfUDdst+ykb9QCeKb0WfCxqwPFZDCAbIiyX9AyA
 1OMAoJU7WJo1/wpfyyTLr6RuN8E0X0p/
 =1+Ic
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg Kroah-Hartman:
 "Here are some small char/misc driver fixes that resolve issues
  recently reported against the 3.9-rc kernels.  All have been in
  linux-next for a while."

* tag 'char-misc-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  VMCI: Fix process-to-process DRGAMs.
  mei: ME hardware reset needs to be synchronized
  mei: add mei_stop function to stop mei device
  extcon: max77693: Initialize register of MUIC device to bring up it without platform data
  extcon: max77693: Fix bug of wrong pointer when platform data is not used
  extcon: max8997: Check the pointer of platform data to protect null pointer error
2013-03-28 15:51:33 -07:00
Linus Torvalds
2c3de1c2d7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns fixes from Eric W Biederman:
 "The bulk of the changes are fixing the worst consequences of the user
  namespace design oversight in not considering what happens when one
  namespace starts off as a clone of another namespace, as happens with
  the mount namespace.

  The rest of the changes are just plain bug fixes.

  Many thanks to Andy Lutomirski for pointing out many of these issues."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns: Restrict when proc and sysfs can be mounted
  ipc: Restrict mounting the mqueue filesystem
  vfs: Carefully propogate mounts across user namespaces
  vfs: Add a mount flag to lock read only bind mounts
  userns:  Don't allow creation if the user is chrooted
  yama:  Better permission check for ptraceme
  pid: Handle the exit of a multi-threaded init.
  scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
2013-03-28 13:43:46 -07:00
Eric W. Biederman
87a8ebd637 userns: Restrict when proc and sysfs can be mounted
Only allow unprivileged mounts of proc and sysfs if they are already
mounted when the user namespace is created.

proc and sysfs are interesting because they have content that is
per namespace, and so fresh mounts are needed when new namespaces
are created while at the same time proc and sysfs have content that
is shared between every instance.

Respect the policy of who may see the shared content of proc and sysfs
by only allowing new mounts if there was an existing mount at the time
the user namespace was created.

In practice there are only two interesting cases: proc and sysfs are
mounted at their usual places, proc and sysfs are not mounted at all
(some form of mount namespace jail).

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-27 07:50:08 -07:00
Eric W. Biederman
90563b198e vfs: Add a mount flag to lock read only bind mounts
When a read-only bind mount is copied from mount namespace in a higher
privileged user namespace to a mount namespace in a lesser privileged
user namespace, it should not be possible to remove the the read-only
restriction.

Add a MNT_LOCK_READONLY mount flag to indicate that a mount must
remain read-only.

CC: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-27 07:50:04 -07:00
Eric W. Biederman
3151527ee0 userns: Don't allow creation if the user is chrooted
Guarantee that the policy of which files may be access that is
established by setting the root directory will not be violated
by user namespaces by verifying that the root directory points
to the root of the mount namespace at the time of user namespace
creation.

Changing the root is a privileged operation, and as a matter of policy
it serves to limit unprivileged processes to files below the current
root directory.

For reasons of simplicity and comprehensibility the privilege to
change the root directory is gated solely on the CAP_SYS_CHROOT
capability in the user namespace.  Therefore when creating a user
namespace we must ensure that the policy of which files may be access
can not be violated by changing the root directory.

Anyone who runs a processes in a chroot and would like to use user
namespace can setup the same view of filesystems with a mount
namespace instead.  With this result that this is not a practical
limitation for using user namespaces.

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-27 07:49:29 -07:00
Linus Torvalds
b175293ccc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Always increment IPV4 ID field in encapsulated GSO packets, even
    when DF is set.  Regression fix from Pravin B Shelar.

 2) Fix per-net subsystem initialization in netfilter conntrack,
    otherwise we may access dynamically allocated memory before it is
    actually allocated.  From Gao Feng.

 3) Fix DMA buffer lengths in iwl3945 driver, from Stanislaw Gruszka.

 4) Fix race between submission of sync vs async commands in mwifiex
    driver, from Amitkumar Karwar.

 5) Add missing cancel of command timer in mwifiex driver, from Bing
    Zhao.

 6) Missing SKB free in rtlwifi USB driver, from Jussi Kivilinna.

 7) Thermal layer tries to use a genetlink multicast string that is
    longer than the 16 character limit.  Fix it and add a BUG check to
    prevent this kind of thing from happening in the future.

 From Masatake YAMATO.

 8) Fix many bugs in the handling of the teardown of L2TP connections,
    UDP encapsulation instances, and sockets.  From Tom Parkin.

 9) Missing socket release in IRDA, from Kees Cook.

10) Fix fec driver modular build, from Fabio Estevam.

11) Erroneous use of kfree() instead of free_netdev() in lantiq_etop,
    from Wei Yongjun.

12) Fix bugs in handling of queue numbers and steering rules in mlx4
    driver, from Moshe Lazer, Hadar Hen Zion, and Or Gerlitz.

13) Some FOO_DIAG_MAX constants were defined off by one, fix from Andrey
    Vagin.

14) TCP segmentation deferral is unintentionally done too strongly,
    breaking ACK clocking.  Fix from Eric Dumazet.

15) net_enable_timestamp() can legitimately be invoked from software
    interrupts, and in a way that is safe, so remove the WARN_ON().
    Also from Eric Dumazet.

16) Fix use after free in VLANs, from Cong Wang.

17) Fix TCP slow start retransmit storms after SACK reneging, from
    Yuchung Cheng.

18) Unix socket release should mark a socket dead before NULL'ing out
    sock->sk, otherwise we can race.  Fix from Paul Moore.

19) IPV6 addrconf code can try to free static memory, from Hong Zhiguo.

20) Fix register mis-programming, NULL pointer derefs, and wrong PHC
    clock frequency in IGB driver.  From Lior LevyAlex Williamson, Jiri
    Benc, and Jeff Kirsher.

21) skb->ip_summed logic in pch_gbe driver is reversed, breaking packet
    forwarding.  Fix from Veaceslav Falico.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  ipv4: Fix ip-header identification for gso packets.
  bonding: remove already created master sysfs link on failure
  af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
  pch_gbe: fix ip_summed checksum reporting on rx
  igb: fix PHC stopping on max freq
  igb: make sensor info static
  igb: SR-IOV init reordering
  igb: Fix null pointer dereference
  igb: fix i350 anti spoofing config
  ixgbevf: don't release the soft entries
  ipv6: fix bad free of addrconf_init_net
  unix: fix a race condition in unix_release()
  tcp: undo spurious timeout after SACK reneging
  bnx2x: fix assignment of signed expression to unsigned variable
  bridge: fix crash when set mac address of br interface
  8021q: fix a potential use-after-free
  net: remove a WARN_ON() in net_enable_timestamp()
  tcp: preserve ACK clocking in TSO
  net: fix *_DIAG_MAX constants
  net/mlx4_core: Disallow releasing VF QPs which have steering rules
  ...
2013-03-26 14:24:29 -07:00
Lan Tianyu
3f5eb14135 usb: add find_raw_port_number callback to struct hc_driver()
xhci driver divides the root hub into two logical hubs which work
respectively for usb 2.0 and usb 3.0 devices. They are independent
devices in the usb core. But in the ACPI table, it's one device node
and all usb2.0 and usb3.0 ports are under it. Binding usb port with
its acpi node needs the raw port number which is reflected in the xhci
extended capabilities table. This patch is to add find_raw_port_number
callback to struct hc_driver(), fill it with xhci_find_raw_port_number()
which will return raw port number and add a wrap usb_hcd_find_raw_port_number().

Otherwise, refactor xhci_find_real_port_number(). Using
xhci_find_raw_port_number() to get real index in the HW port status
registers instead of scanning through the xHCI roothub port array.
This can help to speed up.

All addresses in xhci->usb2_ports and xhci->usb3_ports array are
kown good ports and don't include following bad ports in the extended
capabilities talbe.
     (1) root port that doesn't have an entry
     (2) root port with unknown speed
     (3) root port that is listed twice and with different speeds.

So xhci_find_raw_port_number() will only return port num of good ones
and never touch bad ports above.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
2013-03-25 10:39:17 -07:00
Linus Torvalds
fb9bb1829d arm-soc: bug fixes for 3.9-rc4
Four patches for arm-soc this week:
 
 - Kevin Hilman is no longer reachable under his previous email address.
   He submitted the patch earlier, but nobody felt responsible to pick
   it up.
 
 - One Tegra fix for an incorect register address in device tree.
 
 - IMX multiplatform support exposes a configuration option that
   leads to unbootable kernels on all other machines and that needs
   to depend on that platform.
 
 - A nontrivial bug fix for the setup of the mxs video output.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUVBQf2CrR//JCVInAQJwyRAA0qzt+jbn+ezvLpeWvrgdEVLikiAieVyt
 HmSex/pEWh2ytIkrX2maHv5Oov9QmX7v4ZEU2WtdvMv5YERsT7/y1hjHaZTLcdmH
 11ogfVwTRbrNVWOb41ofdd3rUVgZfgzCGQ0rfEua3wLRK6AetZJxkqsuGXRaqjdm
 BBbmgKAmLsLeM3/aBzeuFint1+EDY74WBMxgqkwUretefKFMxzcBaqhoR+FNDIdV
 YaYbOocq45LsOa44gxlF6pmJkZsOsB2pqAjoANm5KtZlphTEpDD1C/wXvBaVAOBh
 8mCuk2mHEyZsyLrufh/ZywaPcDaUMDwpO1zidATwaRCf6qWOr3jtWiCtQo4FeNYS
 o+kkYtELyAEvwDQuljghviq0p+z2vpnk52sYdXkYW8Pgz5TqhK+Gu2ywaeiqeT2B
 cbLGG32lVgJnmWOOXI7Z6MjekgKx5arx7z6Br+1pTT/fE44DgE+CtabEsCdcpFWG
 ftC7FdxZabDUhfynSaO43tgKhdVv2XpVobG3iW2RYJOWm2dJSxulZg9+jypdxITm
 VX3kPar+mjvwyf3svsGIc65SVaayR6pfiLzV8qBtR3trFRbIlrI7vo21d2tFtuS8
 PfAR+9VHkeOdjVKDbu1sl7YycWz03xq4cM9XPFhvrobeFXb5OFwDwLWn4DZR5ZWY
 iJSMJkaBSww=
 =F6UW
 -----END PGP SIGNATURE-----

Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC bug fixes from Arnd Bergmann:
 "Four patches for arm-soc this week:

   - Kevin Hilman is no longer reachable under his previous email
     address.  He submitted the patch earlier, but nobody felt
     responsible to pick it up.

   - One Tegra fix for an incorect register address in device tree.

   - IMX multiplatform support exposes a configuration option that leads
     to unbootable kernels on all other machines and that needs to
     depend on that platform.

   - A nontrivial bug fix for the setup of the mxs video output."

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: update email address for Kevin Hilman
  ARM: tegra: fix register address of slink controller
  ARM: imx: add dependency check for DEBUG_IMX_UART_PORT
  ARM: video: mxs: Fix mxsfb misconfiguring VDCTRL0
2013-03-25 09:26:10 -07:00
Jens Axboe
705cd0ea1c Merge branch 'for-jens' of http://evilpiepirate.org/git/linux-bcache into for-3.10/core
This contains Kents prep work for the immutable bio_vecs.
2013-03-24 21:38:59 -06:00
Kent Overstreet
29ed7813ce bio-integrity: Add explicit field for owner of bip_buf
This was the only real user of BIO_CLONED, which didn't have very clear
semantics. Convert to its own flag so we can get rid of BIO_CLONED.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23 14:26:34 -07:00
Kent Overstreet
a38352e0ac block: Add an explicit bio flag for bios that own their bvec
This is for the new bio splitting code. When we split a bio, if the
split occured on a bvec boundry we reuse the bvec for the new bio. But
that means bio_free() can't free it, hence the explicit flag.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
Acked-by: Tejun Heo <tj@kernel.org>
2013-03-23 14:26:33 -07:00
Kent Overstreet
a07876064a block: Add bio_alloc_pages()
More utility code to replace stuff that's getting open coded.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
2013-03-23 14:26:31 -07:00
Kent Overstreet
d74c6d514f block: Add bio_for_each_segment_all()
__bio_for_each_segment() iterates bvecs from the specified index
instead of bio->bv_idx.  Currently, the only usage is to walk all the
bvecs after the bio has been advanced by specifying 0 index.

For immutable bvecs, we need to split these apart;
bio_for_each_segment() is going to have a different implementation.
This will also help document the intent of code that's using it -
bio_for_each_segment_all() is only legal to use for code that owns the
bio.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Neil Brown <neilb@suse.de>
CC: Boaz Harrosh <bharrosh@panasas.com>
2013-03-23 14:26:28 -07:00
Kent Overstreet
16ac3d63e7 block: Add bio_copy_data()
This gets open coded quite a bit and it's tricky to get right, so make a
generic version and convert some existing users over to it instead.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
2013-03-23 14:15:37 -07:00
Kent Overstreet
9e882242c6 block: Add submit_bio_wait(), remove from md
Random cleanup - this code was duplicated and it's not really specific
to md.

Also added the ability to return the actual error code.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
2013-03-23 14:15:32 -07:00
Kent Overstreet
f73a1c7d11 block: Add bio_end_sector()
Just a little convenience macro - main reason to add it now is preparing
for immutable bio vecs, it'll reduce the size of the patch that puts
bi_sector/bi_size/bi_idx into a struct bvec_iter.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Lars Ellenberg <drbd-dev@lists.linbit.com>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Alasdair Kergon <agk@redhat.com>
CC: dm-devel@redhat.com
CC: Neil Brown <neilb@suse.de>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: linux-s390@vger.kernel.org
CC: Chris Mason <chris.mason@fusionio.com>
CC: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2013-03-23 14:15:29 -07:00
Kent Overstreet
054bdf646e block: Add bio_advance()
This is prep work for immutable bio vecs; we first want to centralize
where bvecs are modified.

Next two patches convert some existing code to use this function.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
2013-03-23 14:15:27 -07:00
Kent Overstreet
9f060e2231 block: Convert integrity to bvec_alloc_bs()
This adds a pointer to the bvec array to struct bio_integrity_payload,
instead of the bvecs always being inline; then the bvecs are allocated
with bvec_alloc_bs().

Changed bvec_alloc_bs() and bvec_free_bs() to take a pointer to a
mempool instead of the bioset, so that bio integrity can use a different
mempool for its bvecs, and thus avoid a potential deadlock.

This is eventually for immutable bio vecs - immutable bvecs aren't
useful if we still have to copy them, hence the need for the pointer.
Less code is always nice too, though.

Also, bio_integrity_alloc() was using fs_bio_set if no bio_set was
specified. This was wrong - using the bio_set doesn't protect us from
memory allocation failures, because we just used kmalloc for the
bio_integrity_payload. But it does introduce the possibility of
deadlock, if for some reason we weren't supposed to be using fs_bio_set.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23 14:15:27 -07:00
Kent Overstreet
6fda981caf block: Fix a buffer overrun in bio_integrity_split()
bio_integrity_split() seemed to be confusing pointers and arrays -
bip_vec in bio_integrity_payload was an array appended to the end of the
payload, so the bio_vecs in struct bio_pair should have come after the
bio_integrity_payload they're for.

Fix it by making bip_vec a pointer to the inline vecs - a later patch is
going to make more use of this pointer.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23 14:15:26 -07:00
Kent Overstreet
df2cb6daa4 block: Avoid deadlocks with bio allocation by stacking drivers
Previously, if we ever try to allocate more than once from the same bio
set while running under generic_make_request() (i.e. a stacking block
driver), we risk deadlock.

This is because of the code in generic_make_request() that converts
recursion to iteration; any bios we submit won't actually be submitted
(so they can complete and eventually be freed) until after we return -
this means if we allocate a second bio, we're blocking the first one
from ever being freed.

Thus if enough threads call into a stacking block driver at the same
time with bios that need multiple splits, and the bio_set's reserve gets
used up, we deadlock.

This can be worked around in the driver code - we could check if we're
running under generic_make_request(), then mask out __GFP_WAIT when we
go to allocate a bio, and if the allocation fails punt to workqueue and
retry the allocation.

But this is tricky and not a generic solution. This patch solves it for
all users by inverting the previously described technique. We allocate a
rescuer workqueue for each bio_set, and then in the allocation code if
there are bios on current->bio_list we would be blocking, we punt them
to the rescuer workqueue to be submitted.

This guarantees forward progress for bio allocations under
generic_make_request() provided each bio is submitted before allocating
the next, and provided the bios are freed after they complete.

Note that this doesn't do anything for allocation from other mempools.
Instead of allocating per bio data structures from a mempool, code
should use bio_set's front_pad.

Tested it by forcing the rescue codepath to be taken (by disabling the
first GFP_NOWAIT) attempt, and then ran it with bcache (which does a lot
of arbitrary bio splitting) and verified that the rescuer was being
invoked.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Muthukumar Ratty <muthur@gmail.com>
2013-03-23 14:15:26 -07:00
Kent Overstreet
57fb233f07 block: Reorder struct bio_set
This is prep work for the next patch, which embeds a struct bio_list in
struct bio_set.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
2013-03-23 14:15:25 -07:00
Lin Ming
6c95466758 block: add runtime pm helpers
Add runtime pm helper functions:

void blk_pm_runtime_init(struct request_queue *q, struct device *dev)
  - Initialization function for drivers to call.

int blk_pre_runtime_suspend(struct request_queue *q)
  - If any requests are in the queue, mark last busy and return -EBUSY.
    Otherwise set q->rpm_status to RPM_SUSPENDING and return 0.

void blk_post_runtime_suspend(struct request_queue *q, int err)
  - If the suspend succeeded then set q->rpm_status to RPM_SUSPENDED.
    Otherwise set it to RPM_ACTIVE and mark last busy.

void blk_pre_runtime_resume(struct request_queue *q)
  - Set q->rpm_status to RPM_RESUMING.

void blk_post_runtime_resume(struct request_queue *q, int err)
  - If the resume succeeded then set q->rpm_status to RPM_ACTIVE
    and call __blk_run_queue, then mark last busy and autosuspend.
    Otherwise set q->rpm_status to RPM_SUSPENDED.

The idea and API is designed by Alan Stern and described here:
http://marc.info/?l=linux-scsi&m=133727953625963&w=2

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 22:22:15 -06:00
Lin Ming
6631127469 block: add a flag to identify PM request
Add a flag REQ_PM to identify the request is PM related, such requests
will not change the device request queue's runtime status. It is
intended to be used in driver's runtime PM callback, so that driver can
perform some IO to the device there with the queue's runtime status
unaffected. e.g. in SCSI disk's runtime suspend callback, the disk will
be put into stopped power state, and this require sending a command to
the device. Such command processing should not change the disk's runtime
status.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 22:22:15 -06:00
Linus Torvalds
5da273fe3f Merge git://git.infradead.org/users/willy/linux-nvme
Pull NVMe driver update from Matthew Wilcox:
 "These patches have mostly been baking for a few months; sorry I didn't
  get them in during the merge window.  They're all bug fixes, except
  for the addition of the SMART log and the addition to MAINTAINERS."

* git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Add namespaces with no LBA range feature
  MAINTAINERS: Add entry for the NVMe driver
  NVMe: Initialize iod nents to 0
  NVMe: Define SMART log
  NVMe: Add result to nvme_get_features
  NVMe: Set result from user admin command
  NVMe: End queued bio requests when freeing queue
  NVMe: Free cmdid on nvme_submit_bio error
2013-03-22 16:43:53 -07:00
Linus Torvalds
14629ed314 Merge branch 'akpm' (fixes from Andrew)
Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mqueue: sys_mq_open: do not call mnt_drop_write() if read-only
  mm/hotplug: only free wait_table if it's allocated by vmalloc
  dma-debug: update DMA debug API to better handle multiple mappings of a buffer
  dma-debug: fix locking bug in check_unmap()
  drivers/rtc/rtc-at91rm9200.c: use a variable for storing IMR
  drivers/video/ep93xx-fb.c: include <linux/io.h> for devm_ioremap()
  drivers/rtc/rtc-da9052.c: fix for rtc device registration
  mm: zone_end_pfn is too small
  poweroff: change orderly_poweroff() to use schedule_work()
  mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting
  printk: Provide a wake_up_klogd() off-case
  irq_work.h: fix warning when CONFIG_IRQ_WORK=n
2013-03-22 16:41:44 -07:00
Russ Anderson
f9228b204f mm: zone_end_pfn is too small
Booting with 32 TBytes memory hits BUG at mm/page_alloc.c:552! (output
below).

The key hint is "page 4294967296 outside zone".
4294967296 = 0x100000000 (bit 32 is set).

The problem is in include/linux/mmzone.h:

  530 static inline unsigned zone_end_pfn(const struct zone *zone)
  531 {
  532         return zone->zone_start_pfn + zone->spanned_pages;
  533 }

zone_end_pfn is "unsigned" (32 bits).  Changing it to "unsigned long"
(64 bits) fixes the problem.

zone_end_pfn() was added recently in commit 108bcc96ef ("mm: add & use
zone_end_pfn() and zone_spans_pfn()")

Output from the failure.

  No AGP bridge found
  page 4294967296 outside zone [ 4294967296 - 4327469056 ]
  ------------[ cut here ]------------
  kernel BUG at mm/page_alloc.c:552!
  invalid opcode: 0000 [#1] SMP
  Modules linked in:
  CPU 0
  Pid: 0, comm: swapper Not tainted 3.9.0-rc2.dtp+ #10
  RIP: free_one_page+0x382/0x430
  Process swapper (pid: 0, threadinfo ffffffff81942000, task ffffffff81955420)
  Call Trace:
    __free_pages_ok+0x96/0xb0
    __free_pages+0x25/0x50
    __free_pages_bootmem+0x8a/0x8c
    __free_memory_core+0xea/0x131
    free_low_memory_core_early+0x4a/0x98
    free_all_bootmem+0x45/0x47
    mem_init+0x7b/0x14c
    start_kernel+0x216/0x433
    x86_64_start_reservations+0x2a/0x2c
    x86_64_start_kernel+0x144/0x153
  Code: 89 f1 ba 01 00 00 00 31 f6 d3 e2 4c 89 ef e8 66 a4 01 00 e9 2c fe ff ff 0f 0b eb fe 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 eb f3 <0f> 0b eb fe 0f 0b 0f 1f 84 00 00 00 00 00 eb f6 0f 0b eb fe 49

Signed-off-by: Russ Anderson <rja@sgi.com>
Reported-by: George Beshers <gbeshers@sgi.com>
Acked-by: Hedi Berriche <hedi@sgi.com>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22 16:41:20 -07:00
Frederic Weisbecker
dc72c32e1f printk: Provide a wake_up_klogd() off-case
wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
nor printk_sched() are in use and there are actually no waiter on
log_wait waitqueue.  It should be a stub in this case for users like
bust_spinlocks().

Otherwise this results in this warning when CONFIG_PRINTK=n and
CONFIG_IRQ_WORK=n:

	kernel/built-in.o In function `wake_up_klogd':
	(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'

To fix this, provide an off-case for wake_up_klogd() when
CONFIG_PRINTK=n.

There is much more from console_unlock() and other console related code
in printk.c that should be moved under CONFIG_PRINTK.  But for now,
focus on a minimal fix as we passed the merged window already.

[akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: James Hogan <james.hogan@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22 16:41:20 -07:00
James Hogan
fe8d52614b irq_work.h: fix warning when CONFIG_IRQ_WORK=n
A randconfig caught repeated compiler warnings when CONFIG_IRQ_WORK=n
due to the definition of a non-inline static function in
<linux/irq_work.h>:

  include/linux/irq_work.h +40 : warning: 'irq_work_needs_cpu' defined but not used

Make it inline to supress the warning.  This is caused commit
00b4295910 ("irq_work: Don't stop the tick with pending works") merged
in v3.9-rc1.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22 16:41:19 -07:00
Linus Torvalds
b9cb3bf42e Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti:
 "Fix compilation on PPC with !CONFIG_KVM"

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Revert "KVM: allow host header to be included even for !CONFIG_KVM"
2013-03-22 12:57:30 -07:00
Linus Torvalds
8f46c507a2 USB fixes for 3.9-rc3
Here are a number of USB fixes that resolve issues that have been reported
 against 3.9-rc3.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlFMnR0ACgkQMUfUDdst+yliJwCeNOJ0bz7Crzs3NepnRWHaHJ0Z
 OmQAnRvQMX1OpqDCsjnc7RW2gL/2TDNd
 =05gg
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
 "Here are a number of USB fixes that resolve issues that have been
  reported against 3.9-rc3."

* tag 'usb-3.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (37 commits)
  USB: ti_usb_3410_5052: fix use-after-free in TIOCMIWAIT
  USB: ssu100: fix use-after-free in TIOCMIWAIT
  USB: spcp8x5: fix use-after-free in TIOCMIWAIT
  USB: quatech2: fix use-after-free in TIOCMIWAIT
  USB: pl2303: fix use-after-free in TIOCMIWAIT
  USB: oti6858: fix use-after-free in TIOCMIWAIT
  USB: mos7840: fix use-after-free in TIOCMIWAIT
  USB: mos7840: fix broken TIOCMIWAIT
  USB: mct_u232: fix use-after-free in TIOCMIWAIT
  USB: io_ti: fix use-after-free in TIOCMIWAIT
  USB: io_edgeport: fix use-after-free in TIOCMIWAIT
  USB: ftdi_sio: fix use-after-free in TIOCMIWAIT
  USB: f81232: fix use-after-free in TIOCMIWAIT
  USB: cypress_m8: fix use-after-free in TIOCMIWAIT
  USB: ch341: fix use-after-free in TIOCMIWAIT
  USB: ark3116: fix use-after-free in TIOCMIWAIT
  USB: serial: add modem-status-change wait queue
  USB: serial: fix interface refcounting
  USB: io_ti: fix get_icount for two port adapters
  USB: garmin_gps: fix memory leak on disconnect
  ...
2013-03-22 12:45:55 -07:00
Linus Torvalds
1e0695cbc8 A fix from Mauro to correct csrow size accounting in sysfs and a sparse
fix from Stephen Hemminger.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRTC/tAAoJEBLB8Bhh3lVKjGAP/jL6utGeRhctJLrdDqfV9zW0
 CD9U3bsTIlVuuKi54VgIOsgPjXx+g3CiIWSfaeNMidrs+WeG8ly/wHblLoWv5EoY
 37y2t3uHtWlGDTU5Xg1xUCr/J6xAyiza86bPuw61pQBWeCcv1vxu3hoNRwVN1ZTq
 lGCy3PtrZ17EuX1mWrwJEmDZvofNm+Tm8jdKSNU1efITrpfnHhSH3dm/SJF9I1of
 DjP3iHRAVuj+w8uQpKwrPOVIHIOGdjF89pDJwIz+aEHU9ioPgw4dG3230X+RrXpS
 h7Xr/qdH5x5e0h5GpFnDZUWSGYYQPjL9gAwI2RgxjUvylxGvXLbdlBhBuKMyWQwk
 tPApVo7EzoIQbaLkjPE0sJ7vVpiZfWGWSKV6edVZtbuP5ns8ikiwNnWpkyZFiRWT
 jcG1Fgfi76iBbJndz5wVVNHHbsJRU8l7GVb4r+KKXLhsrdr7l+LAT6lfZWFuR/at
 IojcJlOn3vwtjNuOan6YrBNLTFOMZp2Pi98Tn7KhxshI3SGfqFZ0rtbyW/GnWwnQ
 bd1gTJLXyGg39Xo/5tRH/dQ378MDCFmT5a+/Z8OX9VYDKWIJ4Tsv1a0OusjjkfSa
 gOtNUYTGhqgu4InB8ImKDAbf6b1bkSSkAKVnsHZ7PepsuS2xJSEmvF+/IYsXHJVi
 h0us26JcgJC3woZeQBBP
 =XVNg
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fixes from Borislav Petkov:
 "A fix from Mauro to correct csrow size accounting in sysfs and a
  sparse fix from Stephen Hemminger."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC: Merge mci.mem_is_per_rank with mci.csbased
  amd64_edac: Correct DIMM sizes
  EDAC: Make sysfs functions static
2013-03-22 12:44:22 -07:00
Marcelo Tosatti
09a6e1f4ad Revert "KVM: allow host header to be included even for !CONFIG_KVM"
This reverts commit f445f11eb2 as
it breaks PPC with CONFIG_KVM=n.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-22 08:08:06 -03:00
Johan Hovold
e5b33dc9d1 USB: serial: add modem-status-change wait queue
Add modem-status-change wait queue to struct usb_serial_port that
subdrivers can use to implement TIOCMIWAIT.

Currently subdrivers use a private wait queue which may have been
released when waking up after device disconnected.

Note that we're adding a new wait queue rather than reusing the tty-port
one as we do not want to get woken up at hangup (yet).

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-21 15:59:02 -07:00
Greg Kroah-Hartman
d93acbcacd usb: fixes for v3.9-rc4
udc-core learned that it shouldn't use invalid pointers
 when unloading a gadget driver.
 
 net2272 and net2280 got a fix for a regression caused by
 the udc_start/udc_stop conversion.
 
 We're defining a static inline no-op for otg_ulpi_create()
 to prevent build errors when that driver isn't enabled.
 
 FunctionFS got a fix for an off-by-one error when binding
 and unbinding instances of FunctionFS.
 
 MUSB learned that it shouldn't try to unmap buffers which
 weren't previously mapped.
 
 f_rndis got a fix for a possible NULL pointer dereference
 in a debugging message code.
 
 MUSB's DA8xx glue layer got a build fix due to a typo.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRSu6dAAoJEIaOsuA1yqREdA8P/R4iKKZ3H23xfSJirQaEYADY
 ejeCefHeev+QFXcRVtkhTMGV1Kt6ZLAJ19wrTT3xdENjUbBZdx+9GzN/bUsr5yAO
 SjBFSzVkbbELn4es+hbUPIJgGDa+/DOs9nLgDLlzdaaWbbyvpsl5cbokGAYFgstn
 xy/36e5wctm066cZzG78+cewtuvKxXHANhZt7tTNLWfu/ARaBfiDoYH+fhobJwzq
 poZ47hZPxPE5nn9ppB16jtvpAFAXT8AQZg4SGA2yIRKXkExNNCOUh2xxIHnmWdI2
 k3Qp0YUlCUsoCifFu6k0vhJBxctbi9AVTnnBJXWokw4tX4bVt9uglDkXXZdOfQ/y
 vA3j+lY9rSjuBixmNjVlvwP77qyR85ILFF9WDhwsGrSNJyeUyV/Fmy2s7sRZFLjL
 X6ziN2Tj/3gC6uaO5Rbgmw7aURy8UyML/byBVq/uRMTu0NJPGxnC5xM2WPmGl9nz
 dWP2mcd193rxys8GYH7G6zz4MJ3WFyPgJ0VszsT/kGI/rL0ij1xTuKnw7zsP38Mk
 RINfeBcY2Msi0Mt4KnXJ8vgSIyI3XOwVCebKCAIdc4rktVAryKCJTjtXMlHD4niI
 rTztkXPkM4bcEiqa29FIhGkiYsQfQ66N8a9RbH6nWZSvhbgs1p8Y721FYfTKt+54
 oTvm35fUgthYTtpig5bN
 =wGIr
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

	usb: fixes for v3.9-rc4

	udc-core learned that it shouldn't use invalid pointers
	when unloading a gadget driver.

	net2272 and net2280 got a fix for a regression caused by
	the udc_start/udc_stop conversion.

	We're defining a static inline no-op for otg_ulpi_create()
	to prevent build errors when that driver isn't enabled.

	FunctionFS got a fix for an off-by-one error when binding
	and unbinding instances of FunctionFS.

	MUSB learned that it shouldn't try to unmap buffers which
	weren't previously mapped.

	f_rndis got a fix for a possible NULL pointer dereference
	in a debugging message code.

	MUSB's DA8xx glue layer got a build fix due to a typo.
2013-03-21 08:40:22 -07:00
Linus Torvalds
cd82346934 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A fair chunk of the linecount comes from a fix for a tracing bug that
  corrupts latency tracing buffers when the overwrite mode is changed on
  the fly - the rest is mostly assorted fewliner fixlets."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event
  kprobes/x86: Check Interrupt Flag modifier when registering probe
  kprobes: Make hash_64() as always inlined
  perf: Generate EXIT event only once per task context
  perf: Reset hwc->last_period on sw clock events
  tracing: Prevent buffer overwrite disabled for latency tracers
  tracing: Keep overwrite in sync between regular and snapshot buffers
  tracing: Protect tracer flags with trace_types_lock
  perf tools: Fix LIBNUMA build with glibc 2.12 and older.
  tracing: Fix free of probe entry by calling call_rcu_sched()
  perf/POWER7: Create a sysfs format entry for Power7 events
  perf probe: Fix segfault
  libtraceevent: Remove hard coded include to /usr/local/include in Makefile
  perf record: Fix -C option
  perf tools: check if -DFORTIFY_SOURCE=2 is allowed
  perf report: Fix build with NO_NEWT=1
  perf annotate: Fix build with NO_NEWT=1
  tracing: Fix race in snapshot swapping
2013-03-21 08:29:11 -07:00
Masatake YAMATO
73214f5d9f thermal: shorten too long mcast group name
The original name is too long.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 17:56:58 -04:00
Tom Parkin
44046a593e udp: add encap_destroy callback
Users of udp encapsulation currently have an encap_rcv callback which they can
use to hook into the udp receive path.

In situations where a encapsulation user allocates resources associated with a
udp encap socket, it may be convenient to be able to also hook the proto
.destroy operation.  For example, if an encap user holds a reference to the
udp socket, the destroy hook might be used to relinquish this reference.

This patch adds a socket destroy hook into udp, which is set and enabled
in the same way as the existing encap_rcv hook.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Fabio Estevam
7fa4cd1a78 usb: ulpi: Define a *otg_ulpi_create no-op
Building a kernel for imx_v4_v5_defconfig with CONFIG_USB_ULPI disabled, results
in the following error:

arch/arm/mach-imx/built-in.o: In function 'pca100_init':
platform-mx2-emma.c:(.init.text+0x6788): undefined reference to 'otg_ulpi_create'
platform-mx2-emma.c:(.init.text+0x682c): undefined reference to 'mxc_ulpi_access_ops'

Fix this by providing a no-op definition of *otg_ulpi_create for the case when
CONFIG_USB_ULPI is not defined.

Acked-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2013-03-20 17:30:40 +02:00
Linus Torvalds
ea4a0ce111 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
  KVM: x86: fix deadlock in clock-in-progress request handling
  KVM: allow host header to be included even for !CONFIG_KVM
2013-03-19 18:24:12 -07:00
Tejun Heo
14a40ffccd sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY
PF_THREAD_BOUND was originally used to mark kernel threads which were
bound to a specific CPU using kthread_bind() and a task with the flag
set allows cpus_allowed modifications only to itself.  Workqueue is
currently abusing it to prevent userland from meddling with
cpus_allowed of workqueue workers.

What we need is a flag to prevent userland from messing with
cpus_allowed of certain kernel tasks.  In kernel, anyone can
(incorrectly) squash the flag, and, for worker-type usages,
restricting cpus_allowed modification to the task itself doesn't
provide meaningful extra proection as other tasks can inject work
items to the task anyway.

This patch replaces PF_THREAD_BOUND with PF_NO_SETAFFINITY.
sched_setaffinity() checks the flag and return -EINVAL if set.
set_cpus_allowed_ptr() is no longer affected by the flag.

This will allow simplifying workqueue worker CPU affinity management.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-03-19 13:45:20 -07:00
Linus Torvalds
7b1b3fd74e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang.

 2) Insufficient space reserved for bridge netlink values, fix from
    Stephen Hemminger.

 3) Some dst_neigh_lookup*() callers don't interpret error pointer
    correctly, fix from Zhouyi Zhou.

 4) Fix transport match in SCTP active_path loops, from Xugeng Zhang.

 5) Fix qeth driver handling of multi-order SKB frags, from Frank
    Blaschka.

 6) fec driver is missing napi_disable() call, resulting in crashes on
    unload, from Georg Hofmann.

 7) Don't try to handle PMTU events on a listening socket, fix from Eric
    Dumazet.

 8) Fix timestamp location calculations in IP option processing, from
    David Ward.

 9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig
    tests, from Denis V Lunev.

10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings.

11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd
    Bergmann.

12) bnx2x statistics don't handle 4GB rollover correctly, fix from
    Maciej Żenczykowski.

13) Openvswitch bug fixes for vport del/new error reporting, missing
    genlmsg_end() call in netlink processing, and mis-parsing of
    LLC/SNAP ethernet types.  From Rich Lane.

14) SKB pfmemalloc state should only be propagated from the head page of
    a compound page, fix from Pavel Emelyanov.

15) Fix link handling in tg3 driver for 5715 chips when autonegotation
    is disabled.  From Nithin Sujir.

16) Fix inverted test of cpdma_check_free_tx_desc return value in
    davinci_emac driver, from Mugunthan V N.

17) vlan_depth is incorrectly calculated in skb_network_protocol(), from
    Li RongQing.

18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM
    device mode backwards compat in cdc_ncm driver.  From Bjørn Mork.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  inet: limit length of fragment queue hash table bucket lists
  qeth: Fix scatter-gather regression
  qeth: Fix invalid router settings handling
  qeth: delay feature trace
  tcp: dont handle MTU reduction on LISTEN socket
  bnx2x: fix occasional statistics off-by-4GB error
  vhost/net: fix heads usage of ubuf_info
  bridge: Add support for setting BR_ROOT_BLOCK flag.
  bnx2x: add missing napi deletion in error path
  drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc()
  ethernet/tulip: DE4x5 needs VIRT_TO_BUS
  isdn: hisax: netjet requires VIRT_TO_BUS
  net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
  rtnetlink: Mask the rta_type when range checking
  Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally"
  Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
  smsc75xx: configuration help incorrectly mentions smsc95xx
  net: fec: fix missing napi_disable call
  net: fec: restart the FEC when PHY speed changes
  skb: Propagate pfmemalloc on skb from head page only
  ...
2013-03-19 13:20:51 -07:00