Commit Graph

47796 Commits

Author SHA1 Message Date
Tomer Tayar
76a9a3642a qed: fix handling of concurrent ramrods.
Concurrent non-blocking slowpath ramrods can be completed
out-of-order on the completion chain. Recycling completed elements,
while previously sent elements are still completion pending,
can lead to overriding of active elements on the chain. Furthermore,
sending pending slowpath ramrods currently lacks the update of the
chain element physical pointer.

This patch:
* Ensures that ramrods are sent to the FW with
  consecutive echo values.
* Handles out-of-order completions by freeing only first
  successive completed entries.
* Updates the chain element physical pointer when copying
  a pending element into a free element for sending.

Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 14:14:03 -05:00
Tomer Tayar
4639d60d2b qed: Fix corner case for chain in-between pages
The amount of chain next pointer elements between the producer
and the consumer indices depends on which pages they currently
point to. The current calculation is based only on their difference,
and it can lead to a number of free elements which is higher by 1
than the actual value.

Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 14:14:03 -05:00
Matias Bjørling
16f26c3aa9 lightnvm: replace req queue with nvmdev for lld
In the case where a request queue is passed to the low lever lightnvm
device drive integration, the device driver might pass its admin
commands through another queue. Instead pass nvm_dev, and let the
low level drive the appropriate queue.

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Matias Bjørling
57b4bd06ff lightnvm: comments on constants
It is not obvious what NVM_IO_* and NVM_BLK_T_* are used for. Make sure
to comment them appropriately as the other constants.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Andreas Werner
ea013a9b20 libata-eh.c: Introduce new ata port flag for controller which lockup on read log page
Some controller lockup on a ata_read_log_page.
Add new ata port flag ATA_FLAG_NO_LOG_PAGE which can used
to blacklist a controller.

If this flag is set, any attempt to read a log page returns an error
without actually issuing the command.

Signed-off-by: Andreas Werner <andreas.werner@men.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
2015-12-07 10:25:57 -05:00
Tejun Heo
0b98f0c042 Merge branch 'master' into for-4.4-fixes
The following commit which went into mainline through networking tree

  3b13758f51 ("cgroups: Allow dynamically changing net_classid")

conflicts in net/core/netclassid_cgroup.c with the following pending
fix in cgroup/for-4.4-fixes.

  1f7dd3e5a6 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")

The former separates out update_classid() from cgrp_attach() and
updates it to walk all fds of all tasks in the target css so that it
can be used from both migration and config change paths.  The latter
drops @css from cgrp_attach().

Resolve the conflict by making cgrp_attach() call update_classid()
with the css from the first task.  We can revive @tset walking in
cgrp_attach() but given that net_cls is v1 only where there always is
only one target css during migration, this is fine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Nina Schiff <ninasc@fb.com>
2015-12-07 10:09:03 -05:00
Rainer Weikusat
ea3793ee29 core: enable more fine-grained datagram reception control
The __skb_recv_datagram routine in core/ datagram.c provides a general
skb reception factility supposed to be utilized by protocol modules
providing datagram sockets. It encompasses both the actual recvmsg code
and a surrounding 'sleep until data is available' loop. This is
inconvenient if a protocol module has to use additional locking in order
to maintain some per-socket state the generic datagram socket code is
unaware of (as the af_unix code does). The patch below moves the recvmsg
proper code into a new __skb_try_recv_datagram routine which doesn't
sleep and renames wait_for_more_packets to
__skb_wait_for_more_packets, both routines being exported interfaces. The
original __skb_recv_datagram routine is reimplemented on top of these
two functions such that its user-visible behaviour remains unchanged.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 23:31:54 -05:00
Moni Shoua
5f61385d2e net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode
Due to HW limitations, indexes to MAC and VLAN tables are always taken
from the table of the actual port. So, if a resource holds an index to
a table, it may refer to different values during the lifetime of the
resource,  unless the tables are mirrored. Also, even when
driver is not in HA mode the policy of allocating an index to these
tables is such to make sure, as much as possible, that when the time
comes the mirroring will be successful. This means that in multifunction
mode the allocation of a free index in a port's table tries to make sure
that the same index in the other's port table is also free.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:40:45 -05:00
Felix Fietkau
326fcfa5ac net: remove unnecessary semicolon in netdev_alloc_pcpu_stats()
This semicolon causes a build error if the function call is wrapped in
parentheses.

Fixes: aabc92bbe3 ("net: add __netdev_alloc_pcpu_stats() to indicate gfp flags")
Reported-by: Imre Kaloz <kaloz@openwrt.org>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:32:32 -05:00
Linus Torvalds
19190f5ea9 SCSI fixes on 20151205
This is quite a bumper crop of fixes: Three from Arnd correcting various build
 issues in some configurations, a lock recursion in qla2xxx.  Two potentially
 exploitable issues in hpsa and mvsas, a potential null deref in st, A revert
 of a bdi registration fix that turned out to cause even more problems, a set
 of fixes to allow people who only defined MPT2SAS to still work after the
 mpt2/mpt3sas merger and a couple of fixes for issues turned up by the hyper-v
 storvsc driver.
 
 Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJWY8fDAAoJEDeqqVYsXL0MPhwH/Rs05LUU2vnlaZDdDpH5zY56
 YQgKh5duF+ZH+Y4NxX5kkLLo05wpE6xD5xp2yzzmnjTA0Uf/yLVHNdb5D6tRZgSo
 mZjAX+/wGDb/ErwvwTk/K2mhEvB0iZJJVMyWcG3F9dKgciRCF/p1Gn5EarGmc+vM
 w/9xrs1j24Pw7ipHgBj9zU13w+SPMI7LunR0oYL9CJg24jgXG9sAbrwLkox5kHLo
 FFBCrZhev1mzFKa1C+Ln3s0iSf0yEQMd4khzPJAUElkw812PZ7I6r4bCP0ZPKDed
 JR8zex9jo77RyWZwA7fIathA0/ujv0AeIRXgvzb0/io1Yk577r98vt+S3koQVK8=
 =ptgb
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is quite a bumper crop of fixes: three from Arnd correcting
  various build issues in some configurations, a lock recursion in
  qla2xxx.  Two potentially exploitable issues in hpsa and mvsas, a
  potential null deref in st, a revert of a bdi registration fix that
  turned out to cause even more problems, a set of fixes to allow people
  who only defined MPT2SAS to still work after the mpt2/mpt3sas merger
  and a couple of fixes for issues turned up by the hyper-v storvsc
  driver"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  mpt3sas: fix Kconfig dependency problem for mpt2sas back compatibility
  Revert "scsi: Fix a bdi reregistration race"
  mpt3sas: Add dummy Kconfig option for backwards compatibility
  Fix a memory leak in scsi_host_dev_release()
  block/sd: Fix device-imposed transfer length limits
  scsi_debug: fix prevent_allow+verify regressions
  MAINTAINERS: Add myself as co-maintainer of the SCSI subsystem.
  sd: Make discard granularity match logical block size when LBPRZ=1
  scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
  scsi: advansys needs ISA dma api for ISA support
  scsi_sysfs: protect against double execution of __scsi_remove_device()
  st: fix potential null pointer dereference.
  scsi: report 'INQUIRY result too short' once per host
  advansys: fix big-endian builds
  qla2xxx: Fix rwlock recursion
  hpsa: logical vs bitwise AND typo
  mvsas: don't allow negative timeouts
  mpt3sas: Fix use sas_is_tlr_enabled API before enabling MPI2_SCSIIO_CONTROL_TLR_ON flag
2015-12-06 08:02:25 -08:00
Jiri Pirko
b618aaa91b net: constify netif_is_* helpers net_device param
As suggested by Eric, these helpers should have const dev param.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 18:16:27 -05:00
Andrew Lunn
2f8364a291 WAN: HDLC: Call notifiers before and after changing device type
An HDLC device can change type when the protocol driver is changed.
Calling the notifier change allows potential users of the interface
know about this planned change, and even block it. After the change
has occurred, send a second notification to users can evaluate the new
device type etc.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 17:41:42 -05:00
Herbert Xu
3cf92222a3 rhashtable: Prevent spurious EBUSY errors on insertion
Thomas and Phil observed that under stress rhashtable insertion
sometimes failed with EBUSY, even though this error should only
ever been seen when we're under attack and our hash chain length
has grown to an unacceptable level, even after a rehash.

It turns out that the logic for detecting whether there is an
existing rehash is faulty.  In particular, when two threads both
try to grow the same table at the same time, one of them may see
the newly grown table and thus erroneously conclude that it had
been rehashed.  This is what leads to the EBUSY error.

This patch fixes this by remembering the current last table we
used during insertion so that rhashtable_insert_rehash can detect
when another thread has also done a resize/rehash.  When this is
detected we will give up our resize/rehash and simply retry the
insertion with the new table.

Reported-by: Thomas Graf <tgraf@suug.ch>
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04 14:38:26 -05:00
Linus Torvalds
b1007e73ae Power management and ACPI fixes for v4.4-rc4
- Fix a regression in the ACPI PCI host bridge initialization code
    introduced by the recent consolidation of the host bridge handling
    on x86 and ia64 that forgot to take one special piece of code
    related to NUMA on x86 into account (Liu Jiang).
 
  - Improve the Kconfig help description of the new ACPI AML debugger
    support option to avoid possible confusion (Peter Zijlstra).
 
  - Remove a piece of code in the generic power domains framework
    that should have been removed by one of the recent commits
    modifying that code (Ulf Hansson).
 
  - Reduce the log level of a PCI PM message that generates a lot
    of false-positive log noise for some drivers and improve the
    message itself while at it (Imre Deak).
 
  - Fix the OF-based domain lookup code in the generic power domains
    framework to make it drop references to DT nodes correctly (Eric
    Anholt).
 
  - Prevent the cpufreq core from setting the policy back to the
    default after a CPU offline/online cycle for cpufreq drivers
    providing the ->setpolicy callback (Srinivas Pandruvada).
 
  - Fix a build problem for CONFIG_ACPI unset in the device
    properties framework (Hanjun Guo).
 
  - Fix a stale file path in the ACPI backlight driver entry in
    MAINTAINERS (Dan Carpenter).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWYZySAAoJEILEb/54YlRxecoP/jO9QBExSlPi0tgHbDlAfMkK
 VIm1eSvYmLYdidLJdN8ImpzahrT0ifusgHKL4KgRMjZt1lQhqbMUBrZVvqS7ueKu
 FLG56mLkNrkhnBgA8phVBS7piEtulVX7MY7PZN0uw0YomfZvQHnIEMgrl538t8Y4
 jBWYjtT/5Xz54aV6awuSIc66WGi8MocdQVOfhIPAvjg4N0y1HNwCiMqR3/apnrq/
 myMUtax5/WzLrkAmREb/5wVNM86VPekiSGF7yrkuIRqdsyAGCR1q0F+yhBD8SExe
 NMGeoUqgS6Ty9QSMw9fWBWg0HB5P9Qg/PlMfUaf7sXHMlWnuer1O5IZ5h9uT71Sf
 WW7v9NvZJzi9r3JgC860lrl4D98876lQhiN3zHiQtDbc1N2zTVTEQSyZPULYJ1Wt
 HQvhJVfgELdkpEKLQnjN1G84LBySyvdZ0sUjF6caczjxw6gACrWD/kW3uFZl6HzV
 ypF7GSbfWi1WWsEoYg0SPvQ0Is2bD7CUUYxjwktqQgPx3gABeZ8KNt7Cw3Z5YYec
 uEdJjmVG9Uf65Ixl2A+9Yd2xaYYsJ86bZAQADyqu2wtSfg9ws4WJvUrnLyf8wzXN
 ltZvPkxia54ozflOGASiaj2597D/amumlUIbemsL3Utpeq7tuTneJpOHZ8Z1hb1V
 bojkZMdj+jaXvapUSBwG
 =HrEI
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These fix a recent regression in the ACPI PCI host bridge
  initialization code, clean up some recent changes (generic power
  domains framework, ACPI AML debugger support), fix three older but
  annoying bugs (PCI power management.  generic power domains framework,
  cpufreq) and a build problem (device properties framework), and update
  a stale MAINTAINERS entry (ACPI backlight driver).

  Specifics:

   - Fix a regression in the ACPI PCI host bridge initialization code
     introduced by the recent consolidation of the host bridge handling
     on x86 and ia64 that forgot to take one special piece of code
     related to NUMA on x86 into account (Liu Jiang).

   - Improve the Kconfig help description of the new ACPI AML debugger
     support option to avoid possible confusion (Peter Zijlstra).

   - Remove a piece of code in the generic power domains framework that
     should have been removed by one of the recent commits modifying
     that code (Ulf Hansson).

   - Reduce the log level of a PCI PM message that generates a lot of
     false-positive log noise for some drivers and improve the message
     itself while at it (Imre Deak).

   - Fix the OF-based domain lookup code in the generic power domains
     framework to make it drop references to DT nodes correctly (Eric
     Anholt).

   - Prevent the cpufreq core from setting the policy back to the
     default after a CPU offline/online cycle for cpufreq drivers
     providing the ->setpolicy callback (Srinivas Pandruvada).

   - Fix a build problem for CONFIG_ACPI unset in the device properties
     framework (Hanjun Guo).

   - Fix a stale file path in the ACPI backlight driver entry in
     MAINTAINERS (Dan Carpenter)"

* tag 'pm+acpi-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()
  cpufreq: use last policy after online for drivers with ->setpolicy
  PCI / PM: Tune down retryable runtime suspend error messages
  PM / Domains: Validate cases of a non-bound driver in genpd governor
  MAINTAINERS: ACPI / video: update a file name in drivers/acpi/
  ACPI / property: fix compile error for acpi_node_get_property_reference() when CONFIG_ACPI=n
  x86/PCI/ACPI: Fix regression caused by commit 4d6b4e69a2
  ACPI: Better describe ACPI_DEBUGGER
2015-12-04 08:59:10 -08:00
Alex Williamson
ae5515d663 Revert: "vfio: Include No-IOMMU mode"
Revert commit 033291eccb ("vfio: Include No-IOMMU mode") due to lack
of a user.  This was originally intended to fill a need for the DPDK
driver, but uptake has been slow so rather than support an unproven
kernel interface revert it and revisit when userspace catches up.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-12-04 08:38:42 -07:00
Rafael J. Wysocki
d441fe25e7 Merge branches 'pm-domains' and 'pm-cpufreq'
* pm-domains:
  PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()
  PM / Domains: Validate cases of a non-bound driver in genpd governor

* pm-cpufreq:
  cpufreq: use last policy after online for drivers with ->setpolicy
2015-12-04 14:01:42 +01:00
Rafael J. Wysocki
3e5050e60e Merge branches 'acpica', 'acpi-video' and 'device-properties'
* acpica:
  ACPI: Better describe ACPI_DEBUGGER

* acpi-video:
  MAINTAINERS: ACPI / video: update a file name in drivers/acpi/

* device-properties:
  ACPI / property: fix compile error for acpi_node_get_property_reference() when CONFIG_ACPI=n
2015-12-04 14:01:17 +01:00
David S. Miller
f188b951f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/renesas/ravb_main.c
	kernel/bpf/syscall.c
	net/ipv4/ipmr.c

All three conflicts were cases of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 21:09:12 -05:00
Linus Torvalds
071f5d105a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A lot of Thanksgiving turkey leftovers accumulated, here goes:

   1) Fix bluetooth l2cap_chan object leak, from Johan Hedberg.

   2) IDs for some new iwlwifi chips, from Oren Givon.

   3) Fix rtlwifi lockups on boot, from Larry Finger.

   4) Fix memory leak in fm10k, from Stephen Hemminger.

   5) We have a route leak in the ipv6 tunnel infrastructure, fix from
      Paolo Abeni.

   6) Fix buffer pointer handling in arm64 bpf JIT,f rom Zi Shen Lim.

   7) Wrong lockdep annotations in tcp md5 support, fix from Eric
      Dumazet.

   8) Work around some middle boxes which prevent proper handling of TCP
      Fast Open, from Yuchung Cheng.

   9) TCP repair can do huge kmalloc() requests, build paged SKBs
      instead.  From Eric Dumazet.

  10) Fix msg_controllen overflow in scm_detach_fds, from Daniel
      Borkmann.

  11) Fix device leaks on ipmr table destruction in ipv4 and ipv6, from
      Nikolay Aleksandrov.

  12) Fix use after free in epoll with AF_UNIX sockets, from Rainer
      Weikusat.

  13) Fix double free in VRF code, from Nikolay Aleksandrov.

  14) Fix skb leaks on socket receive queue in tipc, from Ying Xue.

  15) Fix ifup/ifdown crach in xgene driver, from Iyappan Subramanian.

  16) Fix clearing of persistent array maps in bpf, from Daniel
      Borkmann.

  17) In TCP, for the cross-SYN case, we don't initialize tp->copied_seq
      early enough.  From Eric Dumazet.

  18) Fix out of bounds accesses in bpf array implementation when
      updating elements, from Daniel Borkmann.

  19) Fill gaps in RCU protection of np->opt in ipv6 stack, from Eric
      Dumazet.

  20) When dumping proxy neigh entries, we have to accomodate NULL
      device pointers properly, from Konstantin Khlebnikov.

  21) SCTP doesn't release all ipv6 socket resources properly, fix from
      Eric Dumazet.

  22) Prevent underflows of sch->q.qlen for multiqueue packet
      schedulers, also from Eric Dumazet.

  23) Fix MAC and unicast list handling in bnxt_en driver, from Jeffrey
      Huang and Michael Chan.

  24) Don't actively scan radar channels, from Antonio Quartulli"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (110 commits)
  net: phy: reset only targeted phy
  bnxt_en: Setup uc_list mac filters after resetting the chip.
  bnxt_en: enforce proper storing of MAC address
  bnxt_en: Fixed incorrect implementation of ndo_set_mac_address
  net: lpc_eth: remove irq > NR_IRQS check from probe()
  net_sched: fix qdisc_tree_decrease_qlen() races
  openvswitch: fix hangup on vxlan/gre/geneve device deletion
  ipv4: igmp: Allow removing groups from a removed interface
  ipv6: sctp: implement sctp_v6_destroy_sock()
  arm64: bpf: add 'store immediate' instruction
  ipv6: kill sk_dst_lock
  ipv6: sctp: add rcu protection around np->opt
  net/neighbour: fix crash at dumping device-agnostic proxy entries
  sctp: use GFP_USER for user-controlled kmalloc
  sctp: convert sack_needed and sack_generation to bits
  ipv6: add complete rcu protection around np->opt
  bpf: fix allocation warnings in bpf maps and integer overflow
  mvebu: dts: enable IP checksum with jumbo frames for Armada 38x on Port0
  net: mvneta: enable setting custom TX IP checksum limit
  net: mvneta: fix error path for building skb
  ...
2015-12-03 16:02:46 -08:00
Linus Torvalds
2873d32ff4 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A collection of fixes from this series.  The most important here is a
  regression fix for an issue that some folks would hit in blk-merge.c,
  and the NVMe queue depth limit for the screwed up Apple "nvme"
  controller.

  In more detail, this pull request contains:

   - a set of fixes for null_blk, including a fix for a few corner cases
     where we could hang the device.  From Arianna and Paolo.

   - lightnvm:
        - A build improvement from Keith.
        - Update the qemu pci id detection from Matias.
        - Error handling fixes for leaks and other little fixes from
          Sudip and Wenwei.

   - fix from Eric where BLKRRPART would not return EBUSY for whole
     device mounts, only when partitions were mounted.

   - fix from Jan Kara, where EOF O_DIRECT reads would return
     negatively.

   - remove check for rq_mergeable() when checking limits for cloned
     requests.  The check doesn't make any sense.  It's assuming that
     since NOMERGE is set on the request that we don't have to
     recalculate limits since the request didn't change, but that's not
     true if the request has been redirected.  From Hannes.

   - correctly get the bio front segment value set for single segment
     bio's, fixing a BUG() in blk-merge.  From Ming"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme: temporary fix for Apple controller reset
  null_blk: change type of completion_nsec to unsigned long
  null_blk: guarantee device restart in all irq modes
  null_blk: set a separate timer for each command
  blk-merge: fix computing bio->bi_seg_front_size in case of single segment
  direct-io: Fix negative return from dio read beyond eof
  block: Always check queue limits for cloned requests
  lightnvm: missing nvm_lock acquire
  lightnvm: unconverted ppa returned in get_bb_tbl
  lightnvm: refactor and change vendor id for qemu
  lightnvm: do device max sectors boundary check first
  lightnvm: fix ioctl memory leaks
  lightnvm: free memory when gennvm register fails
  lightnvm: Simplify config when disabled
  Return EBUSY from BLKRRPART for mounted whole-dev fs
2015-12-03 15:45:16 -08:00
Asias He
80a19e338d VSOCK: Introduce virtio-vsock-common.ko
This module contains the common code and header files for the following
virtio-vsock and virtio-vhost kernel modules.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:05:55 -05:00
Jakub Kicinski
2d1e0254ef pci_ids: add Netronome Systems vendor
Add PCI vendor id for Netronome Systems.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:17:51 -05:00
James Bottomley
be9e2f775f Merge branch 'mkp-fixes' into fixes 2015-12-03 09:32:33 -08:00
Saeed Mahameed
d6666753c6 net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context
E-Switch vport context is unlike NIC vport context, managed by the
E-Switch manager or vport_group_manager and not by the NIC(VF) driver.

The E-Switch manager can access (read/modify) any of its vports
E-Switch context.

Currently E-Switch vport context includes only clietnt and server
vlan insertion and striping data (for later support of VST mode).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed
81848731ff net/mlx5: E-Switch, Add SR-IOV (FDB) support
Enabling E-Switch SRIOV for nvfs+1 vports.

Create E-Switch FDB for L2 UC/MC mac steering between VFs/PF and
external vport (Uplink).

FDB contains forwarding rules such as:
	UC MAC0 -> vport0(PF).
	UC MAC1 -> vport1.
	UC MAC2 -> vport2.
	MC MACX -> vport0, vport2, Uplink.
	MC MACY -> vport1, Uplink.

For unmatched traffic FDB has the following default rules:
	Unmached Traffic (src vport != Uplink) -> Uplink.
	Unmached Traffic (src vport == Uplink) -> vport0(PF).

FDB rules population:
Each NIC vport (VF) will notify E-Switch manager of its UC/MC vport
context changes via modify vport context command, which will be
translated to an event that will be handled by E-Switch manager (PF)
which will update FDB table accordingly.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed
495716b191 net/mlx5: E-Switch, Introduce FDB hardware capabilities
Define needed hardware structures and capabilities needed
for E-Switch FDB flow tables and read them on driver load.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed
073bb189a4 net/mlx5: Introducing E-Switch and l2 table
E-Switch is the software entity that represents and manages ConnectX4
inter-HCA ethernet l2 switching.

E-Switch has its own Virtual Ports, each Vport/vNIC/VF can be
connected to the device through a vport of an e-switch.

Each e-switch is managed by one vNIC identified by
HCA_CAP.vport_group_manager (usually it is the PF/vport[0]),
and its main responsibility is to forward each packet to the
right vport.

e-Switch needs to manage its own l2-table and FDB tables.

L2 table is a flow table that is managed by FW, it is needed for
Multi-host (Multi PF) configuration for inter HCA switching between
PFs.

FDB table is a flow table that is totally managed by e-Switch driver,
its main responsibility is to switch packets between e-Swtich internal
vports and uplink vport that belong to the same.

This patch introduces only e-Swtich l2 table management, FDB managemnt
will come later when ethernet SRIOV/VFs will be enabled.

preperation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:46 -05:00
Saeed Mahameed
c0046cf7b8 net/mlx5: Introduce access functions to modify/query vport vlans
Those functions are needed to notify the upcoming L2 table and SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) vlan table changes.

preperation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed
d82b73186d net/mlx5: Introduce access functions to modify/query vport promisc mode
Those functions are needed to notify the upcoming SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) promisc mode changes.

Preperation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:45 -05:00
Saeed Mahameed
e75465148b net/mlx5: Introduce access functions to modify/query vport state
In preparation for SR-IOV we add here an API to enable each e-switch
manager (PF) to configure its VFs link states in e-switch

preparation for ethernet sriov.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Saeed Mahameed
e16aea2744 net/mlx5: Introduce access functions to modify/query vport mac lists
Those functions are needed to notify the upcoming L2 table and SR-IOV
E-Switch(FDB) manager(PF), of the NIC vport (vf) UC/MC mac lists
changes.

preperation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Saeed Mahameed
e1d7d349c6 net/mlx5: Update access functions to Query/Modify vport MAC address
In preparation for SR-IOV we add here an API to enable each e-switch
client (PF/VF) to configure its L2 MAC addresses and for the e-switch
manager (usually the PF) to access them in order to be able to
configure them into the e-switch.
Therefore we now pass vport num parameter to
mlx5_query_nic_vport_context, so PF can access other vports contexts.

preperation for ethernet sriov and l2 table management.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Saeed Mahameed
54f0a411ec net/mlx5: Add HW capabilities and structs for SR-IOV E-Switch
Update HCA capabilities and HW struct to include needed
capabilities for upcoming Ethernet Switch (SR-IOV E-Switch).

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:44 -05:00
Eli Cohen
fc50db98ff net/mlx5_core: Add base sriov support
This patch adds SRIOV base support for mlx5 supported devices. The same
driver is used for both PFs and VFs; VFs are identified by the driver
through the flag MLX5_PCI_DEV_IS_VF added to the pci table entries.
Virtual functions are created as usual through writing a value to the
sriov_numvs sysfs file of the PF device. Upon instantiating VFs, they will
all be probed by the driver on the hypervisor. One can gracefully unbind
them through /sys/bus/pci/drivers/mlx5_core/unbind.

mlx5_wait_for_vf_pages() was added to ensure that when a VF dies without
executing proper teardown, the hypervisor driver waits till all of the
pages that were allocated at the hypervisor to maintain its operation
are returned.

In order for the VF to be operational, the PF needs to call enable_hca
for it. This can be done before the VFs are created through a call to
pci_enable_sriov.

If the there are VFs assigned to a VMs when the driver of the PF is
unloaded, all the VF will experience system error and PF driver unloads
cleanly; in this case pci_disable_sriov is not called and the devices
will show when running lspci. Once the PF driver is reloaded, it will
sync its data structures which maintain state on its VFs.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:08:43 -05:00
Jiri Pirko
fb1b2e3ce5 net: introduce lower state changed info structure for LAG lowers
This is shared info structure for bonding and team. Serves to pass down
info about link state and port activity to notification listeners.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:26 -05:00
Jiri Pirko
04d482660a net: introduce change lower state notifier
When lower device like bonding slave, team/bridge port, etc changes its
state, it is useful for others to notice this change. Currently this is
implemented specificly for bonding as NETDEV_BONDING_INFO notifier. This
patch aims to replace this specific usage and make this more generic to
be used for all upper-lower devices.

Introduce NETDEV_CHANGELOWERSTATE netdev notifier type and
netdev_lower_state_changed() helper.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:26 -05:00
Jiri Pirko
8fd728566a team: fill-up LAG changeupper info struct and pass it along
Initialize netdev_lag_upper_info structure by TX type according to
current team mode and pass it along via netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:26 -05:00
Jiri Pirko
764f5e5441 net: add info struct for LAG changeupper
This struct will be shared by bonding and team to pass internal
information to notifier listeners.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:25 -05:00
Jiri Pirko
29bf24afb2 net: add possibility to pass information about upper device via notifier
Sometimes the drivers and other code would find it handy to know some
internal information about upper device being changed. So allow upper-code
to pass information down to notifier listeners during linking.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:25 -05:00
Jiri Pirko
6dffb0447c net: propagate upper priv via netdev_master_upper_dev_link
Eliminate netdev_master_upper_dev_link_private and pass priv directly as
a parameter of netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:25 -05:00
Jiri Pirko
e0ba1414f3 net: add netif_is_lag_port helper
Some code does not mind if a device is bond slave or team port and treats
them the same, as generic LAG ports.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:24 -05:00
Jiri Pirko
7be6183304 net: add netif_is_lag_master helper
Some code does not mind if the master is bond or team and treats them
the same, as generic LAG.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:24 -05:00
Jiri Pirko
f7f019ee6d net: add netif_is_team_port helper
Similar to other helpers, caller can use this to find out if device is
team port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:24 -05:00
Jiri Pirko
c981e4213e net: add netif_is_team_master helper
Similar to other helpers, caller can use this to find out if device is
team master.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:24 -05:00
Tejun Heo
1f7dd3e5a6 cgroup: fix handling of multi-destination migration from subtree_control enabling
Consider the following v2 hierarchy.

  P0 (+memory) --- P1 (-memory) --- A
                                 \- B
       
P0 has memory enabled in its subtree_control while P1 doesn't.  If
both A and B contain processes, they would belong to the memory css of
P1.  Now if memory is enabled on P1's subtree_control, memory csses
should be created on both A and B and A's processes should be moved to
the former and B's processes the latter.  IOW, enabling controllers
can cause atomic migrations into different csses.

The core cgroup migration logic has been updated accordingly but the
controller migration methods haven't and still assume that all tasks
migrate to a single target css; furthermore, the methods were fed the
css in which subtree_control was updated which is the parent of the
target csses.  pids controller depends on the migration methods to
move charges and this made the controller attribute charges to the
wrong csses often triggering the following warning by driving a
counter negative.

 WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
 Modules linked in:
 CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #29
 ...
  ffffffff81f65382 ffff88007c043b90 ffffffff81551ffc 0000000000000000
  ffff88007c043bc8 ffffffff810de202 ffff88007a752000 ffff88007a29ab00
  ffff88007c043c80 ffff88007a1d8400 0000000000000001 ffff88007c043bd8
 Call Trace:
  [<ffffffff81551ffc>] dump_stack+0x4e/0x82
  [<ffffffff810de202>] warn_slowpath_common+0x82/0xc0
  [<ffffffff810de2fa>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8118e031>] pids_cancel.constprop.6+0x31/0x40
  [<ffffffff8118e0fd>] pids_can_attach+0x6d/0xf0
  [<ffffffff81188a4c>] cgroup_taskset_migrate+0x6c/0x330
  [<ffffffff81188e05>] cgroup_migrate+0xf5/0x190
  [<ffffffff81189016>] cgroup_attach_task+0x176/0x200
  [<ffffffff8118949d>] __cgroup_procs_write+0x2ad/0x460
  [<ffffffff81189684>] cgroup_procs_write+0x14/0x20
  [<ffffffff811854e5>] cgroup_file_write+0x35/0x1c0
  [<ffffffff812e26f1>] kernfs_fop_write+0x141/0x190
  [<ffffffff81265f88>] __vfs_write+0x28/0xe0
  [<ffffffff812666fc>] vfs_write+0xac/0x1a0
  [<ffffffff81267019>] SyS_write+0x49/0xb0
  [<ffffffff81bcef32>] entry_SYSCALL_64_fastpath+0x12/0x76

This patch fixes the bug by removing @css parameter from the three
migration methods, ->can_attach, ->cancel_attach() and ->attach() and
updating cgroup_taskset iteration helpers also return the destination
css in addition to the task being migrated.  All controllers are
updated accordingly.

* Controllers which don't care whether there are one or multiple
  target csses can be converted trivially.  cpu, io, freezer, perf,
  netclassid and netprio fall in this category.

* cpuset's current implementation assumes that there's single source
  and destination and thus doesn't support v2 hierarchy already.  The
  only change made by this patchset is how that single destination css
  is obtained.

* memory migration path already doesn't do anything on v2.  How the
  single destination css is obtained is updated and the prep stage of
  mem_cgroup_can_attach() is reordered to accomodate the change.

* pids is the only controller which was affected by this bug.  It now
  correctly handles multi-destination migrations and no longer causes
  counter underflow from incorrect accounting.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
2015-12-03 10:18:21 -05:00
KY Srinivasan
c0eb454034 hv_netvsc: Don't ask for additional head room in the skb
The rndis header is 116 bytes big and can be placed in the default
head room that will be available in the skb. Since the netvsc packet
is less than 48 bytes, we can use the skb control buffer
for the netvsc packet. With these changes we don't need to
ask for additional head room.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:43:24 -05:00
Eric Dumazet
45f6fad84c ipv6: add complete rcu protection around np->opt
This patch addresses multiple problems :

UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
while socket is not locked : Other threads can change np->opt
concurrently. Dmitry posted a syzkaller
(http://github.com/google/syzkaller) program desmonstrating
use-after-free.

Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
and dccp_v6_request_recv_sock() also need to use RCU protection
to dereference np->opt once (before calling ipv6_dup_options())

This patch adds full RCU protection to np->opt

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:37:16 -05:00
Srinivas Pandruvada
69030dd1c3 cpufreq: use last policy after online for drivers with ->setpolicy
For cpufreq drivers which use setpolicy interface, after offline->online
the policy is set to default. This can be reproduced by setting the
default policy of intel_pstate or longrun to ondemand and then change to
"performance". After offline and online, the setpolicy will be called with
the policy=ondemand.

For drivers using governors this condition is handled by storing
last_governor, during offline and restoring during online. The same should
be done for drivers using setpolicy interface. Storing last_policy during
offline and restoring during online.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-02 23:50:33 +01:00
Hanjun Guo
64031e3e8a ACPI / property: fix compile error for acpi_node_get_property_reference() when CONFIG_ACPI=n
In commit 60ba032ed7 ("ACPI / property: Drop size_prop from
acpi_dev_get_property_reference()"), the argument "const char *cells_name"
was dropped, but forgot to update the stub function in no-ACPI case,
it will lead to compile error when CONFIG_ACPI=n, easliy remove
"const char *cells_name" to fix it.

Fixes: 60ba032ed7 "ACPI / property: Drop size_prop from acpi_dev_get_property_reference()"
Reported-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-02 15:15:14 +01:00
Sudarsana Kalluru
91420b83ba qed: Add support for changing LED state
Physical LEDs are being controlled by the management FW.
This adds the qed functionality required to request management FW to
change the LED configuration, as well as the necessary APIs for this
functionality to later be used by the protocol drivers.

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 16:02:40 -05:00
Eric Dumazet
ceb5d58b21 net: fix sock_wake_async() rcu protection
Dmitry provided a syzkaller (http://github.com/google/syzkaller)
triggering a fault in sock_wake_async() when async IO is requested.

Said program stressed af_unix sockets, but the issue is generic
and should be addressed in core networking stack.

The problem is that by the time sock_wake_async() is called,
we should not access the @flags field of 'struct socket',
as the inode containing this socket might be freed without
further notice, and without RCU grace period.

We already maintain an RCU protected structure, "struct socket_wq"
so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
is the safe route.

It also reduces number of cache lines needing dirtying, so might
provide a performance improvement anyway.

In followup patches, we might move remaining flags (SOCK_NOSPACE,
SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
being mostly read and let it being shared between cpus.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00
Eric Dumazet
9cd3e072b0 net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
This patch is a cleanup to make following patch easier to
review.

Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async()

To ease backports, we rename both constants.

Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int net, struct sock *sk) are added so that
following patch can change their implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00
Nikolay Aleksandrov
1973a4ea6c net: ipmr: move pimsm_enabled to pim.h and rename
Move the inline pimsm_enabled() to pim.h and rename it to
ipmr_pimsm_enabled to show it's for the ipv4 ipmr code since pim.h is
used by IPv6 too.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:26:22 -05:00
Nikolay Aleksandrov
5ea1f13299 net: ipmr: move struct mr_table and VIF_EXISTS to mroute.h
Move the definitions of VIF_EXISTS() and struct mr_table to mroute.h

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:26:22 -05:00
Nikolay Aleksandrov
520191bb40 net: ipmr: adjust mroute.h style and drop extern
Remove extra spaces and tabs, adjust function definitions, remove an
unnecessary ifdef (already used below, just move code) and drop extern
from the functions.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:26:22 -05:00
Nikolay Aleksandrov
06bd6c0370 net: ipmr: remove unused MFC_NOTIFY flag and make the flags enum
MFC_NOTIFY was introduced in kernel 2.1.68 but afaik it hasn't been used
and I couldn't find any users currently so just remove it. Only
MFC_STATIC is left, so move it into an enum, add a description and use
BIT().

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 15:26:22 -05:00
Herbert Xu
1ce0bf50ae net: Generalise wq_has_sleeper helper
The memory barrier in the helper wq_has_sleeper is needed by just
about every user of waitqueue_active.  This patch generalises it
by making it take a wait_queue_head_t directly.  The existing
helper is renamed to skwq_has_sleeper.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-30 14:47:33 -05:00
Martin Blumenstingl
880621c260 packet: Allow packets with only a header (but no payload)
Commit 9c7077622d ("packet: make packet_snd fail on len smaller
than l2 header") added validation for the packet size in packet_snd.
This change enforces that every packet needs a header (with at least
hard_header_len bytes) plus a payload with at least one byte. Before
this change the payload was optional.

This fixes PPPoE connections which do not have a "Service" or
"Host-Uniq" configured (which is violating the spec, but is still
widely used in real-world setups). Those are currently failing with the
following message: "pppd: packet size is too short (24 <= 24)"

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-29 22:17:17 -05:00
Hannes Reinecke
bf4e6b4e75 block: Always check queue limits for cloned requests
When a cloned request is retried on other queues it always needs
to be checked against the queue limits of that queue.
Otherwise the calculations for nr_phys_segments might be wrong,
leading to a crash in scsi_init_sgtable().

To clarify this the patch renames blk_rq_check_limits()
to blk_cloned_rq_check_limits() and removes the symbol
export, as the new function should only be used for
cloned requests and never exported.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Fixes: e2a60da74 ("block: Clean up special command handling logic")
Cc: stable@vger.kernel.org # 3.7+
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:37:27 -07:00
Matias Bjørling
08236c6bb2 lightnvm: unconverted ppa returned in get_bb_tbl
The get_bb_tbl function takes ppa as a generic address, which is
converted to the ppa device address within the device driver. When
the update_bbtbl callback is called from get_bb_tbl, the device
specific ppa is used, instead of the generic ppa.

Make sure to pass the generic ppa.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:58 -07:00
Linus Torvalds
36511e8607 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 - fix tcm-user backend driver expired cmd time processing (agrover)
 - eliminate kref_put_spinlock_irqsave() for I/O completion (bart)
 - fix iscsi login kthread failure case hung task regression (nab)
 - fix COMPARE_AND_WRITE completion use-after-free race (nab)
 - fix COMPARE_AND_WRITE with SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC non zero
   SGL offset data corruption.  (Jan + Doug)
 - fix >= v4.4-rc1 regression for tcm_qla2xxx enable configfs attribute
   (Himanshu + HCH)

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target/stat: print full t10_wwn.model buffer
  target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
  qla2xxx: Fix regression introduced by target configFS changes
  kref: Remove kref_put_spinlock_irqsave()
  target: Invoke release_cmd() callback without holding a spinlock
  target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
  iscsi-target: Fix rx_login_comp hang after login failure
  iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
  target/user: Do not set unused fields in tcmu_ops
  target/user: Fix time calc in expired cmd processing
2015-11-29 09:03:57 -08:00
Linus Torvalds
75a29ec1e8 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
 "Specifics:

 - several fixes and cleanups on Rockchip thermal drivers.

 - add the missing support of RK3368 SoCs in Rockchip driver.

 - small fixes on of-thermal, power_allocator, rcar driver, IMX, and
   QCOM drivers, and also compilation fixes, on thermal.h, when thermal
   is not selected"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  imx: thermal: use CPU temperature grade info for thresholds
  thermal: fix thermal_zone_bind_cooling_device prototype
  Revert "thermal: qcom_spmi: allow compile test"
  thermal: rcar_thermal: remove redundant operation
  thermal: of-thermal: Reduce log level for message when can't fine thermal zone
  thermal: power_allocator: Use temperature reading from tz
  thermal: rockchip: Support the RK3368 SoCs in thermal driver
  thermal: rockchip: consistently use int for temperatures
  thermal: rockchip: Add the sort mode for adc value increment or decrement
  thermal: rockchip: improve the conversion function
  thermal: rockchip: trivial: fix typo in commit
  thermal: rockchip: better to compatible the driver for different SoCs
  dt-bindings: rockchip-thermal: Support the RK3368 SoCs compatible
2015-11-29 08:58:48 -08:00
Bart Van Assche
3a66d7dca1 kref: Remove kref_put_spinlock_irqsave()
The last user is gone. Hence remove this function.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-11-28 19:33:29 -08:00
Linus Torvalds
081f3698e6 PCI updates for v4.4:
NUMA
     Prevent out of bounds access in numa_node override (Mathias Krause)
 
   HiSilicon host bridge driver
     Fix deferred probing (Arnd Bergmann)
 
   Synopsys DesignWare host bridge driver
     Remove incorrect io_base assignment (Stanimir Varbanov)
     Move align_resource function pointer to pci_host_bridge structure (Gabriele Paoloni)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWWgXpAAoJEFmIoMA60/r86BIQAKRdCG2hyBlKVElQf1IH2EXv
 bRTZyQ77oopRfA2E8wqcsWWz33utOKiD5/p5Z75mhxhKi0XlFwZ8IUMEamHtINXG
 hdBHsqUOJExEJuLZmjErn5XLECmiJd4ZzXpBeQw/sHJgGZ/e5gG4wVIPrb/L87bB
 BXiAGks/eDUeriE7L40GytYIoNdPWXBB6Yl7cExE8nCY1CYPwPqLk1p6oh9JIWv0
 4inCExv3m/pMjgTurvBDpXaic3EiGgGNUtzR62lnIZvDzDs/ZUXetf3Rn3JtZNLR
 A2fYklm0VjX+l/SGuuUiwgXOPw4LYiKdxGGUz9/MvbcsbCn+sAQfaiQsYzzG0zgR
 naHu7l7XSVTmyh8Cs+K+gbfEZ1/JX1N2jVFOTKWADq6stw2e4E5qjNR53HZg1HXm
 y8D5wE/9mEObKx65SVTCXBjkkeoWhtR8EmIYY9PhGL5hiiBNfCbpLL+CyQtzGojo
 mTrdp3bvqcGaZXhDGiv08IlI2E/Z+qZ02XTKjS/zD6ZfbWoQxJ8fMpT56mMWAwRr
 QPUpEMwFZ5/dn7C0RhdTanhq9CIBg8oiEkfGGuCF+UFmCLvF5rDtVQNuUcv7X+6k
 L6nAp+W+0LviP/kHqnQf6YS7i3MLsOewSEP+gt50VElt3QfUEjDMu26TtXLzSdvK
 mB90WOy+QZL67Wev8tIA
 =kWE6
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Here are a few fixes I'd like to have in v4.4: a generic one for sysfs
  and three for HiSilicon and DesignWare host controllers.

  Summary:

  NUMA:
   - Prevent out of bounds access in numa_node override (Mathias Krause)

  HiSilicon host bridge driver:
   - Fix deferred probing (Arnd Bergmann)

  Synopsys DesignWare host bridge driver:
   - Remove incorrect io_base assignment (Stanimir Varbanov)
   - Move align_resource function pointer to pci_host_bridge structure
     (Gabriele Paoloni)"

* tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
  PCI: hisi: Fix deferred probing
  PCI: designware: Remove incorrect io_base assignment
  PCI: Prevent out of bounds access in numa_node override
2015-11-28 13:07:41 -08:00
Linus Torvalds
8003a57356 NFS client bugfixes for Linux 4.4
Highlights include:
 
 Stable patches:
 - Fix a NFSv4 callback identifier leak that was also causing client crashes
 - Fix NFSv4 callback decoding issues when incoming requests are truncated
 - Don't declare the attribute cache valid when we call nfs_update_inode with
   an empty attribute structure.
 - Resend LAYOUTGET when there is a race that changes the seqid
 
 Bugfixes:
 - Fix a number of issues with the NFSv4.2 CLONE ioctl()
 - Properly set NFS v4.2 NFSDBG_FACILITY
 - NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
   decoding success
 - Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
 - Ensure that attrcache is revalidated after a SETATTR
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWWPeyAAoJEGcL54qWCgDy9/UQAKNTF09OeHxSqO7oXbM4x0hY
 8a8A4ostTshtu4g6OWxeqI4/89A5lOcdHAoM/KOr+2HzssKA6B9lU4+pzcKfFI+U
 d9WqKVEC3MZA1N4KR+fS5LhtQU62izGKH+CQ9+tHvvesZu+bIiQgQu/uMzKVh2Al
 cKdDu99UxrxNP3PFDCcBtxpBvy27akT+21P8RutG12tqGQkfa1715JIQl9bqgquY
 ZruukMsqamp+LbZlnowgvoaBLBVUo19v8zwI34uSfXwNbQS71xmAV52z7HVHaEFt
 A8HQzS/MaFtMKpq7HOZYEnHB6h8YaYTK4GmHcCCFXHtjXopvHo8LXA6vYLTNhJ8V
 SvLpUJzUWVcGDDQ75x6iX/APPMSq0gxJA4+AZryBer3k2EvKlUoRrP+hgxOIK7HT
 2joWoFFKVe8a5NBj4Pd5+x6dpDEnIvlqGdMQNuXFUiPvcA/l3Uc0gnWhauuqvrhy
 ePrLRcWoSikLlPWxq39DRzJjQUdyUhBWMcCRWkhNzsT6U6HDSip5j0BkUBXD7nlU
 FK9BM2zRHr7kQ5Aax497K9qJNZBWI94y/vFkR/hJg0Z/bVQBF45lGxGgNFbj8Kag
 gR/xcYC9plum1IFD7DcnVnJTxrDSftIsLS8bhjmknxC8Pcyur2jegZvoDXiFk1GF
 gXERq36Ej/4WyyGrNyWm
 =5aPD
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable patches:
   - Fix a NFSv4 callback identifier leak that was also causing client
     crashes
   - Fix NFSv4 callback decoding issues when incoming requests are
     truncated
   - Don't declare the attribute cache valid when we call
     nfs_update_inode with an empty attribute structure.
   - Resend LAYOUTGET when there is a race that changes the seqid

  Bugfixes:
   - Fix a number of issues with the NFSv4.2 CLONE ioctl()
   - Properly set NFS v4.2 NFSDBG_FACILITY
   - NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
     decoding success
   - Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
   - Ensure that attrcache is revalidated after a SETATTR"

* tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs4: resend LAYOUTGET when there is a race that changes the seqid
  nfs: if we have no valid attrs, then don't declare the attribute cache valid
  nfs: ensure that attrcache is revalidated after a SETATTR
  nfs4: limit callback decoding to received bytes
  nfs4: start callback_ident at idr 1
  nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
  NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
  NFS: Properly set NFS v4.2 NFSDBG_FACILITY
  nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
  nfs: use btrfs ioctl defintions for clone
  nfs: allow intra-file CLONE
  nfs: offer native ioctls even if CONFIG_COMPAT is set
  nfs: pass on count for CLONE operations
2015-11-27 17:22:47 -08:00
Linus Torvalds
c64410f3ec ARM: SoC fixes for 4.4-rc
There is a small backlog of at91 patches here, the most significant is
 the addition of some sama5d2 Xplained nodes that were waiting on an MFD include
 file to get merged through another tree. We normally try to sort those out
 before the merge window opens, but the maintainer wasn't aware of that here
 and I decided to merge the changes this time as an exception.
 
 On OMAP a series of audio changes for dra7 missed the merge window but turned
 out to be necessary to fix a boot time imprecise external abort error and to
 get audio working.
 
 The other changes are the usual simple changes, here is a list sorted by
 platform:
 
 at91:
 	removal of a useless defconfig option
 	removal of some legacy DT pieces
 	use of the proper watchdog compatible string
 	update of the MAINTAINERS entries for some Atmel drivers
 drivers/scpi:
 	hide get_scpi_ops in module from built-in code
 imx:
 	add missing .irq_set_type for i.MX GPC irq_chip.
 	fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
 	fix a merge error in Vybrid dts regarding to ADC device property
 keystone:
         fix the optional PDSP firmware loading
         fix linking RAM setup for QMs
         fix crash with clk_ignore_unused
 mediatek:
 	Enable SCPSYS power domain driver by default
 mvebu:
 	fix QNAP TS219 power-off in dts
 	fix legacy get_irqnr_and_base for dove and orion5x
 omap:
 	fix l4 related boot time errors for dm81xx
 	use lockless cldm/pwrdm api in omap4_boot_secondary
 	remove t410 abort handler to avoid hiding other critical errors
 	mark cpuidle tracepoints as _rcuidle
 	fix module alias for omap-ocp2scp
 pxa:
 	palm: Fix typos in PWM lookup table code
 renesas:
 	missing __initconst annotation for r8a7793_boards_compat_dt
 rockchip:
 	disable mmc-tuning on the veyron-minnie board
 	adding the init state for the over-temperature-protection
 zx:
 	only build power domain code when CONFIG_PM=y
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVljO8GCrR//JCVInAQJiuw//SajRIsmVRbsAKip0ujaiZnO32X2NqGL1
 r2r+GJfF0giLO+ln8A1nN+IPzA6jOdTOUY9tPYr21yof/5v92VtvpAn/e6hW3VjY
 nA+4VQm6PbcXJPaugJMp5wEyr74LJWpb13dA6u0KXAuc3/iASfKwRAIUCEvzS6tp
 dPr/d4qCGzt+XGoUq5ZqFDt0krmMSRs1AU9OAuDVmnCtZGnZaYw8jPQkqNUCv0D0
 UL4IMtIJEKU1gWaiISrFuKFM+FYuiOzU+1NFcW+dUT4d4ZCzzL7YyNlX5lPxe0BU
 rlkmScGYrz7PblCJnCXOTWqkPq+5YZ9z61uAWwbHeOmJ6Mbkv3a39A1ZzRdAS4on
 OwrPk3y57CpUI1AD1TcMkiaPEN80NIcM6RyU1QielPofbCvPqRKwBXHSnBKJBOiN
 YbSxkDOeQ4redxbFZbwuHnH+sLN+E52DSbK2oeqqmRAFc2idY+39pEXHZzieGq1f
 TuF9EYsHhTeYtnqOCG/+AhnSoLJskarkfqUa8C8If52rYnk6QXruolXbRMw3aKWY
 56l2zo96O4wmnMLvEGC6yFtI+k9L53QK75aIilPOhsiC86oAvjjurz4CZ1zhhQFA
 PZFxf/XJMWauxG0HZIbKxPKPsXTCrBd7GWU7KKBIEi7o9unUMWwtuxYxnHoxZh66
 zGqpNt3NliE=
 =auFT
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "There is a small backlog of at91 patches here, the most significant is
  the addition of some sama5d2 Xplained nodes that were waiting on an
  MFD include file to get merged through another tree.

  We normally try to sort those out before the merge window opens, but
  the maintainer wasn't aware of that here and I decided to merge the
  changes this time as an exception.

  On OMAP a series of audio changes for dra7 missed the merge window but
  turned out to be necessary to fix a boot time imprecise external abort
  error and to get audio working.

  The other changes are the usual simple changes, here is a list sorted
  by platform:

  at91:
	removal of a useless defconfig option
	removal of some legacy DT pieces
	use of the proper watchdog compatible string
	update of the MAINTAINERS entries for some Atmel drivers

  drivers/scpi:
	hide get_scpi_ops in module from built-in code

  imx:
	add missing .irq_set_type for i.MX GPC irq_chip.
	fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
	fix a merge error in Vybrid dts regarding to ADC device property

  keystone:
        fix the optional PDSP firmware loading
        fix linking RAM setup for QMs
        fix crash with clk_ignore_unused

  mediatek:
	Enable SCPSYS power domain driver by default

  mvebu:
	fix QNAP TS219 power-off in dts
	fix legacy get_irqnr_and_base for dove and orion5x

  omap:
	fix l4 related boot time errors for dm81xx
	use lockless cldm/pwrdm api in omap4_boot_secondary
	remove t410 abort handler to avoid hiding other critical errors
	mark cpuidle tracepoints as _rcuidle
	fix module alias for omap-ocp2scp

  pxa:
	palm: Fix typos in PWM lookup table code

  renesas:
	missing __initconst annotation for r8a7793_boards_compat_dt

  rockchip:
	disable mmc-tuning on the veyron-minnie board
	adding the init state for the over-temperature-protection

  zx:
	only build power domain code when CONFIG_PM=y"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
  ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
  arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
  ARM: orion5x: Fix legacy get_irqnr_and_base
  ARM: dove: Fix legacy get_irqnr_and_base
  soc: Mediatek: Enable SCPSYS power domain driver by default
  ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
  ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
  soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
  soc: ti: use request_firmware_direct() as acc firmware is optional
  ARM: imx: add platform irq type setting in gpc
  ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
  ARM: shmobile: r8a7793: proper constness with __initconst
  scpi: hide get_scpi_ops in module from built-in code
  ARM: zx: only build power domain code when CONFIG_PM=y
  ARM: pxa: palm: Fix typos in PWM lookup table code
  ARM: dts: Kirkwood: Fix QNAP TS219 power-off
  ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
  ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
  MAINTAINERS: Atmel drivers: change NAND and ISI entries
  ARM: at91/dt: sama5d2 Xplained: add several devices
  ...
2015-11-27 14:22:03 -08:00
Linus Torvalds
5d8686276a arm64 fixes:
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
   it only affects arm64 builds; submitted by Arnd Bergmann)
 
 - EFI fixes to deal with early_memremap() returning NULL and correctly
   mapping run-time regions
 
 - Fix CPUID register extraction of unsigned fields (not to be
   sign-extended)
 
 - ASID allocator fix to deal with long-running tasks over multiple
   generation roll-overs
 
 - Revert support for marking page ranges as contiguous PTEs (it leads to
   TLB conflicts and requires additional non-trivial kernel changes)
 
 - Proper early_alloc() failure check
 
 - Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
   larger than the KASan shadow memory)
 
 - Update the fault_info table (original descriptions based on early
   engineering spec)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWWJiuAAoJEGvWsS0AyF7xlWIP/3ma+ZSyitIS0FWOld/uo3c1
 KbH9i7DrEL9tOzz4AhkKHBA7LOs0NvNkjz2sPLbnVg57H6r2y6Bi1ls5ODUWFy6y
 CKI0aaCYhWPyYWDq6H9NfD5Xh6jx0+45dMqKiCy1mvpChEwPfW4aZGceKptNbBrG
 v0VG1H5s0U+SjNqKqZ3W/hbwyQ1ZvAXJ022q7/ihPt6s2U0ebjXqc+6S2TcJyWNn
 C0bDn40+MK7p8jqRrq80bAjAvC5yDQ7/o7fBsNzsVYhuNTA3HR5CG1jGMJwGcVvA
 NJt71vfBq8L4PT2ndt8BxC5G500GdkQk2Nb2i1G9EgakH8Yv5Y2deFTUFDYPTHBg
 EfUgORet2iBiCcLY+lLTonjKICsHi4Bn//DsyyEZ7HXAovS0DIH3rQfKubYNlT3p
 FR2eskr3cDoQei3L9u0YU1zn+OuWRS7yJdjisjcTAEFaRBKqRXYMoczhVvJPb5xQ
 RPtHZNAS0JXH+0Cmdo+nHjSfpEo20nefBvd3Xvs0jvwWKxS6rwexxQWYTKNTbycq
 5iTYOGXlequnyTztK5M0AcfAajE+EVT2mAXkD/C727tUdO7yiCh86CNLIREHK8sH
 cLnc2iJ12IsJmqV7uRPI5YjNmYau7ZQpfcRfflt1LlL7mx1VmSiyb4JeomGEE/gu
 IdJ1iBl2JGguat1DHIXU
 =YgtU
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Build fix when !CONFIG_UID16 (the patch is touching generic files but
   it only affects arm64 builds; submitted by Arnd Bergmann)

 - EFI fixes to deal with early_memremap() returning NULL and correctly
   mapping run-time regions

 - Fix CPUID register extraction of unsigned fields (not to be
   sign-extended)

 - ASID allocator fix to deal with long-running tasks over multiple
   generation roll-overs

 - Revert support for marking page ranges as contiguous PTEs (it leads
   to TLB conflicts and requires additional non-trivial kernel changes)

 - Proper early_alloc() failure check

 - Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
   larger than the KASan shadow memory)

 - Update the fault_info table (original descriptions based on early
   engineering spec)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: efi: fix initcall return values
  arm64: efi: deal with NULL return value of early_memremap()
  arm64: debug: Treat the BRPs/WRPs as unsigned
  arm64: cpufeature: Track unsigned fields
  arm64: cpufeature: Add helpers for extracting unsigned values
  Revert "arm64: Mark kernel page ranges contiguous"
  arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
  arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
  arm64: efi: correctly map runtime regions
  arm64: mm: fix fault_info table xFSC decoding
  arm64: fix building without CONFIG_UID16
  arm64: early_alloc: Fix check for allocation failure
2015-11-27 11:09:59 -08:00
Martin K. Petersen
ca369d51b3 block/sd: Fix device-imposed transfer length limits
Commit 4f258a4634 ("sd: Fix maximum I/O size for BLOCK_PC requests")
had the unfortunate side-effect of removing an implicit clamp to
BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
code. This caused problems for some SMR drives.

Debugging this issue revealed a few problems with the existing
infrastructure since the block layer didn't know how to deal with
device-imposed limits, only limits set by the I/O controller.

 - Introduce a new queue limit, max_dev_sectors, which is used by the
   ULD to signal the maximum sectors for a REQ_TYPE_FS request.

 - Ensure that max_dev_sectors is correctly stacked and taken into
   account when overriding max_sectors through sysfs.

 - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
   values for later processing.

 - In sd_revalidate() set the queue's max_dev_sectors based on the
   MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
   is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
   field size.

 - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
   VPD--if reported and sane--to signal the preferred device transfer
   size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.

 - blk_limits_max_hw_sectors() is no longer used and can be removed.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: sweeneygj@gmx.com
Tested-by: Arzeets <anatol.pomozov@gmail.com>
Tested-by: David Eisner <david.eisner@oriel.oxon.org>
Tested-by: Mario Kicherer <dev@kicherer.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-25 21:38:58 -05:00
Arnd Bergmann
d3de94ba4e Linux 4.4-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWUmHZAAoJEHm+PkMAQRiGHtcH/RVRsn8re0WdRWYaTr9+Hknm
 CGlRJN4LKecttgYQ/2bS1QsDbt8usDPBiiYVopqGXQxPBmjyDAqPjsa+8VzCaVc6
 WA+9LDB+PcW28lD6BO+qSZCOAm7hHSZq7dtw9x658IqO+mI2mVeCybsAyunw2iWi
 Kf5q90wq6tIBXuT8YH9MXGrSCQw00NclbYeYwB9CmCt9hT/koEFBdl7uFUFitB+Q
 GSPTz5fXhgc5Lms85n7flZlrVKoQKmtDQe4/DvKZm+SjsATHU9ru89OxDBdS5gSG
 YcEIM4zc9tMjhs3GC9t6WXf6iFOdctum8HOhUoIN/+LVfeOMRRwAhRVqtGJ//Xw=
 =DCUg
 -----END PGP SIGNATURE-----

Merge tag 'v4.4-rc2' into fixes

Linux 4.4-rc2 is backmerged from the keystone fixes.
2015-11-25 23:47:38 +01:00
Gabriele Paoloni
7c7a0e9453 ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
Commit b3a72384fe ("ARM/PCI: Replace pci_sys_data->align_resource with
global function pointer") introduced an ARM-specific align_resource()
function pointer.  This is not portable to other arches and doesn't work
for platforms with two different PCIe host bridge controllers.

Move the function pointer to the pci_host_bridge structure so each host
bridge driver can specify its own align_resource() function.

Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2015-11-25 13:23:38 -06:00
Daniel Borkmann
c9da161c65 bpf: fix clearing on persistent program array maps
Currently, when having map file descriptors pointing to program arrays,
there's still the issue that we unconditionally flush program array
contents via bpf_fd_array_map_clear() in bpf_map_release(). This happens
when such a file descriptor is released and is independent of the map's
refcount.

Having this flush independent of the refcount is for a reason: there
can be arbitrary complex dependency chains among tail calls, also circular
ones (direct or indirect, nesting limit determined during runtime), and
we need to make sure that the map drops all references to eBPF programs
it holds, so that the map's refcount can eventually drop to zero and
initiate its freeing. Btw, a walk of the whole dependency graph would
not be possible for various reasons, one being complexity and another
one inconsistency, i.e. new programs can be added to parts of the graph
at any time, so there's no guaranteed consistent state for the time of
such a walk.

Now, the program array pinning itself works, but the issue is that each
derived file descriptor on close would nevertheless call unconditionally
into bpf_fd_array_map_clear(). Instead, keep track of users and postpone
this flush until the last reference to a user is dropped. As this only
concerns a subset of references (f.e. a prog array could hold a program
that itself has reference on the prog array holding it, etc), we need to
track them separately.

Short analysis on the refcounting: on map creation time usercnt will be
one, so there's no change in behaviour for bpf_map_release(), if unpinned.
If we already fail in map_create(), we are immediately freed, and no
file descriptor has been made public yet. In bpf_obj_pin_user(), we need
to probe for a possible map in bpf_fd_probe_obj() already with a usercnt
reference, so before we drop the reference on the fd with fdput().
Therefore, if actual pinning fails, we need to drop that reference again
in bpf_any_put(), otherwise we keep holding it. When last reference
drops on the inode, the bpf_any_put() in bpf_evict_inode() will take
care of dropping the usercnt again. In the bpf_obj_get_user() case, the
bpf_any_get() will grab a reference on the usercnt, still at a time when
we have the reference on the path. Should we later on fail to grab a new
file descriptor, bpf_any_put() will drop it, otherwise we hold it until
bpf_map_release() time.

Joint work with Alexei.

Fixes: b2197755b2 ("bpf: add support for persistent maps/progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-25 12:14:09 -05:00
Linus Torvalds
4cf193b4b2 Bug fixes for all architectures. Nothing really stands out.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWVcuuAAoJEL/70l94x66DmUEIAKnU6SCojoFOxWY0/EH/PBue
 m53mjiRiHp+YH/74dW0XF843+IKLfbLiADRaWHTqc9VW0ifnXmRjOv/bYpC7I/+R
 8XKHaJZQfpb6yvICEqWvMItBpddoakbhv8DJOf4bUfipNY0zx5F2STFfx0KICtbc
 mHTB4y5bFgIz8mJBLX+Dmh/UyXL0kbjSnksu0WA80Szr0pq2Sr4Csrx8PqGAEfIJ
 e5DUW0h3UXY77J5fQbpgJs93hzp1YwkuRKEeYpB8POx4fmvssHoybmOk46sP0Ipb
 IYxrJ+CUQ4o6Vpp3LTMjzMfJ4Y/NaOHCvYxL0okhxtUuq+UbUZjM5ziclfQ/32M=
 =cbxQ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Bug fixes for all architectures.  Nothing really stands out"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: nVMX: remove incorrect vpid check in nested invvpid emulation
  arm64: kvm: report original PAR_EL1 upon panic
  arm64: kvm: avoid %p in __kvm_hyp_panic
  KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
  KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
  KVM: arm/arm64: Fix preemptible timer active state crazyness
  arm64: KVM: Add workaround for Cortex-A57 erratum 834220
  arm64: KVM: Fix AArch32 to AArch64 register mapping
  ARM/arm64: KVM: test properly for a PTE's uncachedness
  KVM: s390: fix wrong lookup of VCPUs by array index
  KVM: s390: avoid memory overwrites on emergency signal injection
  KVM: Provide function for VCPU lookup by id
  KVM: s390: fix pfmf intercept handler
  KVM: s390: enable SIMD only when no VCPUs were created
  KVM: x86: request interrupt window when IRQ chip is split
  KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space
  KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection
  KVM: x86: fix interrupt window handling in split IRQ chip case
  MIPS: KVM: Uninit VCPU in vcpu_create error path
  MIPS: KVM: Fix CACHE immediate offset sign extension
  ...
2015-11-25 09:01:49 -08:00
Florian Fainelli
9458ceab49 net: phy: bcm7xxx: Add entry for Broadcom BCM7435
Add a PHY entry for the Broadcom BCM7435 chips, this is a 40nm
generation Ethernet PHY which is analogous to its 7425 and 7429 counter
parts.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-25 11:35:14 -05:00
Arnd Bergmann
fbc416ff86 arm64: fix building without CONFIG_UID16
As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
disabled currently fails because the system call table still needs to
reference the individual function entry points that are provided by
kernel/sys_ni.c in this case, and the declarations are hidden inside
of #ifdef CONFIG_UID16:

arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
 __SYSCALL(__NR_lchown, sys_lchown16)

I believe this problem only exists on ARM64, because older architectures
tend to not need declarations when their system call table is built
in assembly code, while newer architectures tend to not need UID16
support. ARM64 only uses these system calls for compatibility with
32-bit ARM binaries.

This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
declarations whenever we need them, but otherwise the behavior is
unchanged.

Fixes: af1839eb4b ("Kconfig: clean up the long arch list for the UID16 config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-11-25 15:49:13 +00:00
Paolo Bonzini
8bd142c016 KVM/ARM Fixes for v4.4-rc3.
Includes some timer fixes, properly unmapping PTEs, an errata fix, and two
 tweaks to the EL2 panic code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWVJ7gAAoJEEtpOizt6ddyD5MH/3M/nhtZTnT6v0RPDvHWJo7s
 5BQmITJYPHFkTO14OHWTVLXiGgLws8gPZnWHxC4jjHjpuJnL+/MM551FpCOqDDd7
 vweYgVlSqD8ANH5nKbv1PPnzjrqhTVN+yi3ZItXy2pxsfvu63FC6Z43B2axelLvw
 XYmHoMZaeWBBw2gHi3djGfju3Yj/2SOe+ozuvAXpxA5+NhSiPHHnMefGy5k3wKnJ
 sETwshPdjiMeK4ItfMhveFTDRjl4uh9uQyORfaa5gqG0uePt3EalYynw+gEjZ6RX
 Bpc3nLwboIfRIa/WwyoHm+nmLIUYjU8dAgLwUOIbdeG0igpdALdvsB0aBHCgngk=
 =+7ED
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM Fixes for v4.4-rc3.

Includes some timer fixes, properly unmapping PTEs, an errata fix, and two
tweaks to the EL2 panic code.
2015-11-24 19:34:40 +01:00
Linus Torvalds
4ce01c518e Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "A round of fixes/updates for the current series.

  This looks a little bigger than it is, but that's mainly because we
  pushed the lightnvm enabled null_blk change out of the merge window so
  it could be updated a bit.  The rest of the volume is also mostly
  lightnvm.  In particular:

   - Lightnvm.  Various fixes, additions, updates from Matias and
     Javier, as well as from Wenwei Tao.

   - NVMe:
        - Fix for potential arithmetic overflow from Keith.
        - Also from Keith, ensure that we reap pending completions from
          a completion queue before deleting it.  Fixes kernel crashes
          when resetting a device with IO pending.
        - Various little lightnvm related tweaks from Matias.

   - Fixup flushes to go through the IO scheduler, for the cases where a
     flush is not required.  Fixes a case in CFQ where we would be
     idling and not see this request, hence not break the idling.  From
     Jan Kara.

   - Use list_{first,prev,next} in elevator.c for cleaner code.  From
     Gelian Tang.

   - Fix for a warning trigger on btrfs and raid on single queue blk-mq
     devices, where we would flush plug callbacks with preemption
     disabled.  From me.

   - A mac partition validation fix from Kees Cook.

   - Two merge fixes from Ming, marked stable.  A third part is adding a
     new warning so we'll notice this quicker in the future, if we screw
     up the accounting.

   - Cleanup of thread name/creation in mtip32xx from Rasmus Villemoes"

* 'for-linus' of git://git.kernel.dk/linux-block: (32 commits)
  blk-merge: warn if figured out segment number is bigger than nr_phys_segments
  blk-merge: fix blk_bio_segment_split
  block: fix segment split
  blk-mq: fix calling unplug callbacks with preempt disabled
  mac: validate mac_partition is within sector
  mtip32xx: use formatting capability of kthread_create_on_node
  NVMe: reap completion entries when deleting queue
  lightnvm: add free and bad lun info to show luns
  lightnvm: keep track of block counts
  nvme: lightnvm: use admin queues for admin cmds
  lightnvm: missing free on init error
  lightnvm: wrong return value and redundant free
  null_blk: do not del gendisk with lightnvm
  null_blk: use device addressing mode
  null_blk: use ppa_cache pool
  NVMe: Fix possible arithmetic overflow for max segments
  blk-flush: Queue through IO scheduler when flush not required
  null_blk: register as a LightNVM device
  elevator: use list_{first,prev,next}_entry
  lightnvm: cleanup queue before target removal
  ...
2015-11-24 10:26:30 -08:00
Jeff Layton
91ab4b4d16 nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
When LAYOUTGET gets NFS4ERR_DELAY, we currently will wait 15s before
retrying the call. That is a _very_ long time, so add a timeout value to
struct nfs4_layoutget and pass nfs4_async_handle_error a pointer to it.
This allows the RPC engine to use a sliding delay window, instead of a
15s delay.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:57:44 -05:00
Arnd Bergmann
c86b3de8c8 thermal: fix thermal_zone_bind_cooling_device prototype
When the prototype for thermal_zone_bind_cooling_device
changed, the static inline wrapper function was left alone,
which in theory can cause build warnings:

I have seen this error in the past:
drivers/thermal/db8500_thermal.c: In function 'db8500_cdev_bind':
drivers/thermal/db8500_thermal.c:78:9: error: too many arguments to function 'thermal_zone_bind_cooling_device'
   ret = thermal_zone_bind_cooling_device(thermal, i, cdev,

while this one no longer shows up, there is no doubt that
the prototype is still wrong, so let's just fix it anyway.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6cd9e9f629 ("thermal: of: fix cooling device weights in device tree")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-11-23 15:34:34 -08:00
Marcelo Ricardo Leitner
f7ccdb96fa netfilter: nf_ct_sctp: move ip_ct_sctp away from UAPI
ip_ct_sctp is an internal structure, embedded by the union
nf_conntrack_proto to store sctp-specific information at conntrack
entries. It has no business with UAPI.

This patch moves it from UAPI to a saner place, together with similar
structs for other protocols.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-11-23 17:54:42 +01:00
Peter Zijlstra
90eec103b9 treewide: Remove old email address
There were still a number of references to my old Red Hat email
address in the kernel source. Remove these while keeping the
Red Hat copyright notices intact.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23 09:44:58 +01:00
Stephane Eranian
614e4c4ebc perf/core: Robustify the perf_cgroup_from_task() RCU checks
This patch reinforces the lockdep checks performed by
perf_cgroup_from_tsk() by passing the perf_event_context
whenever possible. It is okay to not hold the RCU read lock
when we know we hold the ctx->lock. This patch makes sure this
property holds.

In some functions, such as perf_cgroup_sched_in(), we do not
pass the context because we are sure we are holding the RCU
read lock.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: edumazet@google.com
Link: http://lkml.kernel.org/r/1447322404-10920-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23 09:21:03 +01:00
Linus Torvalds
104e2a6f8b Merge branch 'akpm' (patches from Andrew)
Merge slub bulk allocator updates from Andrew Morton:
 "This missed the merge window because I was waiting for some repairs to
  come in.  Nothing actually uses the bulk allocator yet and the changes
  to other code paths are pretty small.  And the net guys are waiting
  for this so they can start merging the client code"

More comments from Jesper Dangaard Brouer:
 "The kmem_cache_alloc_bulk() call, in mm/slub.c, were included in
  previous kernel.  The present version contains a bug.  Vladimir
  Davydov noticed it contained a bug, when kernel is compiled with
  CONFIG_MEMCG_KMEM (see commit 03ec0ed57f: "slub: fix kmem cgroup
  bug in kmem_cache_alloc_bulk").  Plus the mem cgroup counterpart in
  kmem_cache_free_bulk() were missing (see commit 033745189b "slub:
  add missing kmem cgroup support to kmem_cache_free_bulk").

  I don't consider the fix stable-material because there are no in-tree
  users of the API.

  But with known bugs (for memcg) I cannot start using the API in the
  net-tree"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  slab/slub: adjust kmem_cache_alloc_bulk API
  slub: add missing kmem cgroup support to kmem_cache_free_bulk
  slub: fix kmem cgroup bug in kmem_cache_alloc_bulk
  slub: optimize bulk slowpath free by detached freelist
  slub: support for bulk free with SLUB freelists
2015-11-22 15:21:40 -08:00
Linus Torvalds
dcfeda9d5f TTY/Serial fixes for 4.4-rc2
Here are a few small tty/serial driver fixes for 4.4-rc2 that resolve
 some reported problems.
 
 All have been in linux-next, full details are in the shortlog below.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlZSCjgACgkQMUfUDdst+yl4kQCgyYYsaVVUcG2i3HQUpio4CAJg
 EFQAn03Z0OD/EGNHKw7FtsICSgAhSatG
 =JRcn
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are a few small tty/serial driver fixes for 4.4-rc2 that resolve
  some reported problems.

  All have been in linux-next, full details are in the shortlog below"

* tag 'tty-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: export fsl8250_handle_irq
  serial: 8250_mid: Add missing dependency
  tty: audit: Fix audit source
  serial: etraxfs-uart: Fix crash
  serial: fsl_lpuart: Fix earlycon support
  bcm63xx_uart: Use the device name when registering an interrupt
  tty: Fix direct use of tty buffer work
  tty: Fix tty_send_xchar() lock order inversion
2015-11-22 15:10:57 -08:00
Jesper Dangaard Brouer
865762a811 slab/slub: adjust kmem_cache_alloc_bulk API
Adjust kmem_cache_alloc_bulk API before we have any real users.

Adjust API to return type 'int' instead of previously type 'bool'.  This
is done to allow future extension of the bulk alloc API.

A future extension could be to allow SLUB to stop at a page boundary, when
specified by a flag, and then return the number of objects.

The advantage of this approach, would make it easier to make bulk alloc
run without local IRQs disabled.  With an approach of cmpxchg "stealing"
the entire c->freelist or page->freelist.  To avoid overshooting we would
stop processing at a slab-page boundary.  Else we always end up returning
some objects at the cost of another cmpxchg.

To keep compatible with future users of this API linking against an older
kernel when using the new flag, we need to return the number of allocated
objects with this API change.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-22 11:58:44 -08:00
Linus Torvalds
3ad5d7e06a Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "A bunch of fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  slub: mark the dangling ifdef #else of CONFIG_SLUB_DEBUG
  slub: avoid irqoff/on in bulk allocation
  slub: create new ___slab_alloc function that can be called with irqs disabled
  mm: fix up sparse warning in gfpflags_allow_blocking
  ocfs2: fix umask ignored issue
  PM/OPP: add entry in MAINTAINERS
  kernel/panic.c: turn off locks debug before releasing console lock
  kernel/signal.c: unexport sigsuspend()
  kasan: fix kmemleak false-positive in kasan_module_alloc()
  fat: fix fake_offset handling on error path
  mm/hugetlbfs: fix bugs in fallocate hole punch of areas with holes
  mm/page-writeback.c: initialize m_dirty to avoid compile warning
  various: fix pci_set_dma_mask return value checking
  mm: loosen MADV_NOHUGEPAGE to enable Qemu postcopy on s390
  mm: vmalloc: don't remove inexistent guard hole in remove_vm_area()
  tools/vm/page-types.c: support KPF_IDLE
  ncpfs: don't allow negative timeouts
  configfs: allow dynamic group creation
  MAINTAINERS: add Moritz as reviewer for FPGA Manager Framework
  slab.h: sprinkle __assume_aligned attributes
2015-11-21 10:49:13 -08:00
Peter Hurley
6b2a3d628a tty: audit: Fix audit source
The data to audit/record is in the 'from' buffer (ie., the input
read buffer).

Fixes: 72586c6061 ("n_tty: Fix auditing support for cannonical mode")
Cc: stable <stable@vger.kernel.org> # 4.1+
Cc: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Acked-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-20 16:19:54 -08:00
Jeff Layton
21fa844279 mm: fix up sparse warning in gfpflags_allow_blocking
sparse says:

    include/linux/gfp.h:274:26: warning: incorrect type in return expression (different base types)
    include/linux/gfp.h:274:26:    expected bool
    include/linux/gfp.h:274:26:    got restricted gfp_t

...add a forced cast to silence the warning.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-20 16:17:32 -08:00
Richard Weinberger
9d8a765211 kernel/signal.c: unexport sigsuspend()
sigsuspend() is nowhere used except in signal.c itself, so we can mark it
static do not pollute the global namespace.

But this patch is more than a boring cleanup patch, it fixes a real issue
on UserModeLinux.  UML has a special console driver to display ttys using
xterm, or other terminal emulators, on the host side.  Vegard reported
that sometimes UML is unable to spawn a xterm and he's facing the
following warning:

  WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0()

It turned out that this warning makes absolutely no sense as the UML
xterm code calls sigsuspend() on the host side, at least it tries.  But
as the kernel itself offers a sigsuspend() symbol the linker choose this
one instead of the glibc wrapper.  Interestingly this code used to work
since ever but always blocked signals on the wrong side.  Some recent
kernel change made the WARN_ON() trigger and uncovered the bug.

It is a wonderful example of how much works by chance on computers. :-)

Fixes: 68f3f16d9a ("new helper: sigsuspend()")
Signed-off-by: Richard Weinberger <richard@nod.at>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>	[3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-20 16:17:32 -08:00
Daniel Baluta
5cf6a51e60 configfs: allow dynamic group creation
This patchset introduces IIO software triggers, offers a way of configuring
them via configfs and adds the IIO hrtimer based interrupt source to be used
with software triggers.

The architecture is now split in 3 parts, to remove all IIO trigger specific
parts from IIO configfs core:

(1) IIO configfs - creates the root of the IIO configfs subsys.
(2) IIO software triggers - software trigger implementation, dynamically
    creating /config/iio/triggers group.
(3) IIO hrtimer trigger - is the first interrupt source for software triggers
    (with syfs to follow). Each trigger type can implement its own set of
    attributes.

Lockdep seems to be happy with the locking in configfs patch.

This patch (of 5):

We don't want to hardcode default groups at subsystem
creation time. We export:
	* configfs_register_group
	* configfs_unregister_group
to allow drivers to programatically create/destroy groups
later, after module init time.

This is needed for IIO configfs support.

(akpm: the other 4 patches to be merged via the IIO tree)

Signed-off-by: Daniel Baluta <daniel.baluta@intel.com>
Suggested-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Hartmut Knaack <knaack.h@gmx.de>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Adriana Reus <adriana.reus@intel.com>
Cc: Cristina Opriceana <cristina.opriceana@gmail.com>
Cc: Peter Meerwald <pmeerw@pmeerw.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-20 16:17:32 -08:00
Rasmus Villemoes
94a58c360a slab.h: sprinkle __assume_aligned attributes
The various allocators return aligned memory.  Telling the compiler that
allows it to generate better code in many cases, for example when the
return value is immediately passed to memset().

Some code does become larger, but at least we win twice as much as we lose:

$ scripts/bloat-o-meter /tmp/vmlinux vmlinux
add/remove: 0/0 grow/shrink: 13/52 up/down: 995/-2140 (-1145)

An example of the different (and smaller) code can be seen in mm_alloc(). Before:

:       48 8d 78 08             lea    0x8(%rax),%rdi
:       48 89 c1                mov    %rax,%rcx
:       48 89 c2                mov    %rax,%rdx
:       48 c7 00 00 00 00 00    movq   $0x0,(%rax)
:       48 c7 80 48 03 00 00    movq   $0x0,0x348(%rax)
:       00 00 00 00
:       31 c0                   xor    %eax,%eax
:       48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
:       48 29 f9                sub    %rdi,%rcx
:       81 c1 50 03 00 00       add    $0x350,%ecx
:       c1 e9 03                shr    $0x3,%ecx
:       f3 48 ab                rep stos %rax,%es:(%rdi)

After:

:       48 89 c2                mov    %rax,%rdx
:       b9 6a 00 00 00          mov    $0x6a,%ecx
:       31 c0                   xor    %eax,%eax
:       48 89 d7                mov    %rdx,%rdi
:       f3 48 ab                rep stos %rax,%es:(%rdi)

So gcc's strategy is to do two possibly (but not really, of course)
unaligned stores to the first and last word, then do an aligned rep stos
covering the middle part with a little overlap.  Maybe arches which do not
allow unaligned stores gain even more.

I don't know if gcc can actually make use of alignments greater than 8 for
anything, so one could probably drop the __assume_xyz_alignment macros and
just use __assume_aligned(8).

The increases in code size are mostly caused by gcc deciding to
opencode strlen() using the check-four-bytes-at-a-time trick when it
knows the buffer is sufficiently aligned (one function grew by 200
bytes). Now it turns out that many of these strlen() calls showing up
were in fact redundant, and they're gone from -next. Applying the two
patches to next-20151001 bloat-o-meter instead says

add/remove: 0/0 grow/shrink: 6/52 up/down: 244/-2140 (-1896)

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-20 16:17:32 -08:00
Linus Torvalds
95803066c6 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:

 - A collection of crash and deadlock fixes for DAX that are also tagged
   for -stable.  We will look to re-enable DAX pmd mappings in 4.5, but
   for now 4.4 and -stable should disable it by default.

 - A fixup to ext2 and ext4 to mirror the same warning emitted by XFS
   when mounting with "-o dax"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  block: protect rw_page against device teardown
  mm, dax: fix DAX deadlocks (COW fault)
  dax: disable pmd mappings
  ext2, ext4: warn when mounting with dax enabled
2015-11-20 15:00:50 -08:00
Tejun Heo
16af439645 cgroup: implement cgroup_get_from_path() and expose cgroup_put()
Implement cgroup_get_from_path() using kernfs_walk_and_get() which
obtains a default hierarchy cgroup from its path.  This will be used
to allow cgroup path based matching from outside cgroup proper -
e.g. networking and perf.

v2: Add EXPORT_SYMBOL_GPL(cgroup_get_from_path).

Signed-off-by: Tejun Heo <tj@kernel.org>
2015-11-20 15:55:52 -05:00
Tejun Heo
bd96f76a24 kernfs: implement kernfs_walk_and_get()
Implement kernfs_walk_and_get() which is similar to
kernfs_find_and_get() but can walk a path instead of just a name.

v2: Use strlcpy() instead of strlen() + memcpy() as suggested by
    David.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Miller <davem@davemloft.net>
2015-11-20 15:55:52 -05:00
Tejun Heo
b11cfb5807 cgroup: record ancestor IDs and reimplement cgroup_is_descendant() using it
cgroup_is_descendant() currently walks up the hierarchy and compares
each ancestor to the cgroup in question.  While enough for cgroup core
usages, this can't be used in hot paths to test cgroup membership.
This patch adds cgroup->ancestor_ids[] which records the IDs of all
ancestors including self and cgroup->level for the nesting level.

This allows testing whether a given cgroup is a descendant of another
in three finite steps - testing whether the two belong to the same
hierarchy, whether the descendant candidate is at the same or a higher
level than the ancestor and comparing the recorded ancestor_id at the
matching level.  cgroup_is_descendant() is accordingly reimplmented
and made inline.

Signed-off-by: Tejun Heo <tj@kernel.org>
2015-11-20 15:55:52 -05:00
Guillaume Nault
a8acce6aa5 ppp: remove PPPOX_ZOMBIE socket state
PPPOX_ZOMBIE is never set anymore.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:31:26 -05:00
Javier Gonzalez
2fde0e482d lightnvm: add free and bad lun info to show luns
Add free block, used block, and bad block information to the show debug
interface. This information is used to debug how targets track blocks.

Also, change debug function name to make it more generic.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:21 -07:00
Javier Gonzalez
0b59733b95 lightnvm: keep track of block counts
Maintain number of in use blocks, free blocks, and bad blocks in a per
lun basis. This allows the upper layers to get information about the
state of each lun.

Also, account for blocks reserved to the device on the free block count.
nr_free_blocks matches now the actual number of blocks on the free list
when the device is booted.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:20 -07:00
Linus Torvalds
86eaf54d07 dmaengine fixes for 4.4-rc2
This has odd fixes spreadout drivers, not major here
    - usbdmac fixes for pm
    - edma build and logic fixes
    - build warn fixes for few drivers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWTp6mAAoJEHwUBw8lI4NH2xUP/1XJVXqX0zk09sdylZFGz2d4
 XIGLy6tUXkN9Ks6do65PMcsbEUYFKU3xh2esCUMOBHjwTJYGWYp5bv5HCZoEjbKs
 kspuzKBoSsW32r/UhxnKhn8Ryz5IYnb72cIEFXw9MZTcPa1msUrqOBdNEQATlyvf
 rHQfzUg8Nw9+rLZ31PhNMtPvqpohVxV8mx4eOe7hVgVefdawmpHNu3BGo0xy7b39
 jsSrGjEOdWEtZqg/mxYaZ36eXzCAiFg4S/eJrFWlUx2q/FolHgII+mMuXk0OygPp
 7KQoB1tSYlCVrlzoKjMisJW4RCmkdxRNIqi8nu+sSDPN8cLnx7DZxHy7QaopdoYP
 Jts9rxLXbwJ9+WzBgOKANvYIfwbyNVK1eweaTAt67lJiR0d1G701ySnTr2a/pJq+
 tK9qeI2k4I8uuX/Nt3mTzHxQMxVUn/N2CnApXNuuMKLk/wiGhjAvqlhOEmDDHO7F
 9vyhh02/a8BZK9GDp3VUHQT7cHTPKLAB7UlAICOh8tF5TImD1haSw+jtZDSH5Ku3
 ur7t0dfUpoOnuZWjHFBI7+AQ3WO+Ju9S8DZdIYpRPTDoBtJ9IaUikFmgOkKopR2G
 aK4wcicOuR/CxyzdgFPV6N5tCHP++ueA/OBEdOx7KkvG0ANWeAkENv+08DSAwDX+
 z61lusATzOMBWB97I9pu
 =Dtnx
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.4-rc2' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "This has odd fixes spreadout drivers, not major here

   - usbdmac fixes for pm
   - edma build and logic fixes
   - build warn fixes for few drivers"

* tag 'dmaengine-fix-4.4-rc2' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: at_hdmac: use %pad format string for dma_addr_t
  dmaengine: at_xdmac: use %pad format string for dma_addr_t
  dmaengine: imx-sdma: remove __init annotation on sdma_event_remap
  dmaengine: edma: predecence bug in GET_NUM_QDMACH()
  dmaengine: edma: fix build without CONFIG_OF
  dmaengine: of_dma: Correct return code for of_dma_request_slave_channel in case !CONFIG_OF
  dmaengine: sh: usb-dmac: Fix pm_runtime_{enable,disable}() imbalance
  dmaengine: sh: usb-dmac: Fix crash on runtime suspend
2015-11-19 20:51:31 -08:00
Dan Williams
2e6edc9538 block: protect rw_page against device teardown
Fix use after free crashes like the following:

 general protection fault: 0000 [#1] SMP
 Call Trace:
  [<ffffffffa0050216>] ? pmem_do_bvec.isra.12+0xa6/0xf0 [nd_pmem]
  [<ffffffffa0050ba2>] pmem_rw_page+0x42/0x80 [nd_pmem]
  [<ffffffff8128fd90>] bdev_read_page+0x50/0x60
  [<ffffffff812972f0>] do_mpage_readpage+0x510/0x770
  [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
  [<ffffffff811d86dc>] ? lru_cache_add+0x1c/0x50
  [<ffffffff81297657>] mpage_readpages+0x107/0x170
  [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
  [<ffffffff8128fd20>] ? I_BDEV+0x20/0x20
  [<ffffffff8129058d>] blkdev_readpages+0x1d/0x20
  [<ffffffff811d615f>] __do_page_cache_readahead+0x28f/0x310
  [<ffffffff811d6039>] ? __do_page_cache_readahead+0x169/0x310
  [<ffffffff811c5abd>] ? pagecache_get_page+0x2d/0x1d0
  [<ffffffff811c76f6>] filemap_fault+0x396/0x530
  [<ffffffff811f816e>] __do_fault+0x4e/0xf0
  [<ffffffff811fce7d>] handle_mm_fault+0x11bd/0x1b50

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
[willy: symmetry fixups]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-19 13:47:10 -08:00
Arnd Bergmann
851df3dc11 scpi: hide get_scpi_ops in module from built-in code
The scpi_clock driver can be built-in when CONFIG_COMPILE_TEST
is set even when ARM_SCPI_PROTOCOL is a loadable module, and
that results in a link error:

drivers/built-in.o: In function `scpi_clocks_probe':
(.text+0x14453c): undefined reference to `get_scpi_ops'

Using #if IS_REACHABLE() around the get_scpi_ops() declaration
makes it build successfully in this case for compile-testing,
but the effect is the same as when ARM_SCPI_PROTOCOL is
disabled, as the code will not be used.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
2015-11-19 16:22:43 +01:00