Commit Graph

26689 Commits

Author SHA1 Message Date
Benoit Cousson
4fcd15a032 of: Add helpers to get one string in multiple strings property
Add of_property_read_string_index and of_property_count_strings
to retrieve one string inside a property that will contains
severals strings.

Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kevin Hilman <khilman@ti.com>
2011-10-04 09:52:23 -07:00
Mark Brown
81204c84ca ASoC: Add WM1811 support
The WM1811 is mostly register compatible with the WM8994 and WM8958,
providing a high performance audio hub CODEC in a small form factor
suitable for ultra compact system designs.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-10-04 11:59:46 +01:00
Mark Brown
b1f43bf3a5 mfd: Add WM1811 support
The WM1811 is mostly register compatible with the WM8994 and WM8958,
providing a high performance audio hub CODEC in a small form factor
suitable for ultra compact system designs.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-10-04 11:59:02 +01:00
Peter Zijlstra
f0f1d32f93 llist: Remove cpu_relax() usage in cmpxchg loops
Initial benchmarks show they're a net loss:

 $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
 $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
 $ ./sembench -t 2048 -w 1900 -o 0

Pre:

 run time 30 seconds 778936 worker burns per second
 run time 30 seconds 912190 worker burns per second
 run time 30 seconds 817506 worker burns per second
 run time 30 seconds 830870 worker burns per second
 run time 30 seconds 845056 worker burns per second

Post:

 run time 30 seconds 905920 worker burns per second
 run time 30 seconds 849046 worker burns per second
 run time 30 seconds 886286 worker burns per second
 run time 30 seconds 822320 worker burns per second
 run time 30 seconds 900283 worker burns per second

So about 4% faster. (!)

cpu_relax() stalls the pipeline, therefore, when used in a tight loop
it has the following benefits:

 - allows SMT siblings to have a go;
 - reduces pressure on the CPU interconnect.

However, cmpxchg loops are unfair and thus have unbounded completion
time, therefore we should avoid getting in such heavily contended
situations where the above benefits make any difference.

A typical cmpxchg loop should not go round more than a handfull of
times at worst, therefore adding extra delays just slows things down.

Since the llist primitives are new, there aren't any bad users yet,
and we should avoid growing them. Heavily contended sites should
generally be better off using the ticket locks for serialization since
they provide bounded completion times (fifo-fair over the cpus).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836358.26517.43.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:44:03 +02:00
Peter Zijlstra
fa14ff4acc sched: Convert to struct llist
Use the generic llist primitives.

We had a private lockless list implementation in the scheduler in the wake-list
code, now that we have a generic llist implementation that provides all required
operations, switch to it.

This patch is not expected to change any behavior.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836353.26517.42.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:58 +02:00
Peter Zijlstra
924f8f5af3 llist: Add llist_next()
So we don't have to expose the struct list_node member.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315836348.26517.41.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:53 +02:00
Huang Ying
38aaf8090d irq_work: Use llist in the struct irq_work logic
Use llist in irq_work instead of the lock-less linked list
implementation in irq_work to avoid the code duplication.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-6-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:49 +02:00
Huang Ying
781f7fd916 llist: Return whether list is empty before adding in llist_add()
Extend the llist_add*() functions to return a success indicator, this
allows us in the scheduler code to send an IPI if the queue was empty.

( There's no effect on existing users, because the list_add_xxx() functions
  are inline, thus this will be optimized out by the compiler if not used
  by callers. )

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-5-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:44 +02:00
Huang Ying
a3127336b7 llist: Move cpu_relax() to after the cmpxchg()
If in llist_add()/etc. functions the first cmpxchg() call succeeds, it is
not necessary to use cpu_relax() before the cmpxchg(). So cpu_relax() in
a busy loop involving cmpxchg() should go after cmpxchg() instead of before
that.

This patch fixes this for all involved llist functions.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-4-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:39 +02:00
Ingo Molnar
2c30245c65 llist: Remove the platform-dependent NMI checks
Remove the nmi() checks spread around the code. in_nmi() is not available
on every architecture and it's a pretty obscure and ugly check in any case.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-3-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:43:11 +02:00
Huang Ying
1230db8e15 llist: Make some llist functions inline
Because llist code will be used in performance critical scheduler
code path, make llist_add() and llist_del_all() inline to avoid
function calling overhead and related 'glue' overhead.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:30:53 +02:00
Ingo Molnar
22f92bacbe Merge branch 'linus' into sched/core
Merge reason: pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:09:08 +02:00
Sangwook Lee
e209c5a7ed net:rfkill: add a gpio setup function into GPIO rfkill
Add a gpio setup function which gives a chance to set up
platform specific configuration such as pin multiplexing,
input/output direction at the runtime or booting time.

Signed-off-by: Sangwook Lee <sangwook.lee@linaro.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-10-03 15:19:19 -04:00
Vasily Averin
349d2895cc ipv4: NET_IPV4_ROUTE_GC_INTERVAL removal
removing obsoleted sysctl,
ip_rt_gc_interval variable no longer used since 2.6.38

Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:13:01 -04:00
Jiří Župka
96c131842a Repair wrong named definition aligned_u64
This repairs problem with compile library in userspace (libnl).

Signed-off-by: Jiří Župka <jzupka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:03:48 -04:00
Eric Dumazet
b5c5693bb7 tcp: report ECN_SEEN in tcp_info
Allows ss command (iproute2) to display "ecnseen" if at least one packet
with ECT(0) or ECT(1) or ECN was received by this socket.

"ecn" means ECN was negotiated at session establishment (TCP level)

"ecnseen" means we received at least one packet with ECT fields set (IP
level)

ss -i
...
ESTAB      0      0   192.168.20.110:22  192.168.20.144:38016
ino:5950 sk:f178e400
	 mem:(r0,w0,f0,t0) ts sack ecn ecnseen bic wscale:7,8 rto:210
rtt:12.5/7.5 cwnd:10 send 9.3Mbps rcv_space:14480

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 14:01:21 -04:00
Marc Zyngier
1e7c5fd294 genirq: percpu: allow interrupt type to be set at enable time
As request_percpu_irq() doesn't allow for a percpu interrupt to have
its type configured (it is generally impossible to configure it on all
CPUs at once), add a 'type' argument to enable_percpu_irq().

This allows some low-level, board specific init code to be switched to
a generic API.

[ tglx: Added WARN_ON argument ]

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-03 15:35:27 +02:00
Marc Zyngier
31d9d9b6d8 genirq: Add support for per-cpu dev_id interrupts
The ARM GIC interrupt controller offers per CPU interrupts (PPIs),
which are usually used to connect local timers to each core. Each CPU
has its own private interface to the GIC, and only sees the PPIs that
are directly connect to it.

While these timers are separate devices and have a separate interrupt
line to a core, they all use the same IRQ number.

For these devices, request_irq() is not the right API as it assumes
that an IRQ number is visible by a number of CPUs (through the
affinity setting), but makes it very awkward to express that an IRQ
number can be handled by all CPUs, and yet be a different interrupt
line on each CPU, requiring a different dev_id cookie to be passed
back to the handler.

The *_percpu_irq() functions is designed to overcome these
limitations, by providing a per-cpu dev_id vector:

int request_percpu_irq(unsigned int irq, irq_handler_t handler,
		   const char *devname, void __percpu *percpu_dev_id);
void free_percpu_irq(unsigned int, void __percpu *);
int setup_percpu_irq(unsigned int irq, struct irqaction *new);
void remove_percpu_irq(unsigned int irq, struct irqaction *act);
void enable_percpu_irq(unsigned int irq);
void disable_percpu_irq(unsigned int irq);

The API has a number of limitations:
- no interrupt sharing
- no threading
- common handler across all the CPUs

Once the interrupt is requested using setup_percpu_irq() or
request_percpu_irq(), it must be enabled by each core that wishes its
local interrupt to be delivered.

Based on an initial patch by Thomas Gleixner.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-03 15:35:26 +02:00
Wu Fengguang
9d823e8f6b writeback: per task dirty rate limit
Add two fields to task_struct.

1) account dirtied pages in the individual tasks, for accuracy
2) per-task balance_dirty_pages() call intervals, for flexibility

The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will
scale near-sqrt to the safety gap between dirty pages and threshold.

The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying
pages at exactly the same time, each task will be assigned a large
initial nr_dirtied_pause, so that the dirty threshold will be exceeded
long before each task reached its nr_dirtied_pause and hence call
balance_dirty_pages().

The solution is to watch for the number of pages dirtied on each CPU in
between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages
(3% dirty threshold), force call balance_dirty_pages() for a chance to
set bdi->dirty_exceeded. In normal situations, this safeguarding
condition is not expected to trigger at all.

On the sqrt in dirty_poll_interval():

It will serve as an initial guess when dirty pages are still in the
freerun area.

When dirty pages are floating inside the dirty control scope [freerun,
limit], a followup patch will use some refined dirty poll interval to
get the desired pause time.

   thresh-dirty (MB)    sqrt
		   1      16
		   2      22
		   4      32
		   8      45
		  16      64
		  32      90
		  64     128
		 128     181
		 256     256
		 512     362
		1024     512

The above table means, given 1MB (or 1GB) gap and the dd tasks polling
balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't
be exceeded as long as there are less than 16 (or 512) concurrent dd's.

So sqrt naturally leads to less overheads and more safe concurrent tasks
for large memory servers, which have large (thresh-freerun) gaps.

peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case

CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
Wu Fengguang
7381131cbc writeback: stabilize bdi->dirty_ratelimit
There are some imperfections in balanced_dirty_ratelimit.

1) large fluctuations

The dirty_rate used for computing balanced_dirty_ratelimit is merely
averaged in the past 200ms (very small comparing to the 3s estimation
period for write_bw), which makes rather dispersed distribution of
balanced_dirty_ratelimit.

It's pretty hard to average out the singular points by increasing the
estimation period. Considering that the averaging technique will
introduce very undesirable time lags, I give it up totally. (btw, the 3s
write_bw averaging time lag is much more acceptable because its impact
is one-way and therefore won't lead to oscillations.)

The more practical way is filtering -- most singular
balanced_dirty_ratelimit points can be filtered out by remembering some
prev_balanced_rate and prev_prev_balanced_rate. However the more
reliable way is to guard balanced_dirty_ratelimit with task_ratelimit.

2) due to truncates and fs redirties, the (write_bw <=> dirty_rate)
match could become unbalanced, which may lead to large systematical
errors in balanced_dirty_ratelimit. The truncates, due to its possibly
bumpy nature, can hardly be compensated smoothly. So let's face it. When
some over-estimated balanced_dirty_ratelimit brings dirty_ratelimit
high, dirty pages will go higher than the setpoint. task_ratelimit will
in turn become lower than dirty_ratelimit.  So if we consider both
balanced_dirty_ratelimit and task_ratelimit and update dirty_ratelimit
only when they are on the same side of dirty_ratelimit, the systematical
errors in balanced_dirty_ratelimit won't be able to bring
dirty_ratelimit far away.

The balanced_dirty_ratelimit estimation may also be inaccurate near
@limit or @freerun, however is less an issue.

3) since we ultimately want to

- keep the fluctuations of task ratelimit as small as possible
- keep the dirty pages around the setpoint as long time as possible

the update policy used for (2) also serves the above goals nicely:
if for some reason the dirty pages are high (task_ratelimit < dirty_ratelimit),
and dirty_ratelimit is low (dirty_ratelimit < balanced_dirty_ratelimit),
there is no point to bring up dirty_ratelimit in a hurry only to hurt
both the above two goals.

So, we make use of task_ratelimit to limit the update of dirty_ratelimit
in two ways:

1) avoid changing dirty rate when it's against the position control target
   (the adjusted rate will slow down the progress of dirty pages going
   back to setpoint).

2) limit the step size. task_ratelimit is changing values step by step,
   leaving a consistent trace comparing to the randomly jumping
   balanced_dirty_ratelimit. task_ratelimit also has the nice smaller
   errors in stable state and typically larger errors when there are big
   errors in rate.  So it's a pretty good limiting factor for the step
   size of dirty_ratelimit.

Note that bdi->dirty_ratelimit is always tracking balanced_dirty_ratelimit.
task_ratelimit is merely used as a limiting factor.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:57 +08:00
Wu Fengguang
be3ffa2764 writeback: dirty rate control
It's all about bdi->dirty_ratelimit, which aims to be (write_bw / N)
when there are N dd tasks.

On write() syscall, use bdi->dirty_ratelimit
============================================

    balance_dirty_pages(pages_dirtied)
    {
        task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio();
        pause = pages_dirtied / task_ratelimit;
        sleep(pause);
    }

On every 200ms, update bdi->dirty_ratelimit
===========================================

    bdi_update_dirty_ratelimit()
    {
        task_ratelimit = bdi->dirty_ratelimit * bdi_position_ratio();
        balanced_dirty_ratelimit = task_ratelimit * write_bw / dirty_rate;
        bdi->dirty_ratelimit = balanced_dirty_ratelimit
    }

Estimation of balanced bdi->dirty_ratelimit
===========================================

balanced task_ratelimit
-----------------------

balance_dirty_pages() needs to throttle tasks dirtying pages such that
the total amount of dirty pages stays below the specified dirty limit in
order to avoid memory deadlocks. Furthermore we desire fairness in that
tasks get throttled proportionally to the amount of pages they dirty.

IOW we want to throttle tasks such that we match the dirty rate to the
writeout bandwidth, this yields a stable amount of dirty pages:

        dirty_rate == write_bw                                          (1)

The fairness requirement gives us:

        task_ratelimit = balanced_dirty_ratelimit
                       == write_bw / N                                  (2)

where N is the number of dd tasks.  We don't know N beforehand, but
still can estimate balanced_dirty_ratelimit within 200ms.

Start by throttling each dd task at rate

        task_ratelimit = task_ratelimit_0                               (3)
                         (any non-zero initial value is OK)

After 200ms, we measured

        dirty_rate = # of pages dirtied by all dd's / 200ms
        write_bw   = # of pages written to the disk / 200ms

For the aggressive dd dirtiers, the equality holds

        dirty_rate == N * task_rate
                   == N * task_ratelimit_0                              (4)
Or
        task_ratelimit_0 == dirty_rate / N                              (5)

Now we conclude that the balanced task ratelimit can be estimated by

                                                      write_bw
        balanced_dirty_ratelimit = task_ratelimit_0 * ----------        (6)
                                                      dirty_rate

Because with (4) and (5) we can get the desired equality (1):

                                                       write_bw
        balanced_dirty_ratelimit == (dirty_rate / N) * ----------
                                                       dirty_rate
                                 == write_bw / N

Then using the balanced task ratelimit we can compute task pause times like:

        task_pause = task->nr_dirtied / task_ratelimit

task_ratelimit with position control
------------------------------------

However, while the above gives us means of matching the dirty rate to
the writeout bandwidth, it at best provides us with a stable dirty page
count (assuming a static system). In order to control the dirty page
count such that it is high enough to provide performance, but does not
exceed the specified limit we need another control.

The dirty position control works by extending (2) to

        task_ratelimit = balanced_dirty_ratelimit * pos_ratio           (7)

where pos_ratio is a negative feedback function that subjects to

1) f(setpoint) = 1.0
2) df/dx < 0

That is, if the dirty pages are ABOVE the setpoint, we throttle each
task a bit more HEAVY than balanced_dirty_ratelimit, so that the dirty
pages are created less fast than they are cleaned, thus DROP to the
setpoints (and the reverse).

Based on (7) and the assumption that both dirty_ratelimit and pos_ratio
remains CONSTANT for the past 200ms, we get

        task_ratelimit_0 = balanced_dirty_ratelimit * pos_ratio         (8)

Putting (8) into (6), we get the formula used in
bdi_update_dirty_ratelimit():

                                                write_bw
        balanced_dirty_ratelimit *= pos_ratio * ----------              (9)
                                                dirty_rate

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Wu Fengguang
af6a311384 writeback: add bg_threshold parameter to __bdi_update_bandwidth()
No behavior change.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Wu Fengguang
c8e28ce049 writeback: account per-bdi accumulated dirtied pages
Introduce the BDI_DIRTIED counter. It will be used for estimating the
bdi's dirty bandwidth.

CC: Jan Kara <jack@suse.cz>
CC: Michael Rubin <mrubin@google.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-10-03 21:08:56 +08:00
Linus Walleij
b1e3be0647 clocksource: fixup ux500 build problems
Based on a patch from Arnd Bergmann this fixes up the build
problem of assigning a non-existing global when the ux500 PRCMU
timer is not linked in by passing its base address to the init
function. We also add a missing <linux/errno.h> inclusion and
staticize the dummy function.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-10-03 09:34:16 +02:00
Dan Williams
f6e67035a9 [SCSI] libsas,libata: fix ->change_queue_{depth|type} for sata devices
Pass queue_depth change requests to libata, and prevent queue_type
changes for ATA devices.

Otherwise:
1/ we do not honor the libata specific restrictions on the queue depth
2/ libsas drivers that do not set sdev->tagged_supported are unable to
   change the queue_depth of ata devices via sysfs

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02 12:30:30 -05:00
MyungJoo Ham
ce26c5bb95 PM / devfreq: Add basic governors
Four cpufreq-like governors are provided as examples.

powersave: use the lowest frequency possible. The user (device) should
set the polling_ms as 0 because polling is useless for this governor.

performance: use the highest freqeuncy possible. The user (device)
should set the polling_ms as 0 because polling is useless for this
governor.

userspace: use the user specified frequency stored at
devfreq.user_set_freq. With sysfs support in the following patch, a user
may set the value with the sysfs interface.

simple_ondemand: simplified version of cpufreq's ondemand governor.

When a user updates OPP entries (enable/disable/add), OPP framework
automatically notifies devfreq to update operating frequency
accordingly. Thus, devfreq users (device drivers) do not need to update
devfreq manually with OPP entry updates or set polling_ms for powersave
, performance, userspace, or any other "static" governors.

Note that these are given only as basic examples for governors and any
devices with devfreq may implement their own governors with the drivers
and use them.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-02 00:19:34 +02:00
MyungJoo Ham
a3c98b8b2e PM: Introduce devfreq: generic DVFS framework with device-specific OPPs
With OPPs, a device may have multiple operable frequency and voltage
sets. However, there can be multiple possible operable sets and a system
will need to choose one from them. In order to reduce the power
consumption (by reducing frequency and voltage) without affecting the
performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
scheme may be used.

This patch introduces the DVFS capability to non-CPU devices with OPPs.
DVFS is a techique whereby the frequency and supplied voltage of a
device is adjusted on-the-fly. DVFS usually sets the frequency as low
as possible with given conditions (such as QoS assurance) and adjusts
voltage according to the chosen frequency in order to reduce power
consumption and heat dissipation.

The generic DVFS for devices, devfreq, may appear quite similar with
/drivers/cpufreq.  However, cpufreq does not allow to have multiple
devices registered and is not suitable to have multiple heterogenous
devices with different (but simple) governors.

Normally, DVFS mechanism controls frequency based on the demand for
the device, and then, chooses voltage based on the chosen frequency.
devfreq also controls the frequency based on the governor's frequency
recommendation and let OPP pick up the pair of frequency and voltage
based on the recommended frequency. Then, the chosen OPP is passed to
device driver's "target" callback.

When PM QoS is going to be used with the devfreq device, the device
driver should enable OPPs that are appropriate with the current PM QoS
requests. In order to do so, the device driver may call opp_enable and
opp_disable at the notifier callback of PM QoS so that PM QoS's
update_target() call enables the appropriate OPPs. Note that at least
one of OPPs should be enabled at any time; be careful when there is a
transition.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-02 00:19:15 +02:00
Linus Torvalds
f72a209a3e Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
* 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  irq: Fix check for already initialized irq_domain in irq_domain_add
  irq: Add declaration of irq_domain_simple_ops to irqdomain.h

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86/rtc: Don't recursively acquire rtc_lock

* 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  posix-cpu-timers: Cure SMP wobbles
  sched: Fix up wchan borkage
  sched/rt: Migrate equal priority tasks to available CPUs
2011-10-01 08:37:25 -07:00
MyungJoo Ham
03ca370fbf PM / OPP: Add OPP availability change notifier.
The patch enables to register notifier_block for an OPP-device in order
to get notified for any changes in the availability of OPPs of the
device. For example, if a new OPP is inserted or enable/disable status
of an OPP is changed, the notifier is executed.

This enables the usage of opp_add, opp_enable, and opp_disable to
directly take effect with any connected entities such as cpufreq or
devfreq.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Mike Turquette <mturquette@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-09-30 22:35:12 +02:00
Arik Nemtsov
07ba55d7f1 nl80211/mac80211: allow adding TDLS peers as stations
When adding a TDLS peer STA, mark it with a new flag in both nl80211 and
mac80211. Before adding a peer, make sure the wiphy supports TDLS and
our operating mode is appropriate (managed).

In addition, make sure all peers are removed on disassociation.

A TDLS peer is first added just before link setup is initiated. In later
setup stages we have more info about peer supported rates, capabilities,
etc. This info is reported via nl80211_set_station().

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: Kalyan C Gaddam <chakkal@iit.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:08 -04:00
Arik Nemtsov
dfe018bf99 mac80211: handle TDLS high-level commands and frames
Register and implement the TDLS cfg80211 callback functions.

Internally prepare and send TDLS management frames. We incorporate
local STA capabilities and supported rates with extra IEs given by
usermode. The resulting packet is either encapsulated in a data frame,
or assembled as an action frame. It is transmitted either directly or
through the AP, as mandated by the TDLS specification.

Declare support for the TDLS external setup wiphy capability. This
tells usermode to handle link setup and discovery on its own, and use the
kernel driver for sending TDLS mgmt packets.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: Kalyan C Gaddam <chakkal@iit.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:07 -04:00
Arik Nemtsov
109086ce0b nl80211: support sending TDLS commands/frames
Add support for sending high-level TDLS commands and TDLS frames via
NL80211_CMD_TDLS_OPER and NL80211_CMD_TDLS_MGMT, respectively. Add
appropriate cfg80211 callbacks for lower level drivers.

Add wiphy capability flags for TDLS support and advertise them via
nl80211.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: Kalyan C Gaddam <chakkal@iit.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-30 15:57:05 -04:00
John W. Linville
8e00f5fbb4 Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-pci.c
	drivers/net/wireless/wl12xx/main.c
2011-09-30 14:52:29 -04:00
Ohad Ben-Cohen
0ed6d2d27b iommu/core: let drivers know if an iommu fault handler isn't installed
Make report_iommu_fault() return -ENOSYS whenever an iommu fault
handler isn't installed, so IOMMU drivers can then do their own
platform-specific default behavior if they wanted.

Fault handlers can still return -ENOSYS in case they want to elicit the
default behavior of the IOMMU drivers.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-30 16:40:32 +02:00
Dimitris Papastamos
6eb0f5e015 regmap: Implement regcache_cache_bypass helper function
Ensure we've got a function so users can enable/disable the
cache bypass option.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-30 13:57:47 +01:00
Peter Zijlstra
d670ec1317 posix-cpu-timers: Cure SMP wobbles
David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$ 

  The diff of 'process' should always be >= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <fcntl.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
	  pthread_barrier_wait(&barrier);
	  while (1)
		  __asm__ __volatile__("" : : : "memory");
	  return NULL;
  }

  int main(void)
  {
	  clockid_t process_clock, my_thread_clock, th_clock;
	  struct timespec process_before, process_after;
	  struct timespec me_before, me_after;
	  struct timespec th_before, th_after;
	  struct timespec sleeptime;
	  unsigned long diff;
	  pthread_t th;
	  int err;

	  err = clock_getcpuclockid(0, &process_clock);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
	  if (err)
		  return 1;

	  pthread_barrier_init(&barrier, NULL, 2);
	  err = pthread_create(&th, NULL, chew_cpu, NULL);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(th, &th_clock);
	  if (err)
		  return 1;

	  pthread_barrier_wait(&barrier);

	  err = clock_gettime(process_clock, &process_before);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_before);
	  if (err)
		  return 1;

	  err = clock_gettime(th_clock, &th_before);
	  if (err)
		  return 1;

	  sleeptime.tv_sec = 0;
	  sleeptime.tv_nsec = 500000000;
	  nanosleep(&sleeptime, NULL);

	  err = clock_gettime(th_clock, &th_after);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_after);
	  if (err)
		  return 1;

	  err = clock_gettime(process_clock, &process_after);
	  if (err)
		  return 1;

	  diff = process_after.tv_nsec - process_before.tv_nsec;
	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 process_before.tv_sec, process_before.tv_nsec,
		 process_after.tv_sec, process_after.tv_nsec, diff);
	  diff = th_after.tv_nsec - th_before.tv_nsec;
	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 th_before.tv_sec, th_before.tv_nsec,
		 th_after.tv_sec, th_after.tv_nsec, diff);
	  diff = me_after.tv_nsec - me_before.tv_nsec;
	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 me_before.tv_sec, me_before.tv_nsec,
		 me_after.tv_sec, me_after.tv_nsec, diff);

	  return 0;
  }

This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().

Aside of that it makes the function safe on 32 bit systems. The old
code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
64bit value and could be changed on another cpu at the same time.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-30 14:07:06 +02:00
Ben Dooks
3a1e362e3f OF: Add of_match_ptr() macro
Add a macro of_match_ptr() that allows the .of_match_table
entry in the driver structures to be assigned without having
an #ifdef xxx NULL for the case that OF is not enabled

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-09-29 19:31:48 -06:00
Serge Hallyn
d178bc3a70 user namespace: usb: make usb urbs user namespace aware (v2)
Add to the dev_state and alloc_async structures the user namespace
corresponding to the uid and euid.  Pass these to kill_pid_info_as_uid(),
which can then implement a proper, user-namespace-aware uid check.

Changelog:
Sep 20: Per Oleg's suggestion: Instead of caching and passing user namespace,
	uid, and euid each separately, pass a struct cred.
Sep 26: Address Alan Stern's comments: don't define a struct cred at
	usbdev_open(), and take and put a cred at async_completed() to
	ensure it lasts for the duration of kill_pid_info_as_cred().

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-29 13:13:08 -07:00
Paul E. McKenney
82e78d80fc rcu: Simplify unboosting checks
Commit 7765be (Fix RCU_BOOST race handling current->rcu_read_unlock_special)
introduced a new ->rcu_boosted field in the task structure.  This is
redundant because the existing ->rcu_boost_mutex will be non-NULL at
any time that ->rcu_boosted is nonzero.  Therefore, this commit removes
->rcu_boosted and tests ->rcu_boost_mutex instead.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:39 -07:00
Paul E. McKenney
6206ab9bab rcu: Move __rcu_read_unlock()'s barrier() within if-statement
We only need to constrain the compiler if we are actually exiting
the top-level RCU read-side critical section.  This commit therefore
moves the first barrier() cal in __rcu_read_unlock() to inside the
"if" statement, thus avoiding needless register flushes for inner
rcu_read_unlock() calls.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:35 -07:00
Paul E. McKenney
6846c0c540 rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation
The differences between rcu_assign_pointer() and RCU_INIT_POINTER() are
subtle, and it is easy to use the the cheaper RCU_INIT_POINTER() when
the more-expensive rcu_assign_pointer() should have been used instead.
The consequences of this mistake are quite severe.

This commit therefore carefully lays out the situations in which it it
permissible to use RCU_INIT_POINTER().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:34 -07:00
Eric Dumazet
d322f45cee rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier
Recent changes to gcc give warning messages on rcu_assign_pointers()'s
checks that allow it to determine when it is OK to omit the memory
barrier.  Stephen Hemminger tried a number of gcc tricks to silence
this warning, but #pragmas and CPP macros do not work together in the
way that would be required to make this work.

However, we now have RCU_INIT_POINTER(), which already omits this
memory barrier, and which therefore may be used when assigning NULL to
an RCU-protected pointer that is accessible to readers.  This commit
therefore makes rcu_assign_pointer() unconditionally emit the memory
barrier.

Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:33 -07:00
Shi, Alex
fc0763f53e nohz: Remove nohz_cpu_mask
RCU no longer uses this global variable, nor does anyone else.  This
commit therefore removes this variable.  This reduces memory footprint
and also removes some atomic instructions and memory barriers from
the dyntick-idle path.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:29 -07:00
Paul E. McKenney
22507ed9b9 rcu: Remove unused and redundant interfaces
The rcu_dereference_bh_protected() and rcu_dereference_sched_protected()
macros are synonyms for rcu_dereference_protected() and are not used
anywhere in mainline.  This commit therefore removes them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:26 -07:00
Paul E. McKenney
965a002b4f rcu: Make TINY_RCU also use softirq for RCU_BOOST=n
This patch #ifdefs TINY_RCU kthreads out of the kernel unless RCU_BOOST=y,
thus eliminating context-switch overhead if RCU priority boosting has
not been configured.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:20 -07:00
Paul E. McKenney
29c00b4a1d rcu: Add event-tracing for RCU callback invocation
There was recently some controversy about the overhead of invoking RCU
callbacks.  Add TRACE_EVENT()s to obtain fine-grained timings for the
start and stop of a batch of callbacks and also for each callback invoked.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:12 -07:00
Paul E. McKenney
2c42818e96 rcu: Abstract common code for RCU grace-period-wait primitives
Pull the code that waits for an RCU grace period into a single function,
which is then called by synchronize_rcu() and friends in the case of
TREE_RCU and TREE_PREEMPT_RCU, and from rcu_barrier() and friends in
the case of TINY_RCU and TINY_PREEMPT_RCU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:36:42 -07:00
Paul E. McKenney
990987511c rcu: Move rcu_head definition to types.h
Take a first step towards untangling Linux kernel header files by
placing the struct rcu_head definition into include/linux/types.h
and including include/linux/types.h in include/linux/rcupdate.h
where struct rcu_head used to be defined.  The actual inclusion point
for include/linux/types.h is with the rest of the #include directives
rather than at the point where struct rcu_head used to be defined,
as suggested by Mathieu Desnoyers.

Once this is in place, then header files that need only rcu_head
can include types.h rather than rcupdate.h.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2011-09-28 21:36:38 -07:00
Paul E. McKenney
b3fbab0571 rcu: Restore checks for blocking in RCU read-side critical sections
Long ago, using TREE_RCU with PREEMPT would result in "scheduling
while atomic" diagnostics if you blocked in an RCU read-side critical
section.  However, PREEMPT now implies TREE_PREEMPT_RCU, which defeats
this diagnostic.  This commit therefore adds a replacement diagnostic
based on PROVE_RCU.

Because rcu_lockdep_assert() and lockdep_rcu_dereference() are now being
used for things that have nothing to do with rcu_dereference(), rename
lockdep_rcu_dereference() to lockdep_rcu_suspicious() and add a third
argument that is a string indicating what is suspicious.  This third
argument is passed in from a new third argument to rcu_lockdep_assert().
Update all calls to rcu_lockdep_assert() to add an informative third
argument.

Also, add a pair of rcu_lockdep_assert() calls from within
rcu_note_context_switch(), one complaining if a context switch occurs
in an RCU-bh read-side critical section and another complaining if a
context switch occurs in an RCU-sched read-side critical section.
These are present only if the PROVE_RCU kernel parameter is enabled.

Finally, fix some checkpatch whitespace complaints in lockdep.c.

Again, you must enable PROVE_RCU to see these new diagnostics.  But you
are enabling PROVE_RCU to check out new RCU uses in any case, aren't you?

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:36:37 -07:00
Richard Cochran
f75159e993 ptp: fix L2 event message recognition
The IEEE 1588 standard defines two kinds of messages, event and general
messages. Event messages require time stamping, and general do not. When
using UDP transport, two separate ports are used for the two message
types.

The BPF designed to recognize event messages incorrectly classifies L2
general messages as event messages. This commit fixes the issue by
extending the filter to check the message type field for L2 PTP packets.
Event messages are be distinguished from general messages by testing
the "general" bit.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 00:32:03 -04:00
Yoshihiro Shimoda
d4fa0e35fd net: sh_eth: move the asm/sh_eth.h to include/linux/
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-28 13:41:51 -04:00
Vladimir Zapolskiy
f786ecba41 connector: add comm change event report to proc connector
Add an event to monitor comm value changes of tasks.  Such an event
becomes vital, if someone desires to control threads of a process in
different manner.

A natural characteristic of threads is its comm value, and helpfully
application developers have an opportunity to change it in runtime.
Reporting about such events via proc connector allows to fine-grain
monitoring and control potentials, for instance a process control daemon
listening to proc connector and following comm value policies can place
specific threads to assigned cgroup partitions.

It might be possible to achieve a pale partial one-shot likeness without
this update, if an application changes comm value of a thread generator
task beforehand, then a new thread is cloned, and after that proc
connector listener gets the fork event and reads new thread's comm value
from procfs stat file, but this change visibly simplifies and extends the
matter.

Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-28 13:41:50 -04:00
Sebastian Andrzej Siewior
f01536e3d6 Input: add a driver for TSC-40 serial touchscreen
This patch adds the TSC-40 serial touchscreen driver and should be
compatible with TSC-10 and TSC-25.

The driver was written by Linutronix on behalf of Bachmann electronic GmbH.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-09-28 10:24:14 -07:00
Alex Shi
9f26490412 slub: correct comments error for per cpu partial
Correct comment errors, that mistake cpu partial objects number as pages
number, may make reader misunderstand.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-09-27 23:03:30 +03:00
Rajkumar Manoharan
e9f935e3e8 nl80211/cfg80211: Add support to disable CCK rate for management frame
Add a new nl80211 attribute to specify whether to send the management
frames in CCK rate or not. As of now the wpa_supplicant is disabling
CCK rate at P2P init itself. So this patch helps to send P2P probe
request/probe response/action frames being sent at non CCK rate in 2GHz
without disabling 11b rates.

This attribute is used with NL80211_CMD_TRIGGER_SCAN and
NL80211_CMD_FRAME commands to disable CCK rate for management frame
transmission.

Cc: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-27 14:34:10 -04:00
Paul Bolle
395cf9691d doc: fix broken references
There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-27 18:08:04 +02:00
Linus Torvalds
b6c8069d35 vfs: remove LOOKUP_NO_AUTOMOUNT flag
That flag no longer makes sense, since we don't look up automount points
as eagerly any more.  Additionally, it turns out that the NO_AUTOMOUNT
handling was buggy to begin with: it would avoid automounting even for
cases where we really *needed* to do the automount handling, and could
return ENOENT for autofs entries that hadn't been instantiated yet.

With our new non-eager automount semantics, one discussion has been
about adding a AT_AUTOMOUNT flag to vfs_fstatat (and thus the
newfstatat() and fstatat64() system calls), but it's probably not worth
it: you can always force at least directory automounting by simply
adding the final '/' to the filename, which works for *all* of the stat
family system calls, old and new.

So AT_NO_AUTOMOUNT (and thus LOOKUP_NO_AUTOMOUNT) really were just a
result of our bad default behavior.

Acked-by: Ian Kent <raven@themaw.net>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-27 08:12:33 -07:00
Russell King
40d3e0f494 clk: provide prepare/unprepare functions
As discussed previously, there's the need on some platforms to run some
parts of clk_enable() in contexts which can schedule.  The solution
which was agreed upon was to provide clk_prepare() and clk_unprepare()
to contain this parts, while clk_enable() and clk_disable() perform
the atomic part.

This patch provides a common definition for clk_prepare() and
clk_unprepare() in linux/clk.h, and provides an upgrade path for
existing implementation and drivers: drivers can start using
clk_prepare() and clk_unprepare() once this patch is merged without
having to wait for platform support.  Platforms can then start to
provide these additional functions.

Eventually, HAVE_CLK_PREPARE will be removed from the kernel, and
everyone will have to provide these new APIs.

Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-27 09:25:02 +01:00
Linus Torvalds
d94c177bee vfs pathname lookup: Add LOOKUP_AUTOMOUNT flag
Since we've now turned around and made LOOKUP_FOLLOW *not* force an
automount, we want to add the ability to force an automount event on
lookup even if we don't happen to have one of the other flags that force
it implicitly (LOOKUP_OPEN, LOOKUP_DIRECTORY, LOOKUP_PARENT..)

Most cases will never want to use this, since you'd normally want to
delay automounting as long as possible, which usually implies
LOOKUP_OPEN (when we open a file or directory, we really cannot avoid
the automount any more).

But Trond argued sufficiently forcefully that at a minimum bind mounting
a file and quotactl will want to force the automount lookup.  Some other
cases (like nfs_follow_remote_path()) could use it too, although
LOOKUP_DIRECTORY would work there as well.

This commit just adds the flag and logic, no users yet, though.  It also
doesn't actually touch the LOOKUP_NO_AUTOMOUNT flag that is related, and
was made irrelevant by the same change that made us not follow on
LOOKUP_FOLLOW.

Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-26 17:44:55 -07:00
Andiry Xu
65580b4321 xHCI: set USB2 hardware LPM
If the device pass the USB2 software LPM and the host supports hardware
LPM, enable hardware LPM for the device to let the host decide when to
put the link into lower power state.

If hardware LPM is enabled for a port and driver wants to put it into
suspend, it must first disable hardware LPM, resume the port into U0,
and then suspend the port.

Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-26 15:51:10 -07:00
Andiry Xu
1ff4df5684 usbcore: check device's LPM capability
Check device's LPM capability by examining the bmAttibutes field of the
USB2.0 Extension Descriptor.

Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-26 15:51:08 -07:00
Andiry Xu
3148bf041d usbcore: get BOS descriptor set
This commit gets BOS(Binary Device Object Store) descriptor set for Super
Speed devices and High Speed devices which support BOS descriptor.

BOS descriptor is used to report additional USB device-level capabilities
that are not reported via the Device descriptor. By getting BOS descriptor
set, driver can check device's device-level capability such as LPM
capability.

Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-26 15:51:08 -07:00
Richard Cochran
3ce23fa978 net: introduce ptp one step time stamp mode for sync packets
The IEEE 1588 standard (PTP) has a provision for a "one step" mode, where
time stamps on outgoing event packets are inserted into the packet by the
hardware on the fly. This patch adds a new flag for the SIOCSHWTSTAMP
ioctl that lets user space programs request this mode.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-26 16:02:46 -04:00
Rafael J. Wysocki
cd0ea672f5 PM / Domains: Split device PM domain data into base and need_restore
The struct pm_domain_data data type is defined in such a way that
adding new fields specific to the generic PM domains code will
require include/linux/pm.h to be modified.  As a result, data types
used only by the generic PM domains code will be defined in two
headers, although they all should be defined in pm_domain.h and
pm.h will need to include more headers, which won't be very nice.

For this reason change the definition of struct pm_subsys_data
so that its domain_data member is a pointer, which will allow
struct pm_domain_data to be subclassed by various PM domains
implementations.  Remove the need_restore member from
struct pm_domain_data and make the generic PM domains code
subclass it by adding the need_restore member to the new data type.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-09-26 20:22:02 +02:00
Rafael J. Wysocki
0d41da2e31 Merge branch 'pm-fixes' into pm-domains
Merge commit e8b364b88c
(PM / Clocks: Do not acquire a mutex under a spinlock) fixing
a regression in drivers/base/power/clock_ops.c.

Conflicts:
	drivers/base/power/clock_ops.c
2011-09-26 20:12:45 +02:00
Benjamin Tissoires
b77c3920e9 HID: add autodetection of multitouch devices
As mentioned by http://www.microsoft.com/whdc/device/input/DigitizerDrvs_touch.mspx
multitouch devices are those that have the input report HID_CONTACTID.

This patch detects this and unloads the generic-usb driver.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@enac.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-26 14:18:18 +02:00
Milan Broz
983c7db347 dm crypt: always disable discard_zeroes_data
If optional discard support in dm-crypt is enabled, discards requests
bypass the crypt queue and blocks of the underlying device are discarded.
For the read path, discarded blocks are handled the same as normal
ciphertext blocks, thus decrypted.

So if the underlying device announces discarded regions return zeroes,
dm-crypt must disable this flag because after decryption there is just
random noise instead of zeroes.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-09-25 23:26:21 +01:00
Avi Kivity
7460fb4a34 KVM: Fix simultaneous NMIs
If simultaneous NMIs happen, we're supposed to queue the second
and next (collapsing them), but currently we sometimes collapse
the second into the first.

Fix by using a counter for pending NMIs instead of a bool; since
the counter limit depends on whether the processor is currently
in an NMI handler, which can only be checked in vcpu context
(via the NMI mask), we add a new KVM_REQ_NMI to request recalculation
of the counter.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:52:59 +03:00
Jan Kiszka
bd80158aff KVM: Clean up and extend rate-limited output
The use of printk_ratelimit is discouraged, replace it with
pr*_ratelimited or __ratelimit. While at it, convert remaining
guest-triggerable printks to rate-limited variants.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:52:43 +03:00
Alexander Graf
930b412a00 KVM: PPC: Enable the PAPR CAP for Book3S
Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
mode.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:26 +03:00
Alexander Graf
a15bd354f0 KVM: PPC: Add support for explicit HIOR setting
Until now, we always set HIOR based on the PVR, but this is just wrong.
Instead, we should be setting HIOR explicitly, so user space can decide
what the initial HIOR value is - just like on real hardware.

We keep the old PVR based way around for backwards compatibility, but
once user space uses the SREGS based method, we drop the PVR logic.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:23 +03:00
Sasha Levin
743eeb0b01 KVM: Intelligent device lookup on I/O bus
Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
is to call the read or write callback for each device registered
on the bus until we find a device which handles it.

Since the number of devices on a bus can be significant due to ioeventfds
and coalesced MMIO zones, this leads to a lot of overhead on each IO
operation.

Instead of registering devices, we now register ranges which points to
a device. Lookup is done using an efficient bsearch instead of a linear
search.

Performance test was conducted by comparing exit count per second with
200 ioeventfds created on one byte and the guest is trying to access a
different byte continuously (triggering usermode exits).
Before the patch the guest has achieved 259k exits per second, after the
patch the guest does 274k exits per second.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:59 +03:00
Sasha Levin
2b3c246a68 KVM: Make coalesced mmio use a device per zone
This patch changes coalesced mmio to create one mmio device per
zone instead of handling all zones in one device.

Doing so enables us to take advantage of existing locking and prevents
a race condition between coalesced mmio registration/unregistration
and lookups.

Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:57 +03:00
Sasha Levin
8c3ba334f8 KVM: x86: Raise the hard VCPU count limit
The patch raises the hard limit of VCPU count to 254.

This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.

To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.

Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:57 +03:00
Laurent Pinchart
1cd7acc4ef USB: export video.h to the includes available for userspace
The uvcvideo extension unit API requires constants defined in the
video.h header. Add it to the list of includes exported to userspace.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-09-23 23:07:26 -03:00
David S. Miller
fb7a6d4e7d Merge git://github.com/Jkirsher/net-next 2011-09-23 13:56:44 -04:00
Greg Rose
6777829cfe pci: Add flag indicating device has been assigned by KVM
Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines.  By adding a flag for use by the KVM module
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.

CC: Ian Campbell <ijc@hellion.org.uk>
CC: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-09-23 09:05:44 -07:00
Colin Cross
ab10023e00 cpu_pm: Add cpu power management notifiers
During some CPU power modes entered during idle, hotplug and
suspend, peripherals located in the CPU power domain, such as
the GIC, localtimers, and VFP, may be powered down.  Add a
notifier chain that allows drivers for those peripherals to
be notified before and after they may be reset.

Notified drivers can include VFP co-processor, interrupt controller
and it's PM extensions, local CPU timers context save/restore which
shouldn't be interrupted. Hence CPU PM event APIs  must be called
with interrupts disabled.

Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-and-Acked-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Kevin Hilman <khilman@ti.com>
Tested-by: Vishwanath BS <vishwanath.bs@ti.com>
2011-09-23 12:05:29 +05:30
Søren Holm
06315348b1 serial: Support the EFR-register of XR1715x uarts.
The EFR (Enhenced-Features-Register) is located at a different offset
than the other devices supporting UART_CAP_EFR. This change add a special
setup quick to set UPF_EXAR_EFR on the port. UPF_EXAR_EFR is then used to
the port type to PORT_XR17D15X since it is for sure a XR17D15X uart.

Signed-off-by: Søren Holm <sgh@sgh.dk>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-22 15:50:38 -07:00
Stephen Warren
aba3dfff9a dt: add empty for_each_child_of_node, of_find_property
The patch adds a couple empty functions for non-dt build, so that
drivers migrating to dt can save some '#ifdef CONFIG_OF'.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-09-22 11:29:17 -06:00
Shawn Guo
611cad7201 dt: add of_alias_scan and of_alias_get_id
The patch adds function of_alias_scan to populate a global lookup
table with the properties of 'aliases' node and function
of_alias_get_id for drivers to find alias id from the lookup table.

v3: Split out automatic addition of aliases on id lookup so that it can be
    debated separately from the core functionality.
v2: - Add of_chosen/of_aliases populating and of_alias_scan() invocation
    for OF_PROMTREE.
    - Add locking
    - rework parse loop

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-09-22 11:12:10 -06:00
Linus Torvalds
fae3f6f2ee Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] kvm: extension capability for new address space layout
  [S390] kvm: fix address mode switching
2011-09-22 09:32:21 -07:00
Peter Ujfalusi
ab6cf13943 ASoC/MFD: twl6040: Combine bit definitions for Headset control registers
Use one set of defines for the HS bits, since they are identical in both
control register.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-22 17:20:22 +01:00
Peter Ujfalusi
d17bf31832 ASoC: twl6040: Introduce SW only shadow register
Software only shadow register to be used by the driver.
For example Earpiece path will need this shadow register.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-22 17:20:21 +01:00
Mattias Wallin
489bccea63 clocksource: add DBX500 PRCMU Timer support
This patch adds the DBX500 PRCMU Timer driver as a clocksource
and as sched_clock.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mattias Wallin <mattias.wallin@stericsson.com>
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-09-22 15:42:57 +02:00
Takashi Iwai
af1910a817 Merge branch 'topic/asoc' into topic/remove-irqf_disable 2011-09-22 09:56:12 +02:00
David S. Miller
8decf86879 Merge branch 'master' of github.com:davem330/net
Conflicts:
	MAINTAINERS
	drivers/net/Kconfig
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
	drivers/net/ethernet/broadcom/tg3.c
	drivers/net/wireless/iwlwifi/iwl-pci.c
	drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
	drivers/net/wireless/rt2x00/rt2800usb.c
	drivers/net/wireless/wl12xx/main.c
2011-09-22 03:23:13 -04:00
Linus Torvalds
fed678dc8a Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  floppy: use del_timer_sync() in init cleanup
  blk-cgroup: be able to remove the record of unplugged device
  block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request
  mm: Add comment explaining task state setting in bdi_forker_thread()
  mm: Cleanup clearing of BDI_pending bit in bdi_forker_thread()
  block: simplify force plug flush code a little bit
  block: change force plug flush call order
  block: Fix queue_flag update when rq_affinity goes from 2 to 1
  block: separate priority boosting from REQ_META
  block: remove READ_META and WRITE_META
  xen-blkback: fixed indentation and comments
  xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.
2011-09-21 13:20:21 -07:00
Ohad Ben-Cohen
300bab9770 hwspinlock/core: register a bank of hwspinlocks in a single API call
Hardware Spinlock devices usually contain numerous locks (known
devices today support between 32 to 256 locks).

Originally hwspinlock core required drivers to register (and later,
when needed, unregister) each lock separately.

That worked, but required hwspinlocks drivers to do a bit extra work
when they were probed/removed.

This patch changes hwspin_lock_{un}register() to allow a bank of
hwspinlocks to be {un}registered in a single invocation.

A new 'struct hwspinlock_device', which contains an array of 'struct
hwspinlock's is now being passed to the core upon registration (so
instead of wrapping each struct hwspinlock, a priv member has been added
to allow drivers to piggyback their private data with each hwspinlock).

While at it, several per-lock members were moved to be per-device:
1. struct device *dev
2. struct hwspinlock_ops *ops

In addition, now that the array of locks is handled by the core,
there's no reason to maintain a per-lock 'int id' member: the id of the
lock anyway equals to its index in the bank's array plus the bank's
base_id.
Remove this per-lock id member too, and instead use a simple pointers
arithmetic to derive it.

As a result of this change, hwspinlocks drivers are now simpler and smaller
(about %20 code reduction) and the memory footprint of the hwspinlock
framework is reduced.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
2011-09-21 19:45:34 +03:00
Ohad Ben-Cohen
c536abfdf5 hwspinlock/core: remove stubs for register/unregister
hwspinlock drivers must anyway select CONFIG_HWSPINLOCK,
so there's no point in having register/unregister stubs.

Removing those stubs will only make it easier for developers
to catch CONFIG_HWSPINLOCK mis-.config-urations.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
2011-09-21 19:45:34 +03:00
Ohad Ben-Cohen
c3c1250e93 hwspinlock/core/omap: fix id issues on multiple hwspinlock devices
hwspinlock devices provide system-wide hardware locks that are used
by remote processors that have no other way to achieve synchronization.

To achieve that, each physical lock must have a system-wide id number
that is agreed upon, otherwise remote processors can't possibly assume
they're using the same hardware lock.

Usually boards have a single hwspinlock device, which provides several
hwspinlocks, and in this case, they can be trivially numbered 0 to
(num-of-locks - 1).

In case boards have several hwspinlocks devices, a different base id
should be used for each hwspinlock device (they can't all use 0 as
a starting id!).

While this is certainly not common, it's just plain wrong to just
silently use 0 as a base id whenever the hwspinlock driver is probed.

This patch provides a hwspinlock_pdata structure, that boards can use
to set a different base id for each of the hwspinlock devices they may
have, and demonstrates how to use it with the omap hwspinlock driver.

While we're at it, make sure the hwspinlock core prints an explicit
error message in case an hwspinlock is registered with an id number
that already exists; this will help users catch such base id issues.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Acked-by: Tony Lindgren <tony@atomide.com>
2011-09-21 19:45:32 +03:00
Hans Verkuil
74a4579086 [media] videodev2.h: add V4L2_CTRL_FLAG_VOLATILE
Add a new VOLATILE control flag that is set for volatile controls.
That way applications know whether the value of the control is volatile
(i.e. can change continuously) or not.

Until now this was an internal property, but it is useful to know in
userspace as well.

A typical use case is the gain value when autogain is on. In that case the
hardware will continuously adjust the gain based various environmental
factors.

This patch just adds and documents the flag, it's not yet used.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-09-21 09:52:21 -03:00
hank
cbbc719fcc time: Change jiffies_to_clock_t() argument type to unsigned long
The parameter's origin type is long. On an i386 architecture, it can
easily be larger than 0x80000000, causing this function to convert it
to a sign-extended u64 type.

Change the type to unsigned long so we get the correct result.

Signed-off-by: hank <pyu@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <stable@kernel.org>
[ build fix ]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:28:51 +02:00
Suresh Siddha
d3f138106b iommu: Rename the DMAR and INTR_REMAP config options
Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
with the other IOMMU options.

Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
irq subsystem name.

And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:22:03 +02:00
Suresh Siddha
f5d1b97bcd iommu: Cleanup ifdefs in detect_intel_iommu()
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.386003047@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:21:57 +02:00
Suresh Siddha
318fe7df9d iommu: Move IOMMU specific code to intel-iommu.c
Move the IOMMU specific routines to intel-iommu.c leaving the
dmar.c to the common ACPI dmar code shared between DMA-remapping
and Interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.282401285@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:21:54 +02:00
Suresh Siddha
41750d31fc x86, x2apic: Enable the bios request for x2apic optout
On the platforms which are x2apic and interrupt-remapping
capable, Linux kernel is enabling x2apic even if the BIOS
doesn't. This is to take advantage of the features that x2apic
brings in.

Some of the OEM platforms are running into issues because of
this, as their bios is not x2apic aware. For example, this was
resulting in interrupt migration issues on one of the platforms.
Also if the BIOS SMI handling uses APIC interface to send SMI's,
then the BIOS need to be aware of x2apic mode that OS has
enabled.

On some of these platforms, BIOS doesn't have a HW mechanism to
turnoff the x2apic feature to prevent OS from enabling it.

To resolve this mess, recent changes to the VT-d2 specification:

 http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf

includes a mechanism that provides BIOS a way to request system
software to opt out of enabling x2apic mode.

Look at the x2apic optout flag in the DMAR tables before
enabling the x2apic mode in the platform. Also print a warning
that we have disabled x2apic based on the BIOS request.

Kernel boot parameter "intremap=no_x2apic_optout" can be used to
override the BIOS x2apic optout request.

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.171766616@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:21:50 +02:00
Suresh Jayaraman
75df713627 block: document blk-plug
Thus spake Andrew Morton:

"And I have the usual maintainability whine.  If someone comes up to
vmscan.c and sees it calling blk_start_plug(), how are they supposed to
work out why that call is there?  They go look at the blk_start_plug()
definition and it is undocumented.  I think we can do better than this?"

Adapted from the LWN article - http://lwn.net/Articles/438256/ by Jens
Axboe and from an earlier attempt by Shaohua Li to document blk-plug.

[akpm@linux-foundation.org: grammatical and spelling tweaks]
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-09-21 10:00:16 +02:00
Vinod Koul
0745c9a5e3 Merge branch 'samsung_dma' into next 2011-09-21 11:53:30 +05:30
Brian Norris
7387ce7732 mtd: define `mtd_is_*()' functions
These functions can be used instead of referencing -EUCLEAN and -EBADMSG
all over the place. They should help make code a little bit more
readable.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-21 09:19:06 +03:00
Michael Tandy
5eb9f900e5 Input: adxl34x - documentation cleanup
This patch clarifies a few bits of documentation in the header file
for the adxl34x driver.

Signed-off-by: Michael Tandy <lkml@mkt.me.uk>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-09-20 22:46:32 -07:00
Mimi Zohar
b78049831f lib: add error checking to hex2bin
hex2bin converts a hexadecimal string to its binary representation.
The original version of hex2bin did not do any error checking.  This
patch adds error checking and returns the result.

Changelog v1:
- removed unpack_hex_byte()
- changed return code from boolean to int

Changelog:
- use the new unpack_hex_byte()
- add __must_check compiler option (Andy Shevchenko's suggestion)
- change function API to return error checking result
  (based on Tetsuo Handa's initial patch)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2011-09-20 23:24:44 -04:00
Eric Dumazet
d24f22f3df ip6_tunnel: add optional fwmark inherit
Add IP6_TNL_F_USE_ORIG_FWMARK to ip6_tunnel, so that ip6_tnl_xmit2()
makes a route lookup taking into account skb->fwmark and doesnt cache
lookup result.

This permits more flexibility in policies and firewall setups.

To setup such a tunnel, "fwmark inherit" option should be added to "ip
-f inet6 tunnel" command.

Reported-by: Anders Franzen <Anders.Franzen@ericsson.com>
CC: Hans Schillström <hans.schillstrom@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-20 14:50:00 -04:00
Ilan Elias
8b3fe7b591 NFC: Add dev_up and dev_down control operations
Add 2 new nfc control operations:
dev_up to turn on the nfc device
dev_down to turn off the nfc device

Signed-off-by: Ilan Elias <ilane@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-20 14:43:49 -04:00
Christian Borntraeger
b6cf8788a3 [S390] kvm: extension capability for new address space layout
598841ca99 ([S390] use gmap address
spaces for kvm guest images) changed kvm on s390 to use a separate
address space for kvm guests. We can now put KVM guests anywhere
in the user address mode with a size up to 8PB - as long as the
memory is 1MB-aligned. This change was done without KVM extension
capability bit.
The change was added after 3.0, but we still have a chance to add
a feature bit before 3.1 (keeping the releases in a sane state).
We use number 71 to avoid collisions with other pending kvm patches
as requested by Alexander Graf.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-09-20 17:07:34 +02:00
Mark Brown
523d9cfbb2 mfd: Support software initiated shutdown of WM831x PMICs
In systems where there is no hardware signal from the processor to the
PMIC to initiate the final power off sequence we must initiate the
shutdown with a register write to the PMIC. Support such systems in the
driver. Since this may prevent a full shutdown of the system platform
data is used to enable the feature.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-09-20 13:02:09 +01:00
Rob Herring
5bd078dda4 irq: Add declaration of irq_domain_simple_ops to irqdomain.h
irq_domain_simple_ops is exported, but is not declared in irqdomain.h,
so add it.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: marc.zyngier@arm.com
Cc: thomas.abraham@linaro.org
Cc: jamie@jamieiles.com
Cc: b-cousson@ti.com
Cc: shawn.guo@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: devicetree-discuss@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1316017900-19918-2-git-send-email-robherring2@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-20 12:16:22 +02:00
Anton Blanchard
590e4d8571 sched: Allow SD_NODES_PER_DOMAIN to be overridden
We want to override the default value of SD_NODES_PER_DOMAIN on ppc64,
so move it into linux/topology.h.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:21 +10:00
Benjamin Herrenschmidt
1cce058b29 Merge remote-tracking branch 'origin/master' into next
(Merge in order to get the PCIe mps/mrss code fixes)
2011-09-20 15:52:38 +10:00
Milton Miller
3a8f7558e4 dma-mapping: Add get_required_mask if arch overrides default
If an architecture sets ARCH_HAS_DMA_GET_REQUIRED_MASK and has settable
dma_map_ops, the required mask may change by the ops implementation.
For example, a system that always has an mmu inline may only require 32
bits while a swiotlb would desire bits to cover all of memory.

Therefore add the field if the architecture does not use the generic
definition of dma_get_required_mask. The first use will by by powerpc.
Note that this does add some dependency on the order in which files are
visible here.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
2011-09-20 09:18:38 +10:00
Peter Ujfalusi
a52762eee9 ASoC: twl6040: Chip initialization cleanup
There is no need to write to the vio registers at probe time, since most
them either read only, or shared with MFD or not used.
On the other hand it is a good idea to updated the ASICREV register in
the cache at this time.

After power up we need to restore some registers. Clean up the list to
contain only the registers we are going to restore.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 23:15:57 +01:00
Peter Ujfalusi
a69882aec3 MFD: twl6040: Add accessor for revision ID
For client driver to use, if they need chip resvision information.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 23:15:38 +01:00
Jouni Malinen
c9df56b48e cfg80211/nl80211: Add PMKSA caching candidate event
When the driver (or most likely firmware) decides which AP to use
for roaming based on internal scan result processing, user space
needs to be notified of PMKSA caching candidates to allow RSN
pre-authentication to be used.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-19 16:10:14 -04:00
Rafał Miłecki
3861b2c5d9 bcma: cc: export more control functions
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-19 16:10:11 -04:00
Eliad Peller
0c28ec587a cfg80211: add cfg80211_find_vendor_ie() function
Add function to find vendor-specific ie (along with
vendor-specific ie struct definition and P2P OUI values)

Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-19 15:49:11 -04:00
John W. Linville
b53d63ecce Merge branch 'master' of ssh://infradead/~/public_git/wireless-next into for-davem 2011-09-19 15:00:16 -04:00
Mark Brown
92afb286d7 regmap: Allow drivers to control cache_only flag
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:37 +01:00
Mark Brown
39a58439d6 regmap: Prototype regcache_sync()
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:36 +01:00
Dimitris Papastamos
2cbbb579bc regmap: Add the LZO cache support
This patch adds support for LZO compression when storing the register
cache.

For a typical device whose register map would normally occupy 25kB or 50kB
by using the LZO compression technique, one can get down to ~5-7kB.  There
might be a performance penalty associated with each individual read/write
due to decompressing/compressing the underlying cache, however that should not
be noticeable.  These memory benefits depend on whether the target architecture
can get rid of the memory occupied by the original register defaults cache
which is marked as __devinitconst.  Nevertheless there will be some memory
gain even if the target architecture can't get rid of the original register
map, this should be around ~30-32kB instead of 50kB.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:33 +01:00
Dimitris Papastamos
28644c809f regmap: Add the rbtree cache support
This patch adds support for the rbtree cache compression type.

Each rbnode manages a variable length block of registers.  There can be no
two nodes with overlapping blocks.  Each block has a base register and a
currently top register, all the other registers, if any, lie in between these
two and in ascending order.

The reasoning behind the construction of this rbtree is simple.  In the
snd_soc_rbtree_cache_init() function, we iterate over the register defaults
provided by the regcache core.  For each register value that is non-zero we
insert it in the rbtree.  In order to determine in which rbnode we need
to add the register, we first look if there is another register already
added that is adjacent to the one we are about to add.  If that is the case
we append it in that rbnode block, otherwise we create a new rbnode
with a single register in its block and add it to the tree.

There are various optimizations across the implementation to speed up lookups
by caching the most recently used rbnode.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:33 +01:00
Dimitris Papastamos
195af65ca9 regmap: Add the indexed cache support
This is the simplest form of a cache available in regcache.  Any
registers whose default value is 0 are ignored.  If any of those
registers are modified in the future, they will be placed in the
cache on demand.  The cache layout is essentially using the provided
register defaults by the regcache core directly and does not re-map
it to another representation.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:32 +01:00
Dimitris Papastamos
9fabe24e9b regmap: Introduce caching support
This patch introduces caching support for regmap.  The regcache API
has evolved essentially out of ASoC soc-cache so most of the actual
caching types (except LZO) have been tested in the past.

The purpose of regcache is to optimize in time and space the handling
of register caches.  Time optimization is achieved by not having to go
over a slow bus like I2C to read the value of a register, instead it is
cached locally in memory and can be retrieved faster.  Regarding space
optimization, some of the cache types are better at packing the caches,
for e.g. the rbtree and the LZO caches.  By doing this the sacrifice in
time still wins over doing I2C transactions.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-19 19:06:31 +01:00
Steven Rostedt
6249687f76 tracing: Add a counter clock for those that do not trust clocks
When debugging tight race conditions, it can be helpful to have a
synchronized tracing method. Although in most cases the global clock
provides this functionality, if timings is not the issue, it is more
comforting to know that the order of events really happened in a precise
order.

Instead of using a clock, add a "counter" that is simply an incrementing
atomic 64bit counter that orders the events as they are perceived to
happen.

The trace_clock_counter() is added from the attempt by Peter Zijlstra
trying to convert the trace_clock_global() to it. I took Peter's counter
code and made trace_clock_counter() instead, and added it to the choice
of clocks. Just echo counter > /debug/tracing/trace_clock to activate
it.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-19 11:35:58 -04:00
Viresh Kumar
b7f69d9d42 dmaengine/amba-pl08x: Add support for sg len greater than one for slave transfers
Untill now, sg_len greater than one is not supported. This patch adds support to
do that.

Note: Still, if peripheral is flow controller, sg_len can't be greater that one.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-19 15:13:06 +05:30
Guennadi Liakhovetski
937bb6e4c6 serial: sh-sci: don't filter on DMA device, use only channel ID
On some sh-mobile systems there are more than one DMA controllers, that
can be used for serial ports. Specifying a DMA device in sh-sci platform
data unnecessarily restricts the driver to only use one DMA controller.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
[Fixed the trivial conflict in include/linux/serial_sci.h]
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-19 08:39:17 +05:30
Linus Torvalds
6bf3b0dc32 Merge branch 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6
* 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6:
  mfd: Fix omap-usb-host build failure
  mfd: Make omap-usb-host TLL mode work again
  mfd: Set MAX8997 irq pointer
  mfd: Fix initialisation of tps65910 interrupts
  mfd: Check for twl4030-madc NULL pointer
  mfd: Copy the device pointer to the twl4030-madc structure
  mfd: Rename wm8350 static gpio_set_debounce()
  mfd: Fix value of WM8994_CONFIGURE_GPIO
2011-09-18 18:18:55 -07:00
Timur Tabi
2b7a905dd0 drivers/video: fsl-diu-fb: remove unusued MEM_ALLOC_THRESHOLD
If there was ever any code that used MEM_ALLOC_THRESHOLD, it was removed a
long time ago.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-09-18 20:08:58 +00:00
Timur Tabi
251b9b0d40 drivers/video: fsl-diu-fb: the video buffer is not I/O memory
The video buffer is not uncached memory-mapped I/O, so don't tag the virtual
address as __iomem.  It's also not a u8*.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-09-18 20:08:58 +00:00
Timur Tabi
4a64e49df2 drivers/video: fsl-diu-fb: remove unused ioctls
Remove some unused ioctl commands, and treat those commands as unsupported
instead of ignored.

Also remove struct mfb_alpha, which isn't used by any ioctl.  It may have
been once intended for MFB_SET_ALPHA, but that ioctl uses a different
data structure.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
2011-09-18 20:08:56 +00:00
Linus Torvalds
b0e7031ac0 Merge git://github.com/davem330/net
* git://github.com/davem330/net: (62 commits)
  ipv6: don't use inetpeer to store metrics for routes.
  can: ti_hecc: include linux/io.h
  IRDA: Fix global type conflicts in net/irda/irsysctl.c v2
  net: Handle different key sizes between address families in flow cache
  net: Align AF-specific flowi structs to long
  ipv4: Fix fib_info->fib_metrics leak
  caif: fix a potential NULL dereference
  sctp: deal with multiple COOKIE_ECHO chunks
  ibmveth: Fix checksum offload failure handling
  ibmveth: Checksum offload is always disabled
  ibmveth: Fix issue with DMA mapping failure
  ibmveth: Fix DMA unmap error
  pch_gbe: support ML7831 IOH
  pch_gbe: added the process of FIFO over run error
  pch_gbe: fixed the issue which receives an unnecessary packet.
  sfc: Use 64-bit writes for TX push where possible
  Revert "sfc: Use write-combining to reduce TX latency" and follow-ups
  bnx2x: Fix ethtool advertisement
  bnx2x: Fix 578xx link LED
  bnx2x: Fix XMAC loopback test
  ...
2011-09-18 11:02:26 -07:00
Ingo Molnar
bfa322c48d Merge branch 'linus' into sched/core
Merge reason: We are queueing up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-18 14:01:39 +02:00
Greg Kroah-Hartman
073b854693 Merge branch 'for-next' of git://gitorious.org/usb/usb into usb-next
* 'for-next' of git://gitorious.org/usb/usb: (47 commits)
  usb: musb: Enable DMA mode1 RX for transfers without short packets
  usb: musb: fix build breakage
  usb: gadget: audio: queue wLength-sized requests
  usb: gadget: audio: actually support both speeds
  usb: gadget: storage: make FSG_NUM_BUFFERS variable size
  USB: gadget: storage: remove alignment assumption
  usb: gadget: storage: adapt logic block size to bound block devices
  usb: dwc3: gadget: improve debug on link state change
  usb: dwc3: omap: set idle and standby modes
  usb: dwc3: ep0: introduce ep0_expect_in flag
  usb: dwc3: ep0: giveback requests on stall_and_restart
  usb: dwc3: gadget: drop the useless dma_sync_single* calls
  usb: dwc3: gadget: fix GCTL programming
  usb: dwc3: define ScaleDown macro helper
  usb: dwc3: Fix definition of DWC3_GCTL_U2RSTECN
  usb: dwc3: gadget: do not map/unmap ZLP transfers
  usb: dwc3: omap: fix IRQ handling
  usb: dwc3: omap: change IRQ name to dwc3-omap
  usb: dwc3: add module.h to dwc3-omap.c and core.c
  usb: dwc3: omap: distinguish between SW and HW modes
  ...
2011-09-18 01:45:29 -07:00
Michal Nazarewicz
e538dfdae8 usb: Provide usb_speed_string() function
In a few places in the kernel, the code prints
a human-readable USB device speed (eg. "high speed").
This involves a switch statement sometimes wrapped
around in ({ ... }) block leading to code repetition.

To mitigate this issue, this commit introduces
usb_speed_string() function, which returns
a human-readable name of provided speed.

It also changes a few places switch was used to use
this new function.  This changes a bit the way the
speed is printed in few instances at the same time
standardising it.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-18 01:29:04 -07:00
Mauro Carvalho Chehab
7577911244 Merge tag 'v3.1-rc6' into staging/for_v3.2
* tag 'v3.1-rc6': (1902 commits)
  Linux 3.1-rc6
  ioctl: register LTTng ioctl
  fuse: fix memory leak
  fuse: fix flock breakage
  Btrfs: add dummy extent if dst offset excceeds file end in
  Btrfs: calc file extent num_bytes correctly in file clone
  btrfs: xattr: fix attribute removal
  Btrfs: fix wrong nbytes information of the inode
  Btrfs: fix the file extent gap when doing direct IO
  Btrfs: fix unclosed transaction handle in btrfs_cont_expand
  Btrfs: fix misuse of trans block rsv
  Btrfs: reset to appropriate block rsv after orphan operations
  Btrfs: skip locking if searching the commit root in csum lookup
  btrfs: fix warning in iput for bad-inode
  Btrfs: fix an oops when deleting snapshots
  [media] vp7045: fix buffer setup
  [media] nuvoton-cir: simplify raw IR sample handling
  [media] [Resend] viacam: Don't explode if pci_find_bus() returns NULL
  [media] v4l2: Fix documentation of the codec device controls
  [media] gspca - sonixj: Fix the darkness of sensor om6802 in 320x240
  ...
2011-09-17 10:29:49 -03:00
Ben Hutchings
473e64ee46 ethtool: Update ethtool_rxnfc::rule_cnt on return from ETHTOOL_GRXCLSRLALL
A user-space process must use ETHTOOL_GRXCLSRLCNT to find the number
of classification rules, then allocate a buffer of the right size,
then use ETHTOOL_GRXCLSRLALL to fill the buffer.  If some other
process inserts or deletes a rule between those two operations,
the user buffer might turn out to be the wrong size.

If it's too small, the return value will be -EMSGSIZE.  But if it's
too large, there is no indication of this.  Fix this by updating
the rule_cnt field on return.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 19:25:10 -04:00
Ben Hutchings
815c7db5c8 ethtool: Clean up definitions of rule location arrays in RX NFC
Correct the description of ethtool_rxnfc::rule_locs; it is an array
of currently used locations, not all possible valid locations.

Add note that drivers must not use ethtool_rxnfc::rule_locs.

The rule_locs argument to ethtool_ops::get_rxnfc is either NULL or a
pointer to an array of u32, so change the parameter type accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 19:25:10 -04:00
Ben Hutchings
434495c50e ethtool: Explicitly state that RX NFC rule locations are priorities
The location of an RX flow classification rule is needed to identify
it for retrieval, replacement or deletion.  However it also defines
the priority of the rule in the case that a flow is matched by
multiple rules.  This is what I intended to imply by referring to the
use of a TCAM, commonly used to implement that behaviour.

However there are other ways this can be done, and it is better to
specify this explicitly.  Further, I want to add the option for
automatic selection of rule locations.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 19:25:10 -04:00
Ben Hutchings
9927c893f4 ethtool: Make struct ethtool_rxnfc kernel-doc more self-consistent
Refer consistently to 'classification rules' or just 'rules' rather
than 'filter specifications' or 'filter rules'.

Refer consistently to rule 'locations' and not 'indices'.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 19:25:10 -04:00
stephen hemminger
d97a077a15 wan: make LAPB callbacks const
This is compile tested only.
Suggested by dumpster diving in PAX.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 19:20:20 -04:00
Oliver Hartkopp
c1aabdf379 can-gw: add netlink based CAN routing
This patch adds a CAN Gateway/Router to route (and modify) CAN frames.

It is based on the PF_CAN core infrastructure for msg filtering and msg
sending and can optionally modify routed CAN frames on the fly.
CAN frames can *only* be routed between CAN network interfaces (one hop).
They can be modified with AND/OR/XOR/SET operations as configured by the
netlink configuration interface known e.g. from iptables. From the netlink
view this can-gw implements RTM_{NEW|DEL|GET}ROUTE for PF_CAN.

The CAN specific userspace tool to manage CAN routing entries can be found in
the CAN utils http://svn.berlios.de/wsvn/socketcan/trunk/can-utils/cangw.c
at the SocketCAN SVN on BerliOS.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-16 17:37:51 -04:00
Eliad Peller
910868db3f nl80211/cfg80211/mac80211: fix wme docs
Add/fix some missing docs.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-16 16:36:35 -04:00
David S. Miller
52b9aca7ae Merge branch 'master' of ../netdev/ 2011-09-16 01:09:02 -04:00
Jiri Pirko
4bc71cb983 net: consolidate and fix ethtool_ops->get_settings calling
This patch does several things:
- introduces __ethtool_get_settings which is called from ethtool code and
  from drivers as well. Put ASSERT_RTNL there.
- dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
- changes calling in drivers so rtnl locking is respected. In
  iboe_get_rate was previously ->get_settings() called unlocked. This
  fixes it. Also prb_calc_retire_blk_tmo() in af_packet.c had the same
  problem. Also fixed by calling __dev_get_by_index() instead of
  dev_get_by_index() and holding rtnl_lock for both calls.
- introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create()
  so bnx2fc_if_create() and fcoe_if_create() are called locked as they
  are from other places.
- use __ethtool_get_settings() in bonding code

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

v2->v3:
	-removed dev_ethtool_get_settings()
	-added ASSERT_RTNL into __ethtool_get_settings()
	-prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock
	 around it and __ethtool_get_settings() call
v1->v2:
        add missing export_symbol
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits]
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-15 17:32:26 -04:00
Mark Einon
e8aaebc6b2 mii: Remove references to DP83840 PHY in mii.h
There are references to this PHY chip in the generic mii.h header, so removing them.
Re-jiggle the changed comments, in response to points raised by Ben Hutchings.

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-15 15:36:34 -04:00
Mark Einon
c44f7eb4c1 mii: Convert spaces to tabs in mii.h
Whitespace changes - spaces converted to tabs after each define name and value

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-15 15:36:34 -04:00
Michael S. Tsirkin
48c830120f net: copy userspace buffers on device forwarding
dev_forward_skb loops an skb back into host networking
stack which might hang on the memory indefinitely.
In particular, this can happen in macvtap in bridged mode.
Copy the userspace fragments to avoid blocking the
sender in that case.

As this patch makes skb_copy_ubufs extern now,
I also added some documentation and made it clear
the SKBTX_DEV_ZEROCOPY flag automatically instead
of doing it in all callers. This can be made into a separate
patch if people feel it's worth it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-15 14:49:44 -04:00
Eric Dumazet
946cedccbd tcp: Change possible SYN flooding messages
"Possible SYN flooding on port xxxx " messages can fill logs on servers.

Change logic to log the message only once per listener, and add two new
SNMP counters to track :

TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client

TCPReqQFullDrop : number of times a SYN request was dropped because
syncookies were not enabled.

Based on a prior patch from Tom Herbert, and suggestions from David.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-15 14:49:43 -04:00
Jiri Kosina
e060c38434 Merge branch 'master' into for-next
Fast-forward merge with Linus to be able to merge patches
based on more recent version of the tree.
2011-09-15 15:08:18 +02:00
Jesper Juhl
e81b15168e Remove unneeded version.h includes from include/
It was pointed out by 'make versioncheck' that some includes of
linux/version.h are not needed in include/.
This patch removes them.

When I last posted the patch, the ceph bit was ACK'ed by Sage Weil, so
I've added that below.

The pwc-ioctl change generated quite a bit of discussion about V4L version
numbers in general, but as far as I can tell, no concensus was reached on
what the long term solution should be, so in the mean time I think we
could start by just removing the unneeded include, which is why I'm
resending the patch with that hunk still included.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Sage Weil <sage@newdream.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:57:06 +02:00
Joe Perches
1d273b929c drbd: Use angle brackets for system includes
Use the normal include style.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:02:57 +02:00
Russell King
4f5b04800a drivers/gpio/gpio-generic.c: fix build errors
Building a kernel with hotplug disabled results in a link failure:

  `bgpio_remove' referenced in section `___ksymtab_gpl+bgpio_remove' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o

This is because of bgpio_remove() is exported.  It is illegal to export
symbols which are discarded either at link time or as part of an
init/exit section.

Fix this by dropping the __devexit attributation from bgpio_remove().
Also drop the __devinit attributation from bgpio_init().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-14 18:09:38 -07:00
Johannes Weiner
185efc0f9a memcg: Revert "memcg: add memory.vmscan_stat"
Revert the post-3.0 commit 82f9d486e5 ("memcg: add
memory.vmscan_stat").

The implementation of per-memcg reclaim statistics violates how memcg
hierarchies usually behave: hierarchically.

The reclaim statistics are accounted to child memcgs and the parent
hitting the limit, but not to hierarchy levels in between.  Usually,
hierarchical statistics are perfectly recursive, with each level
representing the sum of itself and all its children.

Since this exports statistics to userspace, this may lead to confusion
and problems with changing things after the release, so revert it now,
we can try again later.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-14 18:09:38 -07:00
Mimi Zohar
566be59ab8 evm: permit mode bits to be updated
Before permitting 'security.evm' to be updated, 'security.evm' must
exist and be valid.  In the case that there are no existing EVM protected
xattrs, it is safe for posix acls to update the mode bits.

To differentiate between no 'security.evm' xattr and no xattrs used to
calculate 'security.evm', this patch defines INTEGRITY_NOXATTR.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2011-09-14 15:24:52 -04:00
Mimi Zohar
bf6d0f5dcd evm: posix acls modify i_mode
The posix xattr acls are 'system' prefixed, which normally would not
affect security.evm.  An interesting side affect of writing posix xattr
acls is their modifying of the i_mode, which is included in security.evm.

This patch updates security.evm when posix xattr acls are written.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2011-09-14 15:24:51 -04:00
Javier Cardona
2154c81c32 mac80211: Mesh data frames must have the QoS header
Per sec 7.1.3.5 of draft 12.0 of 802.11s, mesh frames indicate the
presence of the mesh control header in their QoS header.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-14 13:56:50 -04:00
Daniel Mack
00137425fe USB: Add endpoint usage definitions to ch9.h
The endpoint usage field is described in the USB 2.0 specification,
chapter 9.6.6.

Also, move the sync type fields block down by some lines to reflect the
fact that these are also stuffed in bmAttributes.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-09-14 17:06:47 +02:00
Ohad Ben-Cohen
4f3f8d9db3 iommu/core: Add fault reporting mechanism
Add iommu fault report mechanism to the IOMMU API, so implementations
could report about mmu faults (translation errors, hardware errors,
etc..).

Fault reports can be used in several ways:
- mere logging
- reset the device that accessed the faulting address (may be necessary
  in case the device is a remote processor for example)
- implement dynamic PTE/TLB loading

A dedicated iommu_set_fault_handler() API has been added to allow
users, who are interested to receive such reports, to provide
their handler.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-14 15:35:36 +02:00
Mi Jinlong
038c01598e SUNRPC: compare scopeid for link-local addresses
For ipv6 link-local addresses, sunrpc do not compare those scope id.
This patch let sunrpc compares scope id only on link-local addresses.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-14 08:22:02 -04:00
Mi Jinlong
849a1cf13d SUNRPC: Replace svc_addr_u by sockaddr_storage
For IPv6 local address, lockd can not callback to client for
missing scope id when binding address at inet6_bind:

 324       if (addr_type & IPV6_ADDR_LINKLOCAL) {
 325               if (addr_len >= sizeof(struct sockaddr_in6) &&
 326                   addr->sin6_scope_id) {
 327                       /* Override any existing binding, if another one
 328                        * is supplied by user.
 329                        */
 330                       sk->sk_bound_dev_if = addr->sin6_scope_id;
 331               }
 332
 333               /* Binding to link-local address requires an interface */
 334               if (!sk->sk_bound_dev_if) {
 335                       err = -EINVAL;
 336                       goto out_unlock;
 337               }

Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info
besides address.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-14 08:21:48 -04:00
Mark Brown
da07ecd93b regulator: Implement deferred disable support
It is a reasonably common pattern for hardware to require some delay after
being quiesced before the disable has finalised, especially in mixed signal
devices. For example, an active discharge may be required to ensure that
the circuit starts up again in a known state. Avoid having to implement
such delays in the regulator API by providing regulator_deferred_disable()
which will do a regulator_disable() a specified number of milliseconds
after it is called.

Due to the reference counting done on regulators a deferred disable can
be cancelled by doing another regulator_enable().

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
2011-09-14 10:58:23 +01:00
Boojin Kim
1b9bb715e7 DMA: PL330: Update PL330 DMA API driver
This patch updates following 3 items.
1. Removes unneccessary code.
2. Add AMBA, PL330 configuration
3. Change the meaning of 'peri_id' variable
   from PL330 event number to specific dma id by user.

Signed-off-by: Boojin Kim <boojin.kim@samsung.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-14 11:10:01 +05:30
Trond Myklebust
2f1ddda174 NFSD: Remove the ex_pathname field from struct svc_export
There are no more users...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 22:44:10 -04:00
Trond Myklebust
ed748aacb8 NFSD: Cleanup for nfsd4_path()
The current code is sort of hackish in that it assumes a referral is always
matched to an export. When we add support for junctions that may not be the
case.
We can replace nfsd4_path() with a function that encodes the components
directly from the dentries. Since nfsd4_path is currently the only user of
the 'ex_pathname' field in struct svc_export, this has the added benefit
of allowing us to get rid of that.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 22:43:42 -04:00
Luciano Coelho
a1f1c21c18 nl80211/cfg80211: add match filtering for sched_scan
Introduce filtering for scheduled scans to reduce the number of
unnecessary results (which cause useless wake-ups).

Add a new nested attribute where sets of parameters to be matched can
be passed when starting a scheduled scan.  Only scan results that
match any of the sets will be returned.

At this point, the set consists of a single parameter, an SSID.  This
can be easily extended in the future to support more complex matches.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-13 15:53:45 -04:00
Eliad Peller
cedb5412ba nl80211/cfg80211: add WIPHY_FLAG_AP_UAPSD flag
add WIPHY_FLAG_AP_UAPSD flag to indicate uapsd support on
AP mode.

Advertise it to userspace by including a new
NL80211_ATTR_SUPPORT_AP_UAPSD attribute.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-13 15:50:56 -04:00
Vivek Natarajan
f4b34b550a cfg80211/nl80211: Indicate roaming feature capability to userspace.
When the rssi of the current AP drops, both wpa_supplicant and the
firmware may do a background scan to find a better AP and try to
associate. Since firmware based roaming is faster, inform
wpa_supplicant to avoid roaming and let the firmware decide to
roam if necessary.

For fullmac drivers like ath6kl, it is just enough to provide the
ESSID and the firmware will decide on the BSSID. Since it is not
possible to do pre-auth during roaming for fullmac drivers, the
wpa_supplicant needs to completely disconnect with the old AP and
reconnect with the new AP. This consumes lot of time and it is
better to leave the roaming decision to the firmware.

Signed-off-by: Vivek Natarajan <nataraja@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-13 15:42:31 -04:00
Rafał Miłecki
e2d646ce6c ssb: use u16 for storing board rev
Specs say about size 2 (u16) and my 14e4:4727 has board rev 0x1211.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-09-13 15:42:30 -04:00
Thomas Gleixner
3b8f404815 locking, x86, iommu: Annotate qi->q_lock as raw
The qi->q_lock lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:12:20 +02:00
Thomas Gleixner
1f5b3c3fd2 locking, x86, iommu: Annotate iommu->register_lock as raw
The iommu->register_lock can be taken in atomic context and therefore
must not be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:12:17 +02:00
Thomas Gleixner
2d21a29fb6 locking, oprofile: Annotate oprofilefs lock as raw
The oprofilefs_lock can be taken in atomic context (in profiling
interrupts) and therefore cannot cannot be preempted on -rt -
annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:12:05 +02:00
Thomas Gleixner
ddb6c9b58a locking, rwsem: Annotate inner lock as raw
There is no reason to allow the lock protecting rwsems (the
ownerless variant) to be preemptible on -rt. Convert it to raw.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:59 +02:00
Thomas Gleixner
8292c9e15c locking, semaphores: Annotate inner lock as raw
There is no reason to have the spin_lock protecting the semaphore
preemptible on -rt. Annotate it as a raw_spinlock.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

( On rt this also solves lockdep complaining about the
  rt_mutex.wait_lock being not initialized. )

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:57 +02:00
Thomas Gleixner
ee30a7b2fc locking, sched: Annotate thread_group_cputimer as raw
The thread_group_cputimer lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:55 +02:00
Thomas Gleixner
07354eb1a7 locking, printk: Annotate logbuf_lock as raw
The logbuf_lock lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ merged and fixed it ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:54 +02:00
Thomas Gleixner
740969f91e locking, lib/proportions: Annotate prop_local_percpu::lock as raw
The prop_local_percpu::lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:50 +02:00
Thomas Gleixner
f032a45081 locking, percpu_counter: Annotate ::lock as raw
The percpu_counter::lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:47 +02:00
Thomas Gleixner
ec484608c5 locking, kprobes: Annotate the hash locks and kretprobe.lock as raw
The kprobe locks can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-13 11:11:45 +02:00
Christoph Hellwig
5a7bbad27a block: remove support for bio remapping from ->make_request
There is very little benefit in allowing to let a ->make_request
instance update the bios device and sector and loop around it in
__generic_make_request when we can archive the same through calling
generic_make_request from the driver and letting the loop in
generic_make_request handle it.

Note that various drivers got the return value from ->make_request and
returned non-zero values for errors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-09-12 12:12:01 +02:00
Jens Axboe
c20e8de27f block: rename __make_request() to blk_queue_bio()
Now that it's exported, lets put it in a more sane namespace.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-09-12 12:08:31 +02:00
Christoph Hellwig
166e1f901b block: export __make_request
Avoid the hacks need for request based device mappers currently by simply
exporting the symbol instead of trying to get it through the back door.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-09-12 12:08:27 +02:00
Santosh Shilimkar
60f96b41f7 genirq: Add IRQCHIP_SKIP_SET_WAKE flag
Some irq chips need the irq_set_wake() functionality, but do not
require a irq_set_wake() callback. Instead of forcing an empty
callback to be implemented add a flag which notes this fact. Check for
the flag in set_irq_wake_real() and return success when set.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2011-09-12 09:52:49 +02:00
Brian Norris
4a89ff885f mtd: nand: kill member ops' of struct nand_chip'
The nand_chip.ops field is a struct that is passed around globally with
no particular reason. Every time it is used, it could just as easily be
replaced with a local struct that is updated on each operation. So make
it local.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:57:44 +03:00
Brian Norris
4180f24a7b mtd: document ABI
We're missing a lot of important documentation in include/mtd/mtd-abi.h:

* add a simple description of each ioctl (feel free to expand!)
* give full explanations of recently added and modified operations
* explain the usage of "RAW" that appear in different modes and types of
  operations
* fix some comment style along the way

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:57:44 +03:00
Brian Norris
0612b9ddc2 mtd: rename MTD_OOB_* to MTD_OPS_*
These modes are not necessarily for OOB only. Particularly, MTD_OOB_RAW
affected operations on in-band page data as well. To clarify these
options and to emphasize that their effect is applied per-operation, we
change the primary prefix to MTD_OPS_.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:28:59 +03:00
Brian Norris
905c6bcdb4 mtd: move mtd_oob_mode_t to shared kernel/user space
We will want to use the MTD_OOB_{PLACE,AUTO,RAW} modes in user-space
applications through the introduction of new ioctls, so we should make
this enum a shared type.

This enum is now anonymous.

Artem: tweaked the patch.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:26:20 +03:00
Brian Norris
c46f6483d2 mtd: support reading OOB without ECC
This fixes issues with `nanddump -n' and the MEMREADOOB[64] ioctls on
hardware that performs error correction when reading only OOB data. A
driver for such hardware needs to know when we're doing a RAW vs. a
normal write, but mtd_do_read_oob does not pass such information to the
lower layers (e.g., NAND). We should pass MTD_OOB_RAW or MTD_OOB_PLACE
based on the MTD file mode.

For now, most drivers can get away with just setting:

  chip->ecc.read_oob_raw = chip->ecc.read_oob

This is done by default; but for systems that behave as described above,
you must supply your own replacement function.

This was tested with nandsim as well as on actual SLC NAND.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Cc: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:13:38 +03:00
Brian Norris
e9195edc59 mtd: nand: document nand_chip.oob_poi
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:02:19 +03:00
Brian Norris
9ce244b3fb mtd: support writing OOB without ECC
This fixes issues with `nandwrite -n -o' and the MEMWRITEOOB[64] ioctls
on hardware that writes ECC when writing OOB. The problem arises as
follows: `nandwrite -n' can write page data to flash without applying
ECC, but when used with the `-o' option, ECC is applied (incorrectly),
contrary to the `--noecc' option.

I found that this is the case because my hardware computes and writes
ECC data to flash upon either OOB write or page write. Thus, to support
a proper "no ECC" write, my driver must know when we're performing a raw
OOB write vs. a normal ECC OOB write. However, MTD does not pass any raw
mode information to the write_oob functions.  This patch addresses the
problems by:

1) Passing MTD_OOB_RAW down to lower layers, instead of just defaulting
   to MTD_OOB_PLACE
2) Handling MTD_OOB_RAW within the NAND layer's `nand_do_write_oob'
3) Adding a new (replaceable) function pointer in struct ecc_ctrl; this
   function should support writing OOB without ECC data. Current
   hardware often can use the same OOB write function when writing
   either with or without ECC

This was tested with nandsim as well as on actual SLC NAND.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Cc: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:02:18 +03:00
Brian Norris
e2e24e8ebf mtd: style fixups in multi-line comment, indentation
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:02:18 +03:00
Brian Norris
32c8db8f62 mtd: nand: fix spelling error (date => data)
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:02:18 +03:00
Brian Norris
87ed114bb2 mtd: remove CONFIG_MTD_DEBUG
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2011-09-11 15:02:16 +03:00
Tobias Klauser
b4ca74738a mtd: plat-nand: Fixup kerneldoc for struct platform_nand_chip
The set_parts and priv members of struct platform_nand_chip where
removed in commit c36a6ef3845262ade529afb9f458738b1f196f83 but the
kerneldoc wasn't updated.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-09-11 15:02:14 +03:00
Brian Norris
7854d3f749 mtd: spelling, capitalization, uniformity
Therefor -> Therefore
[Intern], [Internal] -> [INTERN]
[REPLACABLE] -> [REPLACEABLE]
syndrom, syndom -> syndrome
ecc -> ECC
buswith -> buswidth
endianess -> endianness
dont -> don't
occures -> occurs
independend -> independent
wihin -> within
erease -> erase
blockes -> blocks
...

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:13 +03:00
Dmitry Eremin-Solenikov
15c60a508a mtd: drop mtd_device_register
mtd_device_register() is a limited version of mtd_device_parse_register.
Replace it with macro calling mtd_device_parse_register().

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:13 +03:00
Dmitry Eremin-Solenikov
953b3bd191 mtd: remove put_partition_parser() from public header
There is no need to pollute public header with a definition private
to mtdpart.c. Move it from mtd/partitions.h to mtdpart.c

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:13 +03:00
Dmitry Eremin-Solenikov
3165f44bcd mtd: hide parse_mtd_partitions
There is no need to export parse_mtd_partitions() now , as it's fully handled
by registration functions. So move the definition to private header and
remove respective EXPORT_SYMBOL_GPL.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:13 +03:00
Kyungmin Park
e1c10243df mtd: OneNAND: Detect the correct NOP when 4KiB pagesize
There are two different 4KiB pagesize chips
KFM4G16Q4M series have NOP 4 with version ID 0x0131
But KFM4G16Q5M has NOP 1 with versoin ID 0x013e

Note that Q5M means that it has NOP 1.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:13 +03:00
Dmitry Eremin-Solenikov
628376fb53 mtd: drop of_mtd_parse_partitions()
All users have been converted to call of_mtd_parse_partitions through
parse_mtd_partitions() multiplexer. Drop obsolete API.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-09-11 15:02:12 +03:00
Dmitry Eremin-Solenikov
d26c87d64e mtd: prepare to convert of_mtd_parse_partitions to partition parser
Prepare to convert of_mtd_parse_partitions() to usual partitions parser:
1) Register ofpart parser
2) Internally don't use passed device for error printing
3) Add device_node to mtd_part_parser_data struct
4) Move of_mtd_parse_partitions from __devinit to common text section
5) add ofpart to the default list of partition parsers

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-09-11 15:02:10 +03:00
Dmitry Eremin-Solenikov
c797533015 mtd: abstract last MTD partition parser argument
Encapsulate last MTD partition parser argument into a separate
structure. Currently it holds only 'origin' field for RedBoot parser,
but will be extended in future to contain at least device_node for OF
devices.

Amended commentary to make kerneldoc happy

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-09-11 15:02:10 +03:00
Dmitry Eremin-Solenikov
1c4c215cbd mtd: add new API for handling MTD registration
Lots (nearly all) mtd drivers contain nearly the similar code that
calls parse_mtd_partitions, provides some platform-default values, if
parsing fails, and registers  mtd device.

This is an aim to provide single implementation of this scenario:
mtd_device_parse_register() which will handle all this parsing and
defaults.

Artem: amended comments

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:05 +03:00
Dmitry Eremin-Solenikov
0dc8626a17 mtd: plat-nand: drop unused fields from platform_nand_data
Drop now unused set_parts from struct platform_nand_data. Also, while we are
at it, drop long unused priv field from platform_nand_data.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:04 +03:00
Dmitry Eremin-Solenikov
1a31368bf9 mtd: add a flags for partitions which should just leave smth. after them
Add support for MTDPART_OFS_RETAIN: such partitions start at the current
offset, take as much space as possible, but rain part->size bytes after
the end of the partitions for other parts. Primarily this is intended
for ts72xx arm platforms cleanup.

Artem: tweaked the patch a bit

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:02:04 +03:00
Dmitry Eremin-Solenikov
ae2dbad7e9 mtd: drop mtd_has_cmdlinepart()
This function is unused now. Drop it.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:58 +03:00
Dmitry Eremin-Solenikov
13e0fe49f6 mtd: drop physmap_configure
physmap_configure() and physmap_set_partitions() have no users in kernel.
Out of kernel users should have been converted to regular platform device
long ago. Drop support for this obsolete API.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:57 +03:00
Brian Norris
9eeff82436 mtd: nand: improve comment on NAND_BBT_DYNAMIC_STRUCT
In an attempt to improve the documentation of the BBT code, I am expanding
the comments I left in commit:
    58373ff0af
    mtd: nand: more BB Detection refactoring and dynamic scan options

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:57 +03:00
Brian Norris
b4dc53e16f mtd: nand: renumber the reorganized flags in nand.h / bbm.h
After several steps of rearrangement and consolidation, it is probably
worth re-sequencing the numbers on some of our affected flags in nand.h
and bbm.h.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
53d5d88850 mtd: nand: rename CREATE_EMPTY bbt flag with proper prefix
According to our new prefix rules, we should rename NAND_CREATE_EMPTY_BBT
with a NAND_BBT prefix, i.e., NAND_BBT_CREATE_EMPTY.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
b8f8068405 mtd: nand: move NAND_CREATE_EMPTY_BBT flag
The NAND_CREATE_EMPTY_BBT flag was added by commit:
  453281a973
  mtd: nand: introduce NAND_CREATE_EMPTY_BBT
This flag is not used within the kernel and not explained well, so I
took the liberty to edit its comments.

Also, this is a BBT-related flag (and closely tied with NAND_BBT_CREATE)
so I'm moving it to bbm.h next to NAND_BBT_CREATE, thus requiring that
we use the flag in nand_chip.bbt_options, *not* in nand_chip.options.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
bb9ebd4e71 mtd: nand: rename NAND_USE_FLASH_BBT
Recall the recently added prefix requirements:
 * "NAND_" for flags in nand.h, used in nand_chip.options
 * "NAND_BBT_" for flags in bbm.h, used in nand_chip.bbt_options
        or in nand_bbt_descr.options

Thus, I am changing NAND_USE_FLASH_BBT to NAND_BBT_USE_FLASH.

Again, this flag is found in bbm.h and so should NOT be used in the
"nand_chip.options" field.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
a40f73419f mtd: nand: consolidate redundant flash-based BBT flags
This patch works with the following three flags from two headers (nand.h
and bbm.h):
  (1) NAND_USE_FLASH_BBT (nand.h)
  (2) NAND_USE_FLASH_BBT_NO_OOB (nand.h)
  (3) NAND_BBT_NO_OOB (bbm.h)

These flags are all related and interdependent, yet they were in
different headers. Flag (2) is simply the combination of (1) and (3) and
can be eliminated.

This patch accomplishes the following:
  * eliminate NAND_USE_FLASH_BBT_NO_OOB (i.e., flag (2))
  * move NAND_USE_FLASH_BBT (i.e., flag (1)) to bbm.h

It's important to note that because (1) and (3) are now both found in
bbm.h, they should NOT be used in the "nand_chip.options" field.

I removed a small section from the mtdnand DocBook because it referes to
NAND_USE_FLASH_BBT in nand.h, which has been moved to bbm.h.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
5fb1549dfc mtd: nand: separate chip options / bbt_options
This patch handles the problems we've been having with using conflicting
flags from nand.h and bbm.h in the same nand_chip.options field. We
should try to separate these two spaces a little more clearly, and so I
have added a bbt_options field to nand_chip.

Important notes about nand_chip fields:
* bbt_options field should contain ONLY flags from bbm.h. They should be
  able to pass safely to a nand_bbt_descr data structure.
    - BBT option flags start with the "NAND_BBT_" prefix.
* options field should contian ONLY flags from nand.h. Ideally, they
  should not be involved in any BBT related options.
    - NAND chip option flags start with the "NAND_" prefix.
* Every flag should have a nice comment explaining what the flag is. While
  this is not yet the case on all existing flags, please be sure to write
  one for new flags. Even better, you can help document the code better
  yourself!

Please try to follow these conventions to make everyone's lives easier.

Among the flags that are being moved to the new bbt_options field
throughout various drivers, etc. are:
 * NAND_BBT_SCANLASTPAGE
 * NAND_BBT_SCAN2NDPAGE
and there will be more to come.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:56 +03:00
Brian Norris
a0dc552951 mtd: nand: remove NAND_BBT_SCANBYTE1AND6 option
This patch reverts most of:
    commit 58373ff0af
    mtd: nand: more BB Detection refactoring and dynamic scan options

According to the discussion at:
    http://lists.infradead.org/pipermail/linux-mtd/2011-May/035696.html
the NAND_BBT_SCANBYTE1AND6 flag, although technically valid, can break
some existing ECC layouts that use the 6th byte in the OOB for ECC data.
Furthermore, we apparently do not need to scan both bytes 1 and 6 in
the OOB region of the devices under consideration; instead, we only need
to scan one or the other.

Thus, the NAND_BBT_SCANBYTE1AND6 flag is at best unnecessary and at
worst a regression.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-09-11 15:01:46 +03:00
Zhiwu Song
684f741446 ARM: CSR: add rtc i/o bridge interface for SiRFprimaII
The module is a bridge between the RTC clock domain and the CPU interface
clock domain. ARM access the register of SYSRTC, GPSRTC and PWRC through
this module.

Signed-off-by: Zhiwu Song <zhiwu.song@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2011-09-11 09:17:53 +08:00
James Morris
5dbe3040c7 security: sparse fix: Move security_fixup_op to security.h
Fix sparse warning by moving declaraion to global header.

Signed-off-by: James Morris <jmorris@namei.org>
2011-09-09 16:56:33 -07:00
rongqing.li@windriver.com
fc9ff9b7e3 security: Fix a typo
Fix a typo.

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-09-09 16:40:31 -07:00
Mark Brown
4ae7335dae Merge branch 'topic/interface' of git://opensource.wolfsonmicro.com/regmap into for-3.2 2011-09-09 11:11:24 -07:00
Mark Brown
069af897f9 regmap: Provide device read and write map interface for merging
Add the externally visible interface introduced by Lars-Peter's commit
6f3064 (regmap: Add support for device specific write and read flag
masks) separately in order to allow merge into other subsystems for
integration with drivers.  Drivers relying on this feature will not be
functional until they are merged with the implementation.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-09 11:05:46 -07:00
Felipe Balbi
9962444f59 usb: dwc3: omap: distinguish between SW and HW modes
The OMAP wrapper allows us to either control internal
OTG signals via SW or HW. Different boards might wish
to use one or the other mode of operation. Let's have
have that information passed via platform_data for now.

After DT conversion is finished for OMAP, we can easily
convert this to a DT attribute.

Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-09-09 13:02:45 +03:00
Randy Dunlap
bff747c58c regulator: fix kernel-doc warning in consumer.h
Fix kernel-doc warning about internal/private data by marking it
as "private:" so that kernel-doc will ignore it.

  Warning(include/linux/regulator/consumer.h:128): No description found for parameter 'ret'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-08 14:43:03 -07:00
Peter Zijlstra
e8abccb719 posix-cpu-timers: Cure SMP accounting oddities
David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$

  The diff of 'process' should always be >= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <fcntl.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
	  pthread_barrier_wait(&barrier);
	  while (1)
		  __asm__ __volatile__("" : : : "memory");
	  return NULL;
  }

  int main(void)
  {
	  clockid_t process_clock, my_thread_clock, th_clock;
	  struct timespec process_before, process_after;
	  struct timespec me_before, me_after;
	  struct timespec th_before, th_after;
	  struct timespec sleeptime;
	  unsigned long diff;
	  pthread_t th;
	  int err;

	  err = clock_getcpuclockid(0, &process_clock);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
	  if (err)
		  return 1;

	  pthread_barrier_init(&barrier, NULL, 2);
	  err = pthread_create(&th, NULL, chew_cpu, NULL);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(th, &th_clock);
	  if (err)
		  return 1;

	  pthread_barrier_wait(&barrier);

	  err = clock_gettime(process_clock, &process_before);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_before);
	  if (err)
		  return 1;

	  err = clock_gettime(th_clock, &th_before);
	  if (err)
		  return 1;

	  sleeptime.tv_sec = 0;
	  sleeptime.tv_nsec = 500000000;
	  nanosleep(&sleeptime, NULL);

	  err = clock_gettime(th_clock, &th_after);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_after);
	  if (err)
		  return 1;

	  err = clock_gettime(process_clock, &process_after);
	  if (err)
		  return 1;

	  diff = process_after.tv_nsec - process_before.tv_nsec;
	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 process_before.tv_sec, process_before.tv_nsec,
		 process_after.tv_sec, process_after.tv_nsec, diff);
	  diff = th_after.tv_nsec - th_before.tv_nsec;
	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 th_before.tv_sec, th_before.tv_nsec,
		 th_after.tv_sec, th_after.tv_nsec, diff);
	  diff = me_after.tv_nsec - me_before.tv_nsec;
	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 me_before.tv_sec, me_before.tv_nsec,
		 me_after.tv_sec, me_after.tv_nsec, diff);

	  return 0;
  }

This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 15:25:52 +02:00
Martin Schwidefsky
65516f8a7c clockevents: Add direct ktime programming function
There is at least one architecture (s390) with a sane clockevent device
that can be programmed with the equivalent of a ktime. No need to create
a delta against the current time, the ktime can be used directly.

A new clock device function 'set_next_ktime' is introduced that is called
with the unmodified ktime for the timer if the clock event device has the 
CLOCK_EVT_FEAT_KTIME bit set.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:56 +02:00
Martin Schwidefsky
d1748302f7 clockevents: Make minimum delay adjustments configurable
The automatic increase of the min_delta_ns of a clockevents device
should be done in the clockevents code as the minimum delay is an
attribute of the clockevents device.

In addition not all architectures want the automatic adjustment, on a
massively virtualized system it can happen that the programming of a
clock event fails several times in a row because the virtual cpu has
been rescheduled quickly enough. In that case the minimum delay will
erroneously be increased with no way back. The new config symbol
GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
adjustment. The config option is selected only for x86.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:56 +02:00
Dmitry Torokhov
7e66eaf14e Merge commit 'v3.1-rc4' into next 2011-09-07 14:18:36 -07:00
Mark Brown
8efcc57ded mfd: Fix value of WM8994_CONFIGURE_GPIO
This needs to be an out of band value for the register and on this device
registers are 16 bit so we must shift left one to the 17th bit.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-09-06 16:37:58 +02:00
Lars-Peter Clausen
6f306441e9 regmap: Add support for device specific write and read flag masks.
Some buses like SPI have no standard notation of read or write operations.
The general scheme here is to set or clear specific bits in the register
address to indicate whether the operation is a read or write. We already
support having a read flag mask per bus, but as there is no standard
the bits which need to be set or cleared differ between devices and vendors,
thus we need a mechanism to specify them per device.

This patch adds two new entries to the regmap_config struct, read_flag_mask and
write_flag_mask. These will be or'ed onto the top byte when doing a read or
write operation. If both masks are empty the device will fallback to the
regmap_bus masks.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-05 14:55:57 -07:00
Mark Brown
5b457e3910 regmap: Remove redundant owner field from the bus type struct
No longer used as users link directly with the bus types so the core
module infrastructure does refcounting for us.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-09-05 10:57:04 -07:00
Dan Carpenter
d2159fb7b8 jbd2: use gfp_t instead of int
This silences some Sparse warnings:
fs/jbd2/transaction.c:135:69: warning: incorrect type in argument 2 (different base types)
fs/jbd2/transaction.c:135:69:    expected restricted gfp_t [usertype] flags
fs/jbd2/transaction.c:135:69:    got int [signed] gfp_mask

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-04 10:20:14 -04:00
Andreas Oberritter
c0f856d3f0 [media] DVB: increment minor version after addition of SYS_TURBO
Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-09-03 19:07:21 -03:00
Andreas Oberritter
83dc314bea [media] DVB: Add SYS_TURBO for north american turbo code FEC
Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-09-03 18:07:14 -03:00
Vinod Koul
8516f52fa4 Merge branch 'next' into v3.1-rc4
Fixed trivial conflicts  in  drivers/dma/amba-pl08x.c

Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-02 16:43:44 +05:30
Theodore Ts'o
1cd9f0976a ext2,ext3,ext4: don't inherit APPEND_FL or IMMUTABLE_FL for new inodes
This doesn't make much sense, and it exposes a bug in the kernel where
attempts to create a new file in an append-only directory using
O_CREAT will fail (but still leave a zero-length file).  This was
discovered when xfstests #79 was generalized so it could run on all
file systems.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc:stable@kernel.org
2011-08-31 11:54:51 -04:00
J. Bruce Fields
c152292f9e nfsd: remove include/linux/nfsd/syscall.h
We don't need this any more.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 11:50:11 -04:00
Vaibhav Nagarnaik
c64e148a3b trace: Add ring buffer stats to measure rate of events
The stats file under per_cpu folder provides the number of entries,
overruns and other statistics about the CPU ring buffer. However, the
numbers do not provide any indication of how full the ring buffer is in
bytes compared to the overall size in bytes. Also, it is helpful to know
the rate at which the cpu buffer is filling up.

This patch adds an entry "bytes: " in printed stats for per_cpu ring
buffer which provides the actual bytes consumed in the ring buffer. This
field includes the number of bytes used by recorded events and the
padding bytes added when moving the tail pointer to next page.

It also adds the following time stamps:
"oldest event ts:" - the oldest timestamp in the ring buffer
"now ts:"  - the timestamp at the time of reading

The field "now ts" provides a consistent time snapshot to the userspace
when being read. This is read from the same trace clock used by tracing
event timestamps.

Together, these values provide the rate at which the buffer is filling
up, from the formula:
bytes / (now_ts - oldest_event_ts)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: David Sharp <dhsharp@google.com>
Link: http://lkml.kernel.org/r/1313531179-9323-3-git-send-email-vnagarnaik@google.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-08-30 12:27:45 -04:00
John W. Linville
ba6e5eb107 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-08-29 14:52:20 -04:00
Greg Kroah-Hartman
6ed962a208 Merge 3.1-rc4 into usb-next
This was done to resolve a conflict in this file:
	drivers/usb/host/xhci-ring.c

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-29 08:56:17 -07:00
Greg Kroah-Hartman
6eafa4604c Merge 3.1-rc4 into staging-next
This resolves a conflict with:
	drivers/staging/brcm80211/brcmsmac/types.h

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-29 08:47:46 -07:00
Sakari Ailus
69d232ae8e [media] omap3isp: ccdc: Use generic frame sync event instead of private HS_VS event
The ccdc block in the omap3isp produces events whenever it starts receiving
a new frame. A private HS_VS event was used for this previously. Now, the
generic V4L2_EVENT_FRAME_SYNC event is being used for the purpose.

This patch also provides the frame sequence number to user space.

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-08-29 12:38:51 -03:00
Sakari Ailus
eab00a0da2 [media] v4l: events: Define V4L2_EVENT_FRAME_SYNC
Define a frame sync event to tell user space when the reception of a frame
starts.

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-08-29 12:37:15 -03:00
Stephane Eranian
a8d757ef07 perf events: Fix slow and broken cgroup context switch code
The current cgroup context switch code was incorrect leading
to bogus counts. Furthermore, as soon as there was an active
cgroup event on a CPU, the context switch cost on that CPU
would increase by a significant amount as demonstrated by a
simple ping/pong example:

 $ ./pong
 Both processes pinned to CPU1, running for 10s
 10684.51 ctxsw/s

Now start a cgroup perf stat:
 $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 100

$ ./pong
 Both processes pinned to CPU1, running for 10s
 6674.61 ctxsw/s

That's a 37% penalty.

Note that pong is not even in the monitored cgroup.

The results shown by perf stat are bogus:
 $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 100

 Performance counter stats for 'sleep 100':

 CPU1 <not counted> cycles   test
 CPU1 16,984,189,138 cycles  #    0.000 GHz

The second 'cycles' event should report a count @ CPU clock
(here 2.4GHz) as it is counting across all cgroups.

The patch below fixes the bogus accounting and bypasses any
cgroup switches in case the outgoing and incoming tasks are
in the same cgroup.

With this patch the same test now yields:
 $ ./pong
 Both processes pinned to CPU1, running for 10s
 10775.30 ctxsw/s

Start perf stat with cgroup:

 $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10

Run pong outside the cgroup:
 $ /pong
 Both processes pinned to CPU1, running for 10s
 10687.80 ctxsw/s

The penalty is now less than 2%.

And the results for perf stat are correct:

$ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10

 Performance counter stats for 'sleep 10':

 CPU1 <not counted> cycles test #    0.000 GHz
 CPU1 23,933,981,448 cycles      #    0.000 GHz

Now perf stat reports the correct counts for
for the non cgroup event.

If we run pong inside the cgroup, then we also get the
correct counts:

$ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10

 Performance counter stats for 'sleep 10':

 CPU1 22,297,726,205 cycles test #    0.000 GHz
 CPU1 23,933,981,448 cycles      #    0.000 GHz

      10.001457237 seconds time elapsed

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110825135803.GA4697@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-08-29 12:28:33 +02:00
Nao Nishijima
a72c5e5eb7 [SCSI] genhd: add a new attribute "alias" in gendisk
This patch allows the user to set an "alias" of the disk via sysfs interface.

This patch only adds a new attribute "alias" in gendisk structure.
To show the alias instead of the device name in kernel messages,
we need to revise printk messages and use alias_name() in them.

Example:
(current) printk("disk name is %s\n", disk->disk_name);
(new)     printk("disk name is %s\n", alias_name(disk));

Users can use alphabets, numbers, '-' and '_' in "alias" attribute. A disk can
have an "alias" which length is up to 255 bytes. This attribute is write-once.

Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Suggested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Nao Nishijima <nao.nishijima.xt@hitachi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-29 00:16:19 -07:00
Bhanu Prakash Gollapudi
3c9c36bced net: Define NETDEV_FCOE_WWNN, NETDEV_FCOE_WWPN only when CONFIG_LIBFCOE is enabled
bnx2fc driver calls netdev->netdev_ops->ndo_fcoe_get_wwn() and it may not
be defined with the current Kconfig dependencies.  ndo_fcoe_get_wwn is
dependent on CONFIG_FCOE, but bnx2fc does not select CONFIG_FCOE, as it does
not depend on fcoe driver. Since both fcoe and bnx2fc drivers select
CONFIG_LIBFCOE, define NETDEV_FCOE_WWNN and NETDEV_FCOE_WWPN when
CONFIG_LIBFCOE is defined.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Yi Zou <yi.zou@intel.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-28 17:08:28 -04:00
Xin Xie
500c524aad regulator: tps6586x: add SMx slew rate setting
Add output vlotage slew rate setting for SM0/SM1

Signed-off-by: Xin Xie <xxie@nvidia.com>
Signed-off-by: Danny Huang <dahuang@nvidia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-08-28 17:41:28 +01:00
J. Bruce Fields
a9004abc34 nfsd4: cleanup and consolidate seqid_mutating_err
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:26 -04:00
J. Bruce Fields
c10bd39d80 Remove include/linux/nfsd/const.h
Userspace shouldn't have a use for these constants.  Nothing here is
used outside fs/nfsd.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-26 18:22:52 -04:00
J. Bruce Fields
8cfb791340 nfsd: remove unused defines
At least one of these is actually wrong anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-26 18:22:51 -04:00
NeilBrown
f5b9409973 All Arch: remove linkage for sys_nfsservctl system call
The nfsservctl system call is now gone, so we should remove all
linkage for it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-26 15:09:58 -07:00
Linus Torvalds
efe45ab1ee Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  omap-serial: Allow IXON and IXOFF to be disabled.
  TTY: serial, document ignoring of uart->ops->startup error
  TTY: pty, fix pty counting
  8250: Fix race condition in serial8250_backup_timeout().
  serial/8250_pci: delete duplicate data definition
  8250_pci: add support for Rosewill RC-305 4x serial port card
  tty: Add "spi:" prefix for spi modalias
  atmel_serial: fix atmel_default_console_device
  serial: 8250_pnp: add Intermec CV60 touchscreen device
  drivers/serial/ucc_uart.c: Fix compiler warning
  pch_uart: Set PCIe bus number using probe parameter
  serial: samsung: Fix build error
2011-08-26 13:06:06 -07:00
Linus Torvalds
3ab47029d9 Merge branch 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  drivers:misc: ti-st: fix unexpected UART close
  drivers:misc: ti-st: free skb on firmware download
  drivers:misc: ti-st: wait for completion at fail
  drivers:misc: ti-st: reinit completion before send
  drivers:misc: ti-st: fail-safe on wrong pkt type
  drivers:misc: ti-st: reinit completion on ver read
  drivers:misc:ti-st: platform hooks for chip states
  drivers:misc: ti-st: avoid a misleading dbg msg
  base/devres.c: quiet sparse noise about context imbalance
  pti: add missing CONFIG_PCI dependency
  drivers/base/devtmpfs.c: correct annotation of `setup_done'
  driver core: fix kernel-doc warning in platform.c
  firmware: fix google/gsmi.c build warning
2011-08-26 13:05:09 -07:00
Uwe Kleine-König
01dcc60a7c new helper to create platform devices with dma mask
compared to the most powerful and already existing helper (namely
platform_device_register_resndata) this allows to specify a dma_mask.
To make eventual extensions later more easy, a struct holding the used
information is created instead of passing the information by function
parameters.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-26 11:31:09 -07:00
chetan loke
bc59ba3991 af_packet: Prefixed tpacket_v3 structs to avoid name space collision
structs introduced in tpacket_v3 implementation are prefixed with 'tpacket'
to avoid namespace collision.

Compile tested.

Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:38:44 -04:00
Ben Hutchings
767df1c717 headers, can: Add missing #include to <linux/can/bcm.h>
<linux/can/bcm.h> uses type canid_t, defined in <linux/can.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:51 -04:00
Ben Hutchings
5740bb5693 headers, xtables: Add missing #include <linux/netfilter.h>
Various headers use union nf_inet_addr, defined in <linux/netfilter.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
598aaff2ee headers, netfilter: Add missing #include <limits.h> for userland
Various headers use INT_MIN and INT_MAX, which are defined for
userland in <limits.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
3828620bc0 headers, tipc: Add missing #include to <linux/tipc_config.h> for userland
<linux/tipc_config.h> defines inline functions using ntohs() etc.
For userland these are defined in <arpa/inet.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
7ff30c43f3 headers, netfilter: Use kernel type names __u8, __u16, __u32
These types are guaranteed to be defined by <linux/types.h> for
both userland and kernel, unlike u_intN_t.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
bcb949b884 headers, net: Use __kernel_sa_family_t in more definitions shared with userland
Complete the work started with commit
6602a4baf4 ('net: Make userland include
of netlink.h more sane').

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
d4b172d2ec headers, pppol2tp: Use __kernel_pid_t in <linux/pppol2tp.h>
<linux/types.h> defines __kernel_pid_t for userland; pid_t is
defined elsewhere (and potentially differently).

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
c2e0cd8869 headers, ax25: Add missing #include to <linux/netrom.h>, <linux/rose.h>
These headers use the ax25_address type defined in <linux/ax25.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Ben Hutchings
22ad72b027 headers, pppox: Add missing #include to <linux/if_pppox.h>
<linux/if_ppox.h> uses ETH_ALEN, defined in <linux/if_ether.h>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:02:50 -04:00
Eliad Peller
c75786c9ef nl80211/cfg80211: add STA WME parameters
Add new NL80211_ATTR_STA_WME nested attribute that contains
wme params needed by the low-level driver (uapsd_queues and
max_sp).

Add these params to the station_parameters struct as well.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:56 -04:00
Dilan Lee
cc7993f643 backlight: add a callback 'notify_after' for backlight control
We need a callback to do some things after pwm_enable, pwm_disable
and pwm_config.

Signed-off-by: Dilan Lee <dilee@nvidia.com>
Reviewed-by: Robert Morell <rmorell@nvidia.com>
Reviewed-by: Arun Murthy <arun.murthy@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25 16:25:34 -07:00
Alexandre Bounine
284fb68d00 rapidio: fix use of non-compatible registers
Replace/remove use of RIO v.1.2 registers/bits that are not
forward-compatible with newer versions of RapidIO specification.

RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.

Use of removed (since RIO v.1.3) register bits affects users of
currently available 1.3 and 2.x compliant devices who may use not so
recent kernel versions.

Removing checks for unsupported bits makes corresponding routines
compatible with all versions of RapidIO specification.  Therefore,
backporting makes stable kernel versions compliant with RIO v.1.3 and
later as well.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Chul Kim <chul.kim@idt.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25 16:25:34 -07:00
Evgeniy Polyakov
a801876638 MAINTAINERS: Evgeniy has moved
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25 16:25:33 -07:00
Josh Boyer
e096d0c7e2 lockdep: Add helper function for dir vs file i_mutex annotation
Purely in-memory filesystems do not use the inode hash as the dcache
tells us if an entry already exists.  As a result, they do not call
unlock_new_inode, and thus directory inodes do not get put into a
different lockdep class for i_sem.

We need the different lockdep classes, because the locking order for
i_mutex is different for directory inodes and regular inodes.  Directory
inodes can do "readdir()", which takes i_mutex *before* possibly taking
mm->mmap_sem (due to a page fault while copying the directory entry to
user space).

In contrast, regular inodes can be mmap'ed, which takes mm->mmap_sem
before accessing i_mutex.

The two cases can never happen for the same inode, so no real deadlock
can occur, but without the different lockdep classes, lockdep cannot
understand that.  As a result, if CONFIG_DEBUG_LOCK_ALLOC is set, this
can lead to false positives from lockdep like below:

    find/645 is trying to acquire lock:
     (&mm->mmap_sem){++++++}, at: [<ffffffff81109514>] might_fault+0x5c/0xac

    but task is already holding lock:
     (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff81149f34>]
    vfs_readdir+0x5b/0xb4

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&sb->s_type->i_mutex_key#15){+.+.+.}:
          [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
          [<ffffffff814db822>] __mutex_lock_common+0x4c/0x361
          [<ffffffff814dbc46>] mutex_lock_nested+0x40/0x45
          [<ffffffff811daa87>] hugetlbfs_file_mmap+0x82/0x110
          [<ffffffff81111557>] mmap_region+0x258/0x432
          [<ffffffff811119dd>] do_mmap_pgoff+0x2ac/0x306
          [<ffffffff81111b4f>] sys_mmap_pgoff+0x118/0x16a
          [<ffffffff8100c858>] sys_mmap+0x22/0x24
          [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b

    -> #0 (&mm->mmap_sem){++++++}:
          [<ffffffff8108a4bc>] __lock_acquire+0xa1a/0xcf7
          [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
          [<ffffffff81109541>] might_fault+0x89/0xac
          [<ffffffff81149cff>] filldir+0x6f/0xc7
          [<ffffffff811586ea>] dcache_readdir+0x67/0x205
          [<ffffffff81149f54>] vfs_readdir+0x7b/0xb4
          [<ffffffff8114a073>] sys_getdents+0x7e/0xd1
          [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b

This patch moves the directory vs file lockdep annotation into a helper
function that can be called by in-memory filesystems and has hugetlbfs
call it.

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25 10:50:18 -07:00
Linus Torvalds
e33f2d238e Merge branch 'urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback
* 'urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback:
  squeeze max-pause area and drop pass-good area
2011-08-25 10:40:12 -07:00
Greg Kroah-Hartman
a91befc116 Staging: hv: add driver_data to hv_vmbus_device_id
This is going to be needed by some drivers that handle more than one
device, like all other bus types do, so prepare for that in advance
before the user/kernel api is used.

Cc:  K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-25 10:25:37 -07:00
K. Y. Srinivasan
17be18c2b5 Staging: hv: Add struct hv_vmbus_device_id to mod_devicetable.h
In preparation for implementing vmbus aliases for auto-loading
Hyper-V drivers, define vmbus specific device ID.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-25 10:23:07 -07:00
Andi Kleen
be27425dcc Add a personality to report 2.6.x version numbers
I ran into a couple of programs which broke with the new Linux 3.0
version.  Some of those were binary only.  I tried to use LD_PRELOAD to
work around it, but it was quite difficult and in one case impossible
because of a mix of 32bit and 64bit executables.

For example, all kind of management software from HP doesnt work, unless
we pretend to run a 2.6 kernel.

  $ uname -a
  Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux

  $ hpacucli ctrl all show

  Error: No controllers detected.

  $ rpm -qf /usr/sbin/hpacucli
  hpacucli-8.75-12.0

Another notable case is that Python now reports "linux3" from
sys.platform(); which in turn can break things that were checking
sys.platform() == "linux2":

  https://bugzilla.mozilla.org/show_bug.cgi?id=664564

It seems pretty clear to me though it's a bug in the apps that are using
'==' instead of .startswith(), but this allows us to unbreak broken
programs.

This patch adds a UNAME26 personality that makes the kernel report a
2.6.40+x version number instead.  The x is the x in 3.x.

I know this is somewhat ugly, but I didn't find a better workaround, and
compatibility to existing programs is important.

Some programs also read /proc/sys/kernel/osrelease.  This can be worked
around in user space with mount --bind (and a mount namespace)

To use:

  wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
  gcc -o uname26 uname26.c
  ./uname26 program

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25 10:17:28 -07:00
Jiri Slaby
a57a7bf3fc TTY: define tty_wait_until_sent_from_close
We need this helper to fix system stalls. The issue is that the rest
of the system TTYs wait for us to finish waiting. This wasn't an issue
with BKL. BKL used to unlock implicitly.

This is based on the Arnd suggestion.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-25 09:00:40 -07:00
Viresh Kumar
0a2356572b dmaengine/amba-pl08x: Pass flow controller information with slave channel data
At least, on SPEAr platforms there is one peripheral, JPEG, which can be flow
controller for DMA transfer. Currently DMA controller driver didn't support
peripheral flow controller configurations.

This patch adds device_fc field in struct pl08x_channel_data, which will be used
only for slave transfers and is not used in case of mem2mem transfers.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-08-25 19:33:39 +05:30
Viresh Kumar
16a2e7d359 dmaengine/amba-pl08x: Get rid of pl08x_pre_boundary()
Pl080 Manual says: "Bursts do not cross the 1KB address boundary"

We can program the controller to cross 1 KB boundary on a burst and controller
can take care of this boundary condition by itself.

Following is the discussion with ARM Technical Support Guys (David):
[Viresh] Manual says: "Bursts do not cross the 1KB address boundary"

What does that actually mean? As, Maximum size transferable with a single LLI is
4095 * 4 =16380 ~ 16KB. So, if we don't have src/dest address aligned to burst
size, we can't use this big of an LLI.

[David] There is a difference between bursts describing the total data
transferred by the DMA controller and AHB bursts. Bursts described by the
programmable parameters in the PL080 have no direct connection with the bursts
that are seen on the AHB bus.

The statement that "Bursts do not cross the 1KB address boundary" in the TRM is
referring to AHB bursts, where this limitation is a requirement of the AHB spec.
You can still issue bursts within the PL080 that are in excess of 1KB. The
PL080 will make sure that its bursts are broken down into legal AHB bursts which
will be formatted to ensure that no AHB burst crosses a 1KB boundary.

Based on above discussion, this patch removes all code related to 1 KB boundary
as we are not required to handle this in driver.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-08-25 19:33:38 +05:30
Viresh Kumar
5a61233073 dmaengine/amba-pl08x: Complete doc comment for struct pl08x_txd
Doc comment for struct pl08x_txd was incomplete. Complete that.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-08-25 19:33:37 +05:30
Rafael J. Wysocki
0aa2a22169 PM / Domains: Preliminary support for devices with power.irq_safe set
The generic PM domains framework currently doesn't work with devices
whose power.irq_safe flag is set, because runtime PM callbacks for
such devices are run with interrupts disabled and the callbacks
provided by the generic PM domains framework use domain mutexes
and may sleep.  However, such devices very well may belong to
power domains on some systems, so the generic PM domains framework
should take them into account.

For this reason, modify the generic PM domains framework so that the
domain .power_off() and .power_on() callbacks are never executed for
a domain containing devices with power.irq_safe set, although the
.stop_device() and .start_device() callbacks are still run for them.

Additionally, introduce a flag allowing the creator of a
struct generic_pm_domain object to indicate that its .stop_device()
and .start_device() callbacks may be run in interrupt context
(might_sleep_if() triggers if that flag is not set and one of those
callbacks is run in interrupt context).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:37:04 +02:00
Jean Pihet
b66213cdb0 PM QoS: Add global notification mechanism for device constraints
Add a global notification chain that gets called upon changes to the
aggregated constraint value for any device.
The notification callbacks are passing the full constraint request data
in order for the callees to have access to it. The current use is for the
platform low-level code to access the target device of the constraint.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:47 +02:00
Jean Pihet
91ff4cb803 PM QoS: Implement per-device PM QoS constraints
Implement the per-device PM QoS constraints by creating a device
PM QoS API, which calls the PM QoS constraints management core code.

The per-device latency constraints data strctures are stored
in the device dev_pm_info struct.

The device PM code calls the init and destroy of the per-device constraints
data struct in order to support the dynamic insertion and removal of the
devices in the system.

To minimize the data usage by the per-device constraints, the data struct
is only allocated at the first call to dev_pm_qos_add_request.
The data is later free'd when the device is removed from the system.
A global mutex protects the constraints users from the data being
allocated and free'd.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:41 +02:00
Jean Pihet
abe98ec2d8 PM QoS: Generalize and export constraints management code
In preparation for the per-device constratins support:
 - rename update_target to pm_qos_update_target
 - generalize and export pm_qos_update_target for usage by the upcoming
   per-device latency constraints framework:
   * operate on struct pm_qos_constraints for constraints management,
   * introduce an 'action' parameter for constraints add/update/remove,
   * the return value indicates if the aggregated constraint value has
     changed,
 - update the internal code to operate on struct pm_qos_constraints
 - add a NULL pointer check in the API functions

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:34 +02:00
Jean Pihet
4e1779baaa PM QoS: Reorganize data structs
In preparation for the per-device constratins support, re-organize
the data strctures:
 - add a struct pm_qos_constraints which contains the constraints
 related data
 - update struct pm_qos_object contents to the PM QoS internal object
 data. Add a pointer to struct pm_qos_constraints
 - update the internal code to use the new data structs.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:27 +02:00
Jean Pihet
cc74998618 PM QoS: Minor clean-ups
- Misc fixes to improve code readability:
  * rename struct pm_qos_request_list to struct pm_qos_request,
  * rename pm_qos_req parameter to req in internal code,
    consistenly use req in the API parameters,
  * update the in-kernel API callers to the new parameters names,
  * rename of fields names (requests, list, node, constraints)

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:12 +02:00
Jean Pihet
e8db0be124 PM QoS: Move and rename the implementation files
The PM QoS implementation files are better named
kernel/power/qos.c and include/linux/pm_qos.h.

The PM QoS support is compiled under the CONFIG_PM option.

Signed-off-by: Jean Pihet <j-pihet@ti.com>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:35:03 +02:00
Rafael J. Wysocki
b5e8d269d8 PM: Move clock-related definitions and headers to separate file
Since the PM clock management code in drivers/base/power/clock_ops.c
is used for both runtime PM and system suspend/hibernation, the
definitions of data structures and headers related to it should not
be located in include/linux/pm_rumtime.h.  Move them to a separate
header file.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:34:19 +02:00
Rafael J. Wysocki
4605ab653c PM / Domains: Use power.sybsys_data to reduce overhead
Currently pm_genpd_runtime_resume() has to walk the list of devices
from the device's PM domain to find the corresponding device list
object containing the need_restore field to check if the driver's
.runtime_resume() callback should be executed for the device.
This is suboptimal and can be simplified by using power.sybsys_data
to store device information used by the generic PM domains code.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:34:12 +02:00
Rafael J. Wysocki
ef27bed187 PM: Reference counting of power.subsys_data
Since the power.subsys_data device field will be used by multiple
filesystems, introduce a reference counting mechanism for it to avoid
freeing it prematurely or changing its value at a wrong time.

Make the PM clocks management code that currently is the only user of
power.subsys_data use the new reference counting.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:34:01 +02:00
Rafael J. Wysocki
5c095a0e0d PM: Introduce struct pm_subsys_data
Introduce struct pm_subsys_data that may be subclassed by subsystems
to store subsystem-specific information related to the device.  Move
the clock management fields accessed through the power.subsys_data
pointer in struct device to the new strucutre.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:33:50 +02:00
Rafael J. Wysocki
17877eb5a9 PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER
Since it is now possible for a PM domain to have multiple masters
instead of one parent, rename the "wait for parent" status to reflect
the new situation.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:33:45 +02:00
Rafael J. Wysocki
5063ce1571 PM / Domains: Allow generic PM domains to have multiple masters
Currently, for a given generic PM domain there may be only one parent
domain (i.e. a PM domain it depends on).  However, there is at least
one real-life case in which there should be two parents (masters) for
one PM domain (the A3RV domain on SH7372 turns out to depend on the
A4LC domain and it depends on the A4R domain and the same time). For
this reason, allow a PM domain to have multiple parents (masters) by
introducing objects representing links between PM domains.

The (logical) links between PM domains represent relationships in
which one domain is a master (i.e. it is depended on) and another
domain is a slave (i.e. it depends on the master) with the rule that
the slave cannot be powered on if the master is not powered on and
the master cannot be powered off if the slave is not powered off.
Each struct generic_pm_domain object representing a PM domain has
two lists of links, a list of links in which it is a master and
a list of links in which it is a slave.  The first of these lists
replaces the list of subdomains and the second one is used in place
of the parent pointer.

Each link is represented by struct gpd_link object containing
pointers to the master and the slave and two struct list_head
members allowing it to hook into two lists (the master's list
of "master" links and the slave's list of "slave" links).  This
allows the code to get to the link from each side (either from
the master or from the slave) and follow it in each direction.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:33:44 +02:00
Rafael J. Wysocki
3f241775c3 PM / Domains: Add "wait for parent" status for generic PM domains
The next patch will make it possible for a generic PM domain to have
multiple parents (i.e. multiple PM domains it depends on).  To
prepare for that change it is necessary to change pm_genpd_poweron()
so that it doesn't jump to the start label after running itself
recursively for the parent domain.  For this purpose, introduce a new
PM domain status value GPD_STATE_WAIT_PARENT that will be set by
pm_genpd_poweron() before calling itself recursively for the parent
domain and modify the code in drivers/base/power/domain.c so that
the GPD_STATE_WAIT_PARENT status is guaranteed to be preserved during
the execution of pm_genpd_poweron() for the parent.

This change also causes pm_genpd_add_subdomain() and
pm_genpd_remove_subdomain() to wait for started pm_genpd_poweron() to
complete and allows pm_genpd_runtime_resume() to avoid dropping the
lock after powering on the PM domain.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:33:44 +02:00
Rafael J. Wysocki
c4bb3160c8 PM / Domains: Implement subdomain counters as atomic fields
Currently, pm_genpd_poweron() and pm_genpd_poweroff() need to take
the parent PM domain's lock in order to modify the parent's counter
of active subdomains in a nonracy way.  This causes the locking to be
considerably complex and in fact is not necessary, because the
subdomain counters may be implemented as atomic fields and they
won't have to be modified under a lock.

Replace the unsigned in sd_count field in struct generic_pm_domain
by an atomic_t one and modify the code in drivers/base/power/domain.c
to take this change into account.

This patch doesn't change the locking yet, that is going to be done
in a separate subsequent patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-25 15:33:43 +02:00
Nandita Dukkipati
a262f0cdf1 Proportional Rate Reduction for TCP.
This patch implements Proportional Rate Reduction (PRR) for TCP.
PRR is an algorithm that determines TCP's sending rate in fast
recovery. PRR avoids excessive window reductions and aims for
the actual congestion window size at the end of recovery to be as
close as possible to the window determined by the congestion control
algorithm. PRR also improves accuracy of the amount of data sent
during loss recovery.

The patch implements the recommended flavor of PRR called PRR-SSRB
(Proportional rate reduction with slow start reduction bound) and
replaces the existing rate halving algorithm. PRR improves upon the
existing Linux fast recovery under a number of conditions including:
  1) burst losses where the losses implicitly reduce the amount of
outstanding data (pipe) below the ssthresh value selected by the
congestion control algorithm and,
  2) losses near the end of short flows where application runs out of
data to send.

As an example, with the existing rate halving implementation a single
loss event can cause a connection carrying short Web transactions to
go into the slow start mode after the recovery. This is because during
recovery Linux pulls the congestion window down to packets_in_flight+1
on every ACK. A short Web response often runs out of new data to send
and its pipe reduces to zero by the end of recovery when all its packets
are drained from the network. Subsequent HTTP responses using the same
connection will have to slow start to raise cwnd to ssthresh. PRR on
the other hand aims for the cwnd to be as close as possible to ssthresh
by the end of recovery.

A description of PRR and a discussion of its performance can be found at
the following links:
- IETF Draft:
    http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
- IETF Slides:
    http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
    http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
- Paper to appear in Internet Measurements Conference (IMC) 2011:
    Improving TCP Loss Recovery
    Nandita Dukkipati, Matt Mathis, Yuchung Cheng

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:40:40 -07:00
chetan loke
0d4691ce11 af-packet: Added TPACKET_V3 headers.
Added TPACKET_V3 definitions.

Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:40:39 -07:00
Ian Campbell
ea2ab69379 net: convert core to skb paged frag APIs
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:52:11 -07:00
Bernhard Roth
83cac9f3b4 atmel_serial: RS485: receiving enabled when sending data
By default the atmel_serial driver in RS485 mode disables receiving data until
all data in the send buffer has been sent. This flag allows to receive data
even whilst sending data.

Signed-off-by: Bernhard Roth <br@pwrnet.de>
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-24 15:27:59 -07:00
Greg Kroah-Hartman
9a234349de Revert "tty: serial8250: add helpers for the DesignWare 8250"
This reverts commit 6b1a98d1c4.

It causes a build error that needs to be resolved differently.

Cc: Alan Cox <alan@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-24 15:25:49 -07:00
Samuel Ortiz
e8753043f9 NFC: Reserve tx head and tail room
We can have the NFC core layer allocating the tx head and tail
room for the drivers and avoid 1 or more SKBs copy on write on
the Tx path.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 14:41:44 -04:00
Javier Cardona
16dd7267f4 {nl,cfg,mac}80211: let userspace make meshif mesh gate
Allow userspace to set NL80211_MESHCONF_GATE_ANNOUNCEMENTS attribute,
which will advertise this mesh node as being a mesh gate.
NL80211_HWMP_ROOTMODE must be set or this will do nothing.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:43 -04:00
Javier Cardona
0507e159a2 {nl,cfg,mac}80211: let userspace set RANN interval
Allow userspace to set Root Announcement Interval for our mesh
interface. Also, RANN interval is now in proper units of TUs.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:43 -04:00
Javier Cardona
5ee68e5b39 mac80211: mesh gate implementation
In this implementation, a mesh gate is a root node with a certain bit
set in its RANN flags. The mpath to this root node is marked as a path
to a gate, and added to our list of known gates for this if_mesh. Once a
path discovery process fails, we forward the unresolved frames to a
known gate. Thanks to Luis Rodriguez for refactoring and bug fix help.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:42 -04:00
Linus Torvalds
051732bcbe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: check size of FUSE_NOTIFY_INVAL_ENTRY message
  fuse: mark pages accessed when written to
  fuse: delete dead .write_begin and .write_end aops
  fuse: fix flock
  fuse: fix non-ANSI void function notation
2011-08-24 09:14:42 -07:00
Shaohua Li
56ebdaf2fa block: simplify force plug flush code a little bit
Cleaning up the code a little bit. attempt_plug_merge() traverses the plug
list anyway, we can do the request counting there, so stack size is reduced
a little bit.
The motivation here is I suspect if we should count the requests for each
queue (task could handle multiple disks in the meantime), but my test doesn't
show it's worthy doing. If somebody proves we should do it, below change
will make that more easier.

Signed-off-by: Shaohua Li <shli@kernel.org>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-24 16:04:34 +02:00
Daniel Kurtz
d5051272fc Input: add BTN_TOOL_QUINTTAP for reporting 5 fingers on touchpad
"4-finger scroll" is a gesture supported by some applications and
operating systems.

"Resting thumb" is when a clickpad user rests a finger (e.g., a
thumb), in a "click zone" (typically the bottom of the touchpad) in
anticipation of click+move=select gestures.

Thus, "4-finger scroll + resting thumb" is a 5-finger gesture.
To allow userspace to detect this gesture, we send BTN_TOOL_QUINTTAP.

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Acked-by: Chase Douglas <chase.douglas@canonical.com>
Acked-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-08-23 23:08:29 -07:00
Kay Sievers
e03c8dd149 loop: always allow userspace partitions and optionally support automatic scanning
Automatic partition scanning can be requested individually per loop
device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
partition tables are scanned.

Userspace can now always add and remove partitions from all loop
devices, regardless if the in-kernel partition scanner is enabled or
not.

The needed partition minor numbers are allocated from the extended
minors space, the main loop device numbers will continue to match the
loop minors, regardless of the number of partitions used.

  # grep . /sys/class/block/loop1/loop/*
  /sys/block/loop1/loop/autoclear:0
  /sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
  /sys/block/loop1/loop/offset:0
  /sys/block/loop1/loop/partscan:1
  /sys/block/loop1/loop/sizelimit:0

  # ls -l /dev/loop*
  brw-rw---- 1 root disk   7,   0 Aug 14 20:22 /dev/loop0
  brw-rw---- 1 root disk   7,   1 Aug 14 20:23 /dev/loop1
  brw-rw---- 1 root disk 259,   0 Aug 14 20:23 /dev/loop1p1
  brw-rw---- 1 root disk 259,   1 Aug 14 20:23 /dev/loop1p2
  brw-rw---- 1 root disk   7,  99 Aug 14 20:23 /dev/loop99
  brw-rw---- 1 root disk 259,   2 Aug 14 20:23 /dev/loop99p1
  brw-rw---- 1 root disk 259,   3 Aug 14 20:23 /dev/loop99p2
  crw------T 1 root root  10, 237 Aug 14 20:22 /dev/loop-control

Cc: Karel Zak  <kzak@redhat.com>
Cc: Davidlohr Bueso <dave@gnu.org>
Acked-By: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23 20:12:04 +02:00
Tejun Heo
d27769ec3d block: add GENHD_FL_NO_PART_SCAN
There are cases where suppressing partition scan is useful - e.g. for
lo devices and pseudo SATA devices which advertise to be a disk but
get upset on partition scan (some port multiplier control devices show
such behavior).

This patch adds GENHD_FL_NO_PART_SCAN which suppresses partition scan
regardless of the number of possible partitions.  disk_partitionable()
is renamed to disk_part_scan_enabled() as suppressing partition scan
doesn't imply the device can't be partitioned using
BLKPG_ADD/DEL_PARTITION calls from userland.  show_partition() now
directly tests disk_max_parts() to maintain backward-compatibility.

-v2: Updated to make it clear that only partition scan is suppressed
     not partitioning itself as suggested by Kay Sievers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23 20:01:04 +02:00
Jamie Iles
6b1a98d1c4 tty: serial8250: add helpers for the DesignWare 8250
The Synopsys DesignWare 8250 is an 8250 that has an extra interrupt that
gets raised when writing to the LCR when busy.  To handle this we need
special serial_out, serial_in and handle_irq methods.  Add a new
function serial8250_use_designware_io() that configures a uart_port with
these accessors.

Cc: Alan Cox <alan@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:54:19 -07:00
Jamie Iles
4834d02897 tty: serial8250: remove UPIO_DWAPB{,32}
Now that platforms can override the port IRQ handler and the only user
of these UPIO modes has been converted over, kill off UPIO_DWAPB and
UPIO_DWAPB32.

Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:53:00 -07:00
Jamie Iles
583d28e92f tty: serial8250: allow platforms to override irq handler
Some ports (e.g. Synopsys DesignWare 8250) have special requirements for
handling the interrupts.  Allow these platforms to specify their own
interrupt handler that will override the default.
serial8250_handle_irq() is provided so that platforms can extend the IRQ
handler rather than completely replacing it.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:52:59 -07:00
Jamie Iles
a74036f512 tty: serial: allow ports to override the irq handler
Some serial ports may have unusal requirements for interrupt handling
(e.g. the Synopsys DesignWare 8250-alike port and it's busy detect
interrupt).  Add a .handle_irq callback that can be used for platforms
to override the interrupt behaviour in a similar fashion to the
.serial_out and .serial_in callbacks.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:52:58 -07:00
Jiri Slaby
906cbe1364 TTY: remove tty_locked
We used it really only serial and ami_serial. The rest of the
callsites were BUG/WARN_ONs to check if BTM is held. Now that we
pruned tty_locked from both of the real users, we can get rid of
tty_lock along with __big_tty_mutex_owner.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:34:07 -07:00
Jiri Slaby
6a3e492b6d TTY: serial, remove tasklet for tty_wakeup
tty_wakeup can be called from any context. So there is no need to have
an extra tasklet for calling that. Hence save some space and remove
the tasklet completely.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:34:06 -07:00
Jiri Slaby
24d406a6bf TTY: pty, fix pty counting
tty_operations->remove is normally called like:
queue_release_one_tty
 ->tty_shutdown
   ->tty_driver_remove_tty
     ->tty_operations->remove

However tty_shutdown() is called from queue_release_one_tty() only if
tty_operations->shutdown is NULL. But for pty, it is not.
pty_unix98_shutdown() is used there as ->shutdown.

So tty_operations->remove of pty (i.e. pty_unix98_remove()) is never
called. This results in invalid pty_count. I.e. what can be seen in
/proc/sys/kernel/pty/nr.

I see this was already reported at:
  https://lkml.org/lkml/2009/11/5/370
But it was not fixed since then.

This patch is kind of a hackish way. The problem lies in ->install. We
allocate there another tty (so-called tty->link). So ->install is
called once, but ->remove twice, for both tty and tty->link. The fix
here is to count both tty and tty->link and divide the count by 2 for
user.

And to have ->remove called, let's make tty_driver_remove_tty() global
and call that from pty_unix98_shutdown() (tty_operations->shutdown).

While at it, let's document that when ->shutdown is defined,
tty_shutdown() is not called.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:10:38 -07:00
Kuninori Morimoto
29cc88979a USB: use usb_endpoint_maxp() instead of le16_to_cpu()
Now ${LINUX}/drivers/usb/* can use usb_endpoint_maxp(desc) to get maximum packet size
instead of le16_to_cpu(desc->wMaxPacketSize).
This patch fix it up

Cc: Armin Fuerst <fuerst@in.tum.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Johannes Erdfelt <johannes@erdfelt.com>
Cc: Vojtech Pavlik <vojtech@suse.cz>
Cc: Oliver Neukum <oliver@neukum.name>
Cc: David Kubicek <dave@awk.cz>
Cc: Johan Hovold <jhovold@gmail.com>
Cc: Brad Hards <bhards@bigpond.net.au>
Acked-by: Felipe Balbi <balbi@ti.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Dahlmann <dahlmann.thomas@arcor.de>
Cc: David Brownell <david-b@pacbell.net>
Cc: David Lopo <dlopo@chipidea.mips.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Michal Nazarewicz <m.nazarewicz@samsung.com>
Cc: Xie Xiaobo <X.Xie@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Jiang Bo <tanya.jiang@freescale.com>
Cc: Yuan-hsin Chen <yhchen@faraday-tech.com>
Cc: Darius Augulis <augulis.darius@gmail.com>
Cc: Xiaochen Shen <xiaochen.shen@intel.com>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: OKI SEMICONDUCTOR, <toshiharu-linux@dsn.okisemi.com>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Ben Dooks <ben@simtec.co.uk>
Cc: Thomas Abraham <thomas.ab@samsung.com>
Cc: Herbert Pötzl <herbert@13thfloor.at>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Cc: Roman Weissgaerber <weissg@vienna.at>
Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Tony Olech <tony.olech@elandigitalsystems.com>
Cc: Florian Floe Echtler <echtler@fs.tum.de>
Cc: Christian Lucht <lucht@codemercs.com>
Cc: Juergen Stuber <starblue@sourceforge.net>
Cc: Georges Toth <g.toth@e-biz.lu>
Cc: Bill Ryder <bryder@sgi.com>
Cc: Kuba Ober <kuba@mareimbrium.org>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 09:47:40 -07:00
Christoph Hellwig
65299a3b78 block: separate priority boosting from REQ_META
Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule,
and lave REQ_META purely for marking requests as metadata in blktrace.

All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23 14:50:29 +02:00
Christoph Hellwig
5dc06c5a70 block: remove READ_META and WRITE_META
Replace all occurnanced of the undocumented READ_META with READ | REQ_META
and remove the unused WRITE_META define.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23 14:49:55 +02:00
Jason Baron
b5fb0a0321 dynamic_debug: make netif_dbg() call __netdev_printk()
Previously, netif_dbg() was using dynamic_dev_dbg() to perform
the underlying printk. Fix it to use __netdev_printk(), instead.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 18:23:06 -07:00
Jason Baron
ffa10cb47a dynamic_debug: make netdev_dbg() call __netdev_printk()
Previously, if dynamic debug was enabled netdev_dbg() was using
dynamic_dev_dbg() to print out the underlying msg. Fix this by making
sure netdev_dbg() uses __netdev_printk().

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 18:23:06 -07:00
Jason Baron
fae8d206a1 dynamic_debug: remove unused control variables
Remove no longer used dynamic debug control variables. The
definitions were removed a while ago, but we forgot to clean
up the extern references.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 18:23:05 -07:00
Joe Perches
cbc4663552 dynamic_debug: Add __dynamic_dev_dbg
Unlike dynamic_pr_debug, dynamic uses of dev_dbg can not
currently add task_pid/KBUILD_MODNAME/__func__/__LINE__
to selected debug output.

Add a new function similar to dynamic_pr_debug to
optionally emit these prefixes.

Cc: Aloisio Almeida <aloisio.almeida@openbossa.org>
Noticed-by: Aloisio Almeida <aloisio.almeida@openbossa.org>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 18:23:04 -07:00
Ian Campbell
131ea6675c net: add APIs for manipulating skb page fragments.
The primary aim is to add skb_frag_(ref|unref) in order to remove the use of
bare get/put_page on SKB pages fragments and to isolate users from subsequent
changes to the skb_frag_t data structure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-22 16:59:54 -07:00
Sebastian Andrzej Siewior
da6819dbff usb: ch9: add function defines from ch9, USB 3.0 spec
not to confuse with Table 9-7 in USB 2.0 spec

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 15:59:25 -07:00
kuninori.morimoto.gx@renesas.com
939f325f4a usb: add usb_endpoint_maxp() macro
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 15:54:38 -07:00
Pavan Savoy
0d7c5f2572 drivers:misc:ti-st: platform hooks for chip states
Certain platform specific or Host-WiLink Interface specific actions would be
required to be taken when the chip is being enabled and after the chip is
disabled such as configuration of the mux modes for the GPIO of host connected
to the nshutdown of the chip or relinquishing UART after the chip is disabled.

Similar actions can also be taken when the chip is in deep sleep or when the
chip is awake. Performance enhancements such as configuring the host to run
faster when chip is awake and slower when chip is asleep can also be made
here.

Signed-off-by: Pavan Savoy <pavan_savoy@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22 14:13:32 -07:00
Thomas Pedersen
25d49e4d63 mac80211: update mesh path selection frame format
Make mesh path selection frames Mesh Action category, remove outdated
Mesh Path Selection category and defines, use updated reason codes, add
mesh_action_is_path_sel for readability, and update/correct path
selection IEs.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:46:00 -04:00
Thomas Pedersen
36c704fded ieee80211: add mesh action codes
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:46:00 -04:00
Thomas Pedersen
8db098507c mac80211: update mesh peering frame format
This patch updates the mesh peering frames to the format specified in
the recently ratified 802.11s standard. Several changes took place to
make this happen:

	- Change RX path to handle new self-protected frames
	- Add new Peering management IE
	- Remove old Peer Link IE
	- Remove old plink_action field in ieee80211_mgmt header

These changes by themselves would either break peering, or work by
coincidence, so squash them all into this patch.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:46:00 -04:00
Thomas Pedersen
6709a6d96e ieee80211: introduce Self Protected Action codes
802.11s introduces a new action frame category, add action codes as well
as an entry in ieee80211_mgmt.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:45:59 -04:00
Rafał Miłecki
984e5befba bcma: implement BCM4331 workaround for external PA lines
We need to disable ext. PA lines for reading SPROM. It's disabled by
default, but this patch allows using bcma after loading wl, which leaves
workaround enabled.

Cc: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:45:59 -04:00
Helmut Schaa
c1407b6cb2 wireless: Introduce defines for BAR TID_INFO & MULTI_TID fields
While at it also fix the indention of the other IEEE80211_BAR_CTRL_ defines.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-22 14:45:58 -04:00
John W. Linville
b38d355eaa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/staging/ath6kl/miscdrv/ar3kps/ar3kpsparser.c
	drivers/staging/ath6kl/os/linux/ar6000_drv.c
2011-08-22 14:28:50 -04:00
Mark Brown
fec4fe26ec Merge branch 'regmap-mfd' into regmap-next 2011-08-22 12:33:11 +01:00
Mark Brown
50eeef5d3c mfd: Convert WM8400 to regmap API
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-22 12:32:22 +01:00
Mark Brown
d6c645fc00 mfd: Convert WM8994 to use new register map API
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-08-22 12:25:15 +01:00
Mark Brown
1df5981b82 mfd: Convert WM831x to use regmap API
Factor out the register read/write code to use the register map API.  We
still need some wm831x specific code and locking in place to check that
the user key is handled correctly but only on the write side, reads are
not affected by the key.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-08-22 12:23:22 +01:00
Mark Brown
ae130d22de Merge branch 'regmap-interface' into regmap-next 2011-08-21 12:55:20 +01:00
Mark Brown
bd20eb541e regmap: Allow drivers to specify register defaults
It is useful for the register cache code to be able to specify the
default values for the device registers. The major use is when restoring
the register cache after suspend, knowing the register defaults allows
us to skip registers that are at their default values when we resume which
can be a substantial win on larger modern devices. For some devices
(mostly older ones) the hardware does not support readback so the only way we
can know the values is from code and so initializing the cache with default
values makes it much easier for drivers work with read/modify/write
updates.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-21 12:54:54 +01:00
David S. Miller
823dcd2506 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net 2011-08-20 10:39:12 -07:00
Alan Stern
5b1b0b812a PM / Runtime: Add macro to test for runtime PM events
This patch (as1482) adds a macro for testing whether or not a
pm_message value represents an autosuspend or autoresume (i.e., a
runtime PM) event.  Encapsulating this notion seems preferable to
open-coding the test all over the place.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-19 23:49:48 +02:00
Linus Torvalds
5ccc38740a Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block: (23 commits)
  Revert "cfq: Remove special treatment for metadata rqs."
  block: fix flush machinery for stacking drivers with differring flush flags
  block: improve rq_affinity placement
  blktrace: add FLUSH/FUA support
  Move some REQ flags to the common bio/request area
  allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
  xen/blkback: Make description more obvious.
  cfq-iosched: Add documentation about idling
  block: Make rq_affinity = 1 work as expected
  block: swim3: fix unterminated of_device_id table
  block/genhd.c: remove useless cast in diskstats_show()
  drivers/cdrom/cdrom.c: relax check on dvd manufacturer value
  drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
  bsg-lib: add module.h include
  cfq-iosched: Reduce linked group count upon group destruction
  blk-throttle: correctly determine sync bio
  loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
  loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
  loop: add management interface for on-demand device allocation
  loop: replace linked list of allocated devices with an idr index
  ...
2011-08-19 10:47:07 -07:00
Eric Dumazet
11fd165c68 sunrpc: use better NUMA affinities
Use NUMA aware allocations to reduce latencies and increase throughput.

sunrpc kthreads can use kthread_create_on_node() if pool_mode is
"percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
also take into account NUMA node affinity for memory allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: Neil Brown <neilb@suse.de>
CC: David Miller <davem@davemloft.net>
Reviewed-by: Greg Banks <gnb@fastmail.fm>
[bfields@redhat.com: fix up caller nfs41_callback_up]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19 13:25:36 -04:00
J. Bruce Fields
778fc546f7 locks: fix tracking of inprogress lease breaks
We currently use a bit in fl_flags to record whether a lease is being
broken, and set fl_type to the type (RDLCK or UNLCK) that it will
eventually have.  This means that once the lease break starts, we forget
what the lease's type *used* to be.  Breaking a read lease will then
result in blocking read opens, even though there's no conflict--because
the lease type is now F_UNLCK and we can no longer tell whether it was
previously a read or write lease.

So, instead keep fl_type as the original type (the type which we
enforce), and keep track of whether we're unlocking or merely
downgrading by replacing the single FL_INPROGRESS flag by
FL_UNLOCK_PENDING and FL_DOWNGRADE_PENDING flags.

To get this right we also need to track separate downgrade and break
times, to handle the case where a write-leased file gets conflicting
opens first for read, then later for write.

(I first considered just eliminating the downgrade behavior
completely--nfsv4 doesn't need it, and nobody as far as I can tell
actually uses it currently--but Jeremy Allison tells me that Windows
oplocks do behave this way, so Samba will probably use this some day.)

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19 13:25:34 -04:00
J. Bruce Fields
710b721696 locks: move F_INPROGRESS from fl_type to fl_flags field
F_INPROGRESS isn't exposed to userspace.  To me it makes more sense in
fl_flags....

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19 13:25:34 -04:00
Linus Torvalds
0c3bef6128 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: OF: Don't crash when bridge parent is NULL.
  PCI: export pcie_bus_configure_settings symbol
  PCI: code and comments cleanup
  PCI: make cardbus-bridge resources optional
  PCI: make SRIOV resources optional
  PCI : ability to relocate assigned pci-resources
  PCI: honor child buses add_size in hot plug configuration
  PCI: Set PCI-E Max Payload Size on fabric
2011-08-19 10:02:37 -07:00
Christoph Lameter
49e2258586 slub: per cpu cache for partial pages
Allow filling out the rest of the kmem_cache_cpu cacheline with pointers to
partial pages. The partial page list is used in slab_free() to avoid
per node lock taking.

In __slab_alloc() we can then take multiple partial pages off the per
node partial list in one go reducing node lock pressure.

We can also use the per cpu partial list in slab_alloc() to avoid scanning
partial lists for pages with free objects.

The main effect of a per cpu partial list is that the per node list_lock
is taken for batches of partial pages instead of individual ones.

Potential future enhancements:

1. The pickup from the partial list could be perhaps be done without disabling
   interrupts with some work. The free path already puts the page into the
   per cpu partial list without disabling interrupts.

2. __slab_free() may have some code paths that could use optimization.

Performance:

				Before		After
./hackbench 100 process 200000
				Time: 1953.047	1564.614
./hackbench 100 process 20000
				Time: 207.176   156.940
./hackbench 100 process 20000
				Time: 204.468	156.940
./hackbench 100 process 20000
				Time: 204.879	158.772
./hackbench 10 process 20000
				Time: 20.153	15.853
./hackbench 10 process 20000
				Time: 20.153	15.986
./hackbench 10 process 20000
				Time: 19.363	16.111
./hackbench 1 process 20000
				Time: 2.518	2.307
./hackbench 1 process 20000
				Time: 2.258	2.339
./hackbench 1 process 20000
				Time: 2.864	2.163

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:27 +03:00
Wu Fengguang
bb0822954a squeeze max-pause area and drop pass-good area
Revert the pass-good area introduced in ffd1f609ab ("writeback:
introduce max-pause and pass-good dirty limits") and make the max-pause
area smaller and safe.

This fixes ~30% performance regression in the ext3 data=writeback
fio_mmap_randwrite_64k/fio_mmap_randrw_64k test cases, where there are
12 JBOD disks, on each disk runs 8 concurrent tasks doing reads+writes.

Using deadline scheduler also has a regression, but not that big as CFQ,
so this suggests we have some write starvation.

The test logs show that

- the disks are sometimes under utilized

- global dirty pages sometimes rush high to the pass-good area for
  several hundred seconds, while in the mean time some bdi dirty pages
  drop to very low value (bdi_dirty << bdi_thresh).  Then suddenly the
  global dirty pages dropped under global dirty threshold and bdi_dirty
  rush very high (for example, 2 times higher than bdi_thresh). During
  which time balance_dirty_pages() is not called at all.

So the problems are

1) The random writes progress so slow that they break the assumption of
   the max-pause logic that "8 pages per 200ms is typically more than
   enough to curb heavy dirtiers".

2) The max-pause logic ignored task_bdi_thresh and thus opens the possibility
   for some bdi's to over dirty pages, leading to (bdi_dirty >> bdi_thresh)
   and then (bdi_thresh >> bdi_dirty) for others.

3) The higher max-pause/pass-good thresholds somehow leads to the bad
   swing of dirty pages.

The fix is to allow the task to slightly dirty over task_bdi_thresh, but
no way to exceed bdi_dirty and/or global dirty_thresh.

Tests show that it fixed the JBOD regression completely (both behavior
and performance), while still being able to cut down large pause times
in balance_dirty_pages() for single-disk cases.

Reported-by: Li Shaohua <shaohua.li@intel.com>
Tested-by: Li Shaohua <shaohua.li@intel.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-08-19 22:42:07 +08:00
Changli Gao
4ca2462e94 net: add the comment for skb->l4_rxhash
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-19 07:26:44 -07:00
Jiri Pirko
b81693d914 net: remove ndo_set_multicast_list callback
Remove no longer used operation.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-17 20:22:03 -07:00
Jiri Pirko
01789349ee net: introduce IFF_UNICAST_FLT private flag
Use IFF_UNICAST_FTL to find out if driver handles unicast address
filtering. In case it does not, promisc mode is entered.

Patch also fixes following drivers:
stmmac, niu: support uc filtering and yet it propagated
	ndo_set_multicast_list
bna, benet, pxa168_eth, ks8851, ks8851_mll, ksz884x : has set
	ndo_set_rx_mode but do not support uc filtering

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-17 20:21:27 -07:00
Tom Herbert
bdeab99191 rps: Add flag to skb to indicate rxhash is based on L4 tuple
The l4_rxhash flag was added to the skb structure to indicate
that the rxhash value was computed over the 4 tuple for the
packet which includes the port information in the encapsulated
transport packet.  This is used by the stack to preserve the
rxhash value in __skb_rx_tunnel.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-17 20:06:03 -07:00
Linus Torvalds
b4fd4ae6c6 Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Domains: Fix build for CONFIG_PM_RUNTIME unset
2011-08-17 13:15:25 -07:00
Ian Campbell
aa462abe8a mm: fix __page_to_pfn for a const struct page argument
This allows the cast in lowmem_page_address (introduced as a warning
fixup to 33dd4e0ec9 "mm: make some struct page's const") to be
removed.

Propagate const'ness to page_to_section() as well since it is required
by __page_to_pfn.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-17 13:00:20 -07:00
Ian Campbell
f991879473 mm: make HASHED_PAGE_VIRTUAL page_address' struct page argument const.
Followup to 33dd4e0ec9 "mm: make some struct page's const" which missed the
HASHED_PAGE_VIRTUAL case.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-17 13:00:20 -07:00
Linus Torvalds
6cac952960 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rtc: Limit RTC PIE frequency
  rtc: Fix hrtimer deadlock
  rtc: Handle errors correctly in rtc_irq_set_state()

Fixup trivial conflicts in drivers/rtc/interface.c due to slightly
trivially versions of the same patch coming in two different ways.
2011-08-17 10:28:33 -07:00
Linus Torvalds
950d0a10d1 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: Track the owner of irq descriptor
  irq: Always set IRQF_ONESHOT if no primary handler is specified
  genirq: Fix wrong bit operation
2011-08-17 10:23:50 -07:00
Lukas Czerner
fbc854027c ext3: remove deprecated oldalloc
For a long time now orlov is the default block allocator in the ext3. It
performs better than the old one and no one seems to claim otherwise so
we can safely drop it and make oldalloc and orlov mount option
deprecated.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-08-17 11:42:19 +02:00
Ben Hutchings
a27fc96b03 ethtool: Note common alternate exit condition for interrupt coalescing
Many implementations ignore the value of max_frames and do not
treat usecs == 0 as special.  Document this as deprecated.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 16:36:15 -07:00
Ben Hutchings
acf42a2a57 ethtool: Explicitly state the exit condition for interrupt coalescing
Also explicitly state how to disable interrupt coalescing.

Remove the now-redundant text from field descriptions.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 16:36:15 -07:00
Ben Hutchings
2d7c79390e ethtool: Correct description of 'max_coalesced_frames' fields
The current descriptions state that these fields specify 'How many
packets to delay ... after a packet ...' which implies that the
hardware should wait for (max_coalesced_frames + 1) completions before
generating an interrupt.  It is also stated that setting both this
field and the corresponding 'coalesce_usecs' field to 0 is invalid.
Together, this implies that the hardware must always be configured
to delay a completion IRQ for at least 1 usec or 1 more completion.

I believe that the addition of 1 is not intended, and David Miller
confirms that the original implementation (in tg3) does not do this.
Clarify the descriptions of these fields to avoid this interpretation.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 16:36:14 -07:00
Ben Hutchings
725b361b29 ethtool: Specify what kind of coalescing struct ethtool_coalesce covers
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 16:36:14 -07:00
Ben Hutchings
5f308d195e ethtool: Reformat struct ethtool_coalesce comments into kernel-doc format
This reorders and duplicates some wording, but should make no
substantive changes.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 16:36:14 -07:00
Don Zickus
abd4d5587b pstore: change mutex locking to spin_locks
pstore was using mutex locking to protect read/write access to the
backend plug-ins.  This causes problems when pstore is executed in
an NMI context through panic() -> kmsg_dump().

This patch changes the mutex to a spin_lock_irqsave then also checks to
see if we are in an NMI context.  If we are in an NMI and can't get the
lock, just print a message stating that and blow by the locking.

All this is probably a hack around the bigger locking problem but it
solves my current situation of trying to sleep in an NMI context.

Tested by loading the lkdtm module and executing a HARDLOCKUP which
will cause the machine to panic inside the nmi handler.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-08-16 11:55:58 -07:00
Mark Brown
1ddc07d0f1 ASoC: Add WM8958 noise gate support
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-17 00:48:47 +09:00
Vinod Koul
a16e470caa dmaengine: remove struct scatterlist for header
Commit 90b44f8 introduces dmaengine_prep_slave_single API which adds
scatterlist.h in dmaengine.h, so defining struct scatterlist is not required

Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
2011-08-16 16:41:41 +05:30
David S. Miller
19e2f6fe96 net: Fix sungem_phy sharing.
Since sungem_phy is used by multiple, unrelated, drivers make it
build as a real module under drivers/net.

depmod will pick up the symbol dependency and make sure sungem_phy.ko
gets loaded any time sungem.ko or spider_net.ko is loaded.

Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-16 00:16:49 -07:00
Mimi Zohar
1e39f384bb evm: fix build problems
- Make the previously missing security_old_inode_init_security() stub
  function definition static inline.

- The stub security_inode_init_security() function previously returned
  -EOPNOTSUPP and relied on the callers to change it to 0.  The stub
  security/security_old_inode_init_security() functions now return 0.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-16 09:23:44 +10:00
Jeff Moyer
4853abaae7 block: fix flush machinery for stacking drivers with differring flush flags
Commit ae1b153962, block: reimplement
FLUSH/FUA to support merge, introduced a performance regression when
running any sort of fsyncing workload using dm-multipath and certain
storage (in our case, an HP EVA).  The test I ran was fs_mark, and it
dropped from ~800 files/sec on ext4 to ~100 files/sec.  It turns out
that dm-multipath always advertised flush+fua support, and passed
commands on down the stack, where those flags used to get stripped off.
The above commit changed that behavior:

static inline struct request *__elv_next_request(struct request_queue *q)
{
        struct request *rq;

        while (1) {
-               while (!list_empty(&q->queue_head)) {
+               if (!list_empty(&q->queue_head)) {
                        rq = list_entry_rq(q->queue_head.next);
-                       if (!(rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) ||
-                           (rq->cmd_flags & REQ_FLUSH_SEQ))
-                               return rq;
-                       rq = blk_do_flush(q, rq);
-                       if (rq)
-                               return rq;
+                       return rq;
                }

Note that previously, a command would come in here, have
REQ_FLUSH|REQ_FUA set, and then get handed off to blk_do_flush:

struct request *blk_do_flush(struct request_queue *q, struct request *rq)
{
        unsigned int fflags = q->flush_flags; /* may change, cache it */
        bool has_flush = fflags & REQ_FLUSH, has_fua = fflags & REQ_FUA;
        bool do_preflush = has_flush && (rq->cmd_flags & REQ_FLUSH);
        bool do_postflush = has_flush && !has_fua && (rq->cmd_flags &
        REQ_FUA);
        unsigned skip = 0;
...
        if (blk_rq_sectors(rq) && !do_preflush && !do_postflush) {
                rq->cmd_flags &= ~REQ_FLUSH;
		if (!has_fua)
			rq->cmd_flags &= ~REQ_FUA;
	        return rq;
	}

So, the flush machinery was bypassed in such cases (q->flush_flags == 0
&& rq->cmd_flags & (REQ_FLUSH|REQ_FUA)).

Now, however, we don't get into the flush machinery at all.  Instead,
__elv_next_request just hands a request with flush and fua bits set to
the scsi_request_fn, even if the underlying request_queue does not
support flush or fua.

The agreed upon approach is to fix the flush machinery to allow
stacking.  While this isn't used in practice (since there is only one
request-based dm target, and that target will now reflect the flush
flags of the underlying device), it does future-proof the solution, and
make it function as designed.

In order to make this work, I had to add a field to the struct request,
inside the flush structure (to store the original req->end_io).  Shaohua
had suggested overloading the union with rb_node and completion_data,
but the completion data is used by device mapper and can also be used by
other drivers.  So, I didn't see a way around the additional field.

I tested this patch on an HP EVA with both ext4 and xfs, and it recovers
the lost performance.  Comments and other testers, as always, are
appreciated.

Cheers,
Jeff

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-15 21:37:25 +02:00
David S. Miller
2bb698412d net: Move sungem_phy.h under include/linux
Fixes build failures of the spider_net driver because it tries
to use a convoluted path to include this header.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-14 22:52:04 -07:00
Linus Torvalds
97c24d1d45 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: remove unused "ddr" parameter in struct mmc_ios
  mmc: dw_mmc: Fix DDR mode support.
  mmc: core: use defined R1_STATE_PRG macro for card status
  mmc: sdhci: use f_max instead of host->clock for timeouts
  mmc: sdhci: move timeout_clk calculation farther down
  mmc: sdhci: check host->clock before using it as a denominator
  mmc: Revert "mmc: sdhci: Fix SDHCI_QUIRK_TIMEOUT_USES_SDCLK"
  mmc: tmio: eliminate unused variable 'mmc' warning
  mmc: esdhc-imx: fix card interrupt loss on freescale eSDHC
  mmc: sdhci-s3c: Fix build for header change
  mmc: dw_mmc: Fix mask in IDMAC_SET_BUFFER1_SIZE macro
  mmc: cb710: fix possible pci_dev leak in cb710_pci_configure()
  mmc: core: Detect eMMC v4.5 ext_csd entries
  mmc: mmc_test: avoid stalled file in debugfs
  mmc: sdhci-s3c: add BROKEN_ADMA_ZEROLEN_DESC quirk
  mmc: sdhci: pxav3: controller needs 32 bit ADMA addressing
  mmc: sdhci: fix retuning timer wrongly deleted in sdhci_tasklet_finish
2011-08-14 12:28:15 -07:00
Rafael J. Wysocki
17f2ae7f67 PM / Domains: Fix build for CONFIG_PM_RUNTIME unset
Function genpd_queue_power_off_work() is not defined for
CONFIG_PM_RUNTIME, so pm_genpd_poweroff_unused() causes a build
error to happen in that case.  Fix the problem by making
pm_genpd_poweroff_unused() depend on CONFIG_PM_RUNTIME too.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-14 13:34:31 +02:00
Paul Turner
ec12cb7f31 sched: Accumulate per-cfs_rq cpu usage and charge against bandwidth
Account bandwidth usage on the cfs_rq level versus the task_groups to which
they belong.  Whether we are tracking bandwidth on a given cfs_rq is maintained
under cfs_rq->runtime_enabled.

cfs_rq's which belong to a bandwidth constrained task_group have their runtime
accounted via the update_curr() path, which withdraws bandwidth from the global
pool as desired.  Updates involving the global pool are currently protected
under cfs_bandwidth->lock, local runtime is protected by rq->lock.

This patch only assigns and tracks quota, no action is taken in the case that
cfs_rq->runtime_used exceeds cfs_rq->runtime_assigned.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Nikhil Rao <ncrao@google.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110721184757.179386821@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-08-14 12:03:26 +02:00
Jaehoon Chung
7fd781e8f9 mmc: remove unused "ddr" parameter in struct mmc_ios
"mmc: dw_mmc: Fix DDR mode support" removed the last user.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-13 14:50:32 -04:00
Frank Blaschka
4dc83dfd3e if_ether: add new Ethernet Protocol ID for af_iucv
Add a new ethertype for af_iucv over s/390 HiperSockets transport. Since
HiperSockets is not a real ethernet hw this is not an officially registered ID.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:16 -07:00
Linus Torvalds
73e0881d31 Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  dt: add empty of_get_property for non-dt
2011-08-12 21:59:09 -07:00
Jouni Malinen
9946ecfb51 nl80211/cfg80211: Add extra IE configuration to AP mode setup
The NL80211_CMD_NEW_BEACON command is, in practice, requesting AP mode
operations to be started. Add new attributes to provide extra IEs
(e.g., WPS IE, P2P IE) for drivers that build Beacon, Probe Response,
and (Re)Association Response frames internally (likely in firmware).

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-12 13:45:04 -04:00
Jouni Malinen
5fb628e910 nl80211/cfg80211: Add crypto settings into NEW_BEACON
This removes need from drivers to parse the beacon tail/head data
to figure out what crypto settings are to be used in AP mode in case
the Beacon and Probe Response frames are fully constructed in the
driver/firmware.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-12 13:45:04 -04:00
Jouni Malinen
32e9de846b nl80211/cfg80211: Allow SSID to be specified in new beacon command
This makes it easier for drivers that generate Beacon and Probe Response
frames internally (in firmware most likely) in AP mode.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-12 13:45:03 -04:00
Linus Torvalds
ce8a84ef1e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  e1000e: increase driver version number
  e1000e: alternate MAC address update
  e1000e: do not disable receiver on 82574/82583
  e1000e: alternate MAC address does not work on device id 0x1060
  PCnet: Fix section mismatch
  bnx2x: disable dcb on 578xx since not supported yet
  bnx2x: properly clean indirect addresses
  bnx2x: prevent race between undi_unload and load flows
  bnx2x: fix select_queue when FCoE is disabled
  bnx2x: init FCOE FP only once
  ipv4: some rt_iif -> rt_route_iif conversions
  net/bridge/netfilter/ebtables.c: use available error handling code
  net/netlabel/netlabel_kapi.c: add missing cleanup code
  net/irda: sh_sir: tidyup compile warning
  net/irda: sh_sir: add missing header
  net/irda: sh_irda: add missing header
  slcan: ldisc generated skbs are received in softirq context
  scm: Capture the full credentials of the scm sender
  tcp: initialize variable ecn_ok in syncookies path
  drivers/net/wireless/wl1251: add missing kfree
  ...
2011-08-12 06:43:53 -07:00
Mark Brown
c18eee3181 ASoC: Add bitfield definitions for WM8958 MICBIAS registers
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
2011-08-12 14:23:04 +09:00
Vasiliy Kulikov
72fa59970f move RLIMIT_NPROC check from set_user() to do_execve_common()
The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
check in set_user() to check for NPROC exceeding via setuid() and
similar functions.

Before the check there was a possibility to greatly exceed the allowed
number of processes by an unprivileged user if the program relied on
rlimit only.  But the check created new security threat: many poorly
written programs simply don't check setuid() return code and believe it
cannot fail if executed with root privileges.  So, the check is removed
in this patch because of too often privilege escalations related to
buggy programs.

The NPROC can still be enforced in the common code flow of daemons
spawning user processes.  Most of daemons do fork()+setuid()+execve().
The check introduced in execve() (1) enforces the same limit as in
setuid() and (2) doesn't create similar security issues.

Neil Brown suggested to track what specific process has exceeded the
limit by setting PF_NPROC_EXCEEDED process flag.  With the change only
this process would fail on execve(), and other processes' execve()
behaviour is not changed.

Solar Designer suggested to re-check whether NPROC limit is still
exceeded at the moment of execve().  If the process was sleeping for
days between set*uid() and execve(), and the NPROC counter step down
under the limit, the defered execve() failure because NPROC limit was
exceeded days ago would be unexpected.  If the limit is not exceeded
anymore, we clear the flag on successful calls to execve() and fork().

The flag is also cleared on successful calls to set_user() as the limit
was exceeded for the previous user, not the current one.

Similar check was introduced in -ow patches (without the process flag).

v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().

Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-11 11:24:42 -07:00
Namhyung Kim
c09c47caed blktrace: add FLUSH/FUA support
Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
FUA follows WRITE, use the same 'F' flag for both cases and
distinguish them by their (relative) position. The end results
look like (other flags might be shown also):

 - WRITE:            W
 - WRITE_FLUSH:      FW
 - WRITE_FUA:        WF
 - WRITE_FLUSH_FUA:  FWF

Note that we reuse TC_BARRIER due to lack of bit space of act_mask
so that the older versions of blktrace tools will report flush
requests as barriers from now on.

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-11 10:36:05 +02:00
Matthew Wilcox
8e4bf84474 Move some REQ flags to the common bio/request area
REQ_SECURE, REQ_FLUSH and REQ_FUA may all be set on a bio as well as
on a request, so relocate them to the shared part of the enum.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-11 10:36:03 +02:00
Mimi Zohar
5a4730ba95 evm: fix evm_inode_init_security return code
evm_inode_init_security() should return 0, when EVM is not enabled.
(Returning an error is a remnant of evm_inode_post_init_security.)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-11 17:42:41 +10:00
Mimi Zohar
e1c9b23adb evm: building without EVM enabled fixes
- Missing 'inline' on evm_inode_setattr() definition.
Introduced by commit 817b54aa45 ("evm: add evm_inode_setattr to prevent
updating an invalid security.evm").

- Missing security_old_inode_init_security() stub function definition.
Caused by commit 9d8f13ba3f ("security: new security_inode_init_security
API adds function callback").

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-11 17:42:07 +10:00
Mathieu Desnoyers
b75ef8b44b Tracepoint: Dissociate from module mutex
Copy the information needed from struct module into a local module list
held within tracepoint.c from within the module coming/going notifier.

This vastly simplifies locking of tracepoint registration /
unregistration, because we don't have to take the module mutex to
register and unregister tracepoints anymore. Steven Rostedt ran into
dependency problems related to modules mutex vs kprobes mutex vs ftrace
mutex vs tracepoint mutex that seems to be hard to fix without removing
this dependency between tracepoint and module mutex. (note: it should be
investigated whether kprobes could benefit of being dissociated from the
modules mutex too.)

This also fixes module handling of tracepoint list iterators, because it
was expecting the list to be sorted by pointer address. Given we have
control on our own list now, it's OK to sort this list which has
tracepoints as its only purpose. The reason why this sorting is required
is to handle the fact that seq files (and any read() operation from
user-space) cannot hold the tracepoint mutex across multiple calls, so
list entries may vanish between calls. With sorting, the tracepoint
iterator becomes usable even if the list don't contain the exact item
pointed to by the iterator anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Jason Baron <jbaron@redhat.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-08-10 20:38:14 -04:00
John Stultz
9082c465a5 alarmtimers: Add try_to_cancel functionality
There's a number of edge cases when cancelling a alarm, so
to be sure we accurately do so, introduce try_to_cancel, which
returns proper failure errors if it cannot. Also modify cancel
to spin until the alarm is properly disabled.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:29 -07:00
John Stultz
a28cde81ab alarmtimers: Add more refined alarm state tracking
In order to allow for functionality like try_to_cancel, add
more refined  state tracking (similar to hrtimers).

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:27 -07:00
John Stultz
9e26476243 alarmtimers: Remove period from alarm structure
Now that periodic alarmtimers are managed by the handler function,
remove the period value from the alarm structure and let the handlers
manage the interval on their own.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:26 -07:00
John Stultz
dce75a8c71 alarmtimers: Add alarm_forward functionality
In order to avoid wasting time expiring and re-adding very high freq
periodic alarmtimers, introduce alarm_forward() which is similar to
hrtimer_forward and moves the timer to the next future expiration time
and returns the number of overruns.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:23 -07:00
John Stultz
4b41308d2d alarmtimers: Change alarmtimer functions to return alarmtimer_restart values
In order to properly fix the denial of service issue with high freq
periodic alarm timers, we need to push the re-arming logic into the
alarm timer handler, much as the hrtimer code does.

This patch introduces alarmtimer_restart enum and changes the
alarmtimer handler declarations to use it as a return value. Further,
to ease following changes, it extends the alarmtimer handler functions
to also take the time at expiration. No logic is yet modified.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-08-10 14:55:20 -07:00
David Herrmann
4ea5454203 HID: Fix race condition between driver core and ll-driver
HID low level drivers register new devices with the HID core which then
adds the devices to the HID bus. The HID bus normally immediately probes
an appropriate driver which then handles HID input for this device.
The ll driver now uses the hid_input_report() function to report input
events for a specific device. However, if the HID bus unloads the driver
at the same time (for instance via a call to
 /sys/bus/hid/devices/<dev>/unbind) then the hdev->driver pointer may be
used by hid_input_report() and hid_device_remove() at the same time
which may cause hdev->driver to point to invalid memory.

This fix adds a semaphore to every hid device which protects
hdev->driver from asynchronous access. This semaphore is locked during
driver *_probe and *_remove and also inside hid_input_report(). The
*_probe and *_remove functions may sleep so the semaphore is good here,
however, hid_input_report() is in atomic context and hence only uses
down_trylock(). If it cannot acquire the lock it simply drops the input
package.

The low-level drivers report input events synchronously so
hid_input_report() should never be entered twice at the same time on the
same device. Hence, the lock should always be available. But if the
driver is currently probed/removed then the lock is not available and
dropping the package should be safe because this is what would have
happened if the package arrived some milliseconds earlier/later.

This also fixes another race condition while probing drivers:
First the *_probe function of the driver is called and only if that
succeeds, the related input device of hidinput is registered. If the low
level driver reports input events after the *_probe function returned
but before the input device is registered, then a NULL pointer
dereference will occur. (Equivalently on driver remove function).
This is not possible anymore, since the semaphore lock drops all
incoming packages until the driver/device is fully initialized.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-08-10 14:02:07 +02:00
Rafał Miłecki
454496fd05 ssb: define boardflags
They are SPROM specific, so all should be defined in ssb code.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-09 15:38:57 -04:00
Stephen Warren
89272b8c0d dt: add empty of_get_property for non-dt
The patch adds empty function of_get_property for non-dt build, so that
drivers migrating to dt can save some '#ifdef CONFIG_OF'.

This also fixes the current Tegra compile problem in linux-next.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-08-09 11:27:16 -06:00
Mark Brown
790923e56b regmap: Remove unused type and list fields from bus interface
We no longer enumerate the bus types, we rely on the driver telling us
this on init.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-10 00:26:38 +09:00
Eric Andersson
c17ca3f5a2 Input: add driver for Bosch Sensortec's BMA150 accelerometer
Signed-off-by: Albert Zhang <xu.zhang@bosch-sensortec.com>
Signed-off-by: Eric Andersson <eric.andersson@unixphere.com>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Reviewed-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-08-09 01:33:04 -07:00
Mark Brown
3566cc9d90 regmap: Fix kerneldoc errors for regmap
Field names didn't match between the documentation and the code.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-09 10:25:06 +09:00
James Morris
5a2f3a02ae Merge branch 'next-evm' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/ima-2.6 into next
Conflicts:
	fs/attr.c

Resolve conflict manually.

Signed-off-by: James Morris <jmorris@namei.org>
2011-08-09 10:31:03 +10:00
Peter Zijlstra
5c723ba5b7 mm: Fix fixup_user_fault() for MMU=n
In commit 2efaca927f ("mm/futex: fix futex writes on archs with SW
tracking of dirty & young") we forgot about MMU=n.  This patch fixes
that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/1311761831.24752.413.camel@twins
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-08 12:11:02 -07:00
Linus Torvalds
638a843909 cred: use 'const' in get_current_{user,groups}
Avoid annoying warnings from these functions ("discards qualifiers")
because they assign 'current_cred()' to a non-const pointer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-08 11:33:23 -07:00
Hauke Mehrtens
908debc8da bcma: get CPU clock
Add method to return the clock of the CPU. This is needed by the arch
code to calculate the mips_hpt_frequency.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:29:29 -04:00
Hauke Mehrtens
e3afe0e5be bcma: add serial console support
This adds support for serial console to bcma, when operating on an SoC.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:29:28 -04:00
Hauke Mehrtens
21e0534ad7 bcma: add mips driver
This adds a mips driver to bcma. This is only found on embedded
devices. For now the driver just initializes the irqs used on this
system.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:29:26 -04:00
Hauke Mehrtens
ecd177c216 bcma: add SOC bus
This patch adds support for using bcma on a Broadcom SoC as the system
bus. An SoC like the bcm4716 could register this bus and use it to
searches for the bcma cores and register the devices on this bus.

BCMA_HOSTTYPE_NONE was intended for SoCs at first but BCMA_HOSTTYPE_SOC
is a better name.

Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:29:25 -04:00
Hauke Mehrtens
517f43e5a9 bcma: add functions to scan cores needed on SoCs
The chip common and mips core have to be setup early in the boot
process to get the cpu clock.
bcma_bus_early_register() gets pointers to some space to store the core
data and searches for the chip common and mips core and initializes
chip common. After that was done and the kernel is out of early boot we
just have to run bcma_bus_register() and it will search for the other
cores, initialize and register them.
The cores are getting the same numbers as before.

Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:29:24 -04:00
Randy Dunlap
84f8508a7d regulator: fix regulator/consumer.h kernel-doc warning
Fix kernel-doc warning about internal/private data by marking it
as "private:" so that kernel-doc will ignore it.

Warning(include/linux/regulator/consumer.h:128): No description found for parameter 'ret'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-08-08 17:15:08 +01:00
David Howells
27e4e43627 CRED: Restore const to current_cred()
Commit 3295514841 ("fix rcu annotations noise in cred.h") accidentally
dropped the const of current->cred inside current_cred() by the
insertion of a cast to deal with an RCU annotation loss warning from
sparce.

Use an appropriate RCU wrapper instead so as not to lose the const.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-08 09:03:16 -07:00
Miklos Szeredi
37fb3a30b4 fuse: fix flock
Commit a9ff4f87 "fuse: support BSD locking semantics" overlooked a
number of issues with supporing flock locks over existing POSIX
locking infrastructure:

  - it's not backward compatible, passing flock(2) calls to userspace
    unconditionally (if userspace sets FUSE_POSIX_LOCKS)

  - it doesn't cater for the fact that flock locks are automatically
    unlocked on file release

  - it doesn't take into account the fact that flock exclusive locks
    (write locks) don't need an fd opened for write.

The last one invalidates the original premise of the patch that flock
locks can be emulated with POSIX locks.

This patch fixes the first two issues.  The last one needs to be fixed
in userspace if the filesystem assumed that a write lock will happen
only on a file operned for write (as in the case of the current fuse
library).

Reported-by: Sebastian Pipping <webmaster@hartwork.org>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2011-08-08 16:08:08 +02:00
Vinod Koul
90b44f8ffd dmaengine: add helper function for slave_single
For clients which require a single slave transfer and dont want to be bothered
about the scatterlist api, this helper gives simple API for this transfer and
creates single scatterlist for DMA API

Idea from Russell King

Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-08-08 16:48:14 +05:30
Mark Brown
18694886bd regmap: Add precious registers to the driver interface
Some devices are sensitive to reads on their registers, especially for
things like clear on read interrupt status registers. Avoid creating
problems with these with things like debugfs by allowing drivers to tell
the core about them. If a register is marked as precious then the core
will not internally generate any reads of it.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-08 15:47:05 +09:00
Mark Brown
2e2ae66df3 regmap: Allow devices to specify which registers are accessible
This is currently unused but we need to know which registers exist and
their properties in order to implement diagnostics like register map
dumps and the cache features.

We use callbacks partly because properties can vary at runtime (eg, through
access locks on registers) and partly because big switch statements are a
good compromise between readable code and small data size for providing
information on big register maps.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-08 15:47:00 +09:00
Mark Brown
dd898b2095 regmap: Add kerneldoc for struct regmap_config
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-08-08 15:47:00 +09:00
Mark Brown
18d4ed4342 Merge branch 'for-3.1' into for-3.2
Conflict due to the fix for the register map failure - taken the for-3.1
version.

Conflicts:
	sound/soc/codecs/sgtl5000.c
2011-08-08 14:56:19 +09:00
David S. Miller
6602a4baf4 net: Make userland include of netlink.h more sane.
Currently userland will barf when including linux/netlink.h unless it
precisely includes sys/socket.h first.  The issue is where the
definition of "sa_family_t" comes from.

We've been back and forth on how to fix this issue in the past, see:

http://thread.gmane.org/gmane.linux.debian.devel.bugs.general/622621
http://thread.gmane.org/gmane.linux.network/143380

Ben Hutchings suggested we take a hint from how we handle the
sockaddr_storage type.  First we define a "__kernel_sa_family_t"
to linux/socket.h that is always defined.

Then if __KERNEL__ is defined, we also define "sa_family_t" as
equal to "__kernel_sa_family_t".

Then in places like linux/netlink.h we use __kernel_sa_family_t
in user visible datastructures.

Reported-by: Michel Machado <michel@digirati.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-07 22:48:07 -07:00
Al Viro
3295514841 fix rcu annotations noise in cred.h
task->cred is declared as __rcu, and access to other tasks' ->cred is,
indeed, protected.  Access to current->cred does not need rcu_dereference()
at all, since only the task itself can change its ->cred.  sparse, of
course, has no way of knowing that...

Add force-cast in current_cred(), make current_fsuid() et.al. use it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-07 13:42:35 -07:00
Linus Torvalds
c2f340a69c Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  ore: Make ore its own module
  exofs: Rename raid engine from exofs/ios.c => ore
  exofs: ios: Move to a per inode components & device-table
  exofs: Move exofs specific osd operations out of ios.c
  exofs: Add offset/length to exofs_get_io_state
  exofs: Fix truncate for the raid-groups case
  exofs: Small cleanup of exofs_fill_super
  exofs: BUG: Avoid sbi realloc
  exofs: Remove pnfs-osd private definitions
  nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
2011-08-06 22:56:03 -07:00
Linus Torvalds
3ddcd0569c vfs: optimize inode cache access patterns
The inode structure layout is largely random, and some of the vfs paths
really do care.  The path lookup in particular is already quite D$
intensive, and profiles show that accessing the 'inode->i_op->xyz'
fields is quite costly.

We already optimized the dcache to not unnecessarily load the d_op
structure for members that are often NULL using the DCACHE_OP_xyz bits
in dentry->d_flags, and this does something very similar for the inode
ops that are used during pathname lookup.

It also re-orders the fields so that the fields accessed by 'stat' are
together at the beginning of the inode structure, and roughly in the
order accessed.

The effect of this seems to be in the 1-2% range for an empty kernel
"make -j" run (which is fairly kernel-intensive, mostly in filename
lookup), so it's visible.  The numbers are fairly noisy, though, and
likely depend a lot on exact microarchitecture.  So there's more tuning
to be done.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-06 22:53:23 -07:00
Linus Torvalds
830c0f0edc vfs: renumber DCACHE_xyz flags, remove some stale ones
Gcc tends to generate better code with small integers, including the
DCACHE_xyz flag tests - so move the common ones to be first in the list.
Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
DCACHE_AUTOFS_PENDING values, their users no longer exists in the source
tree.

And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
common case to be a nice straight-line fall-through.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-06 22:52:40 -07:00
Linus Torvalds
7cd4767e69 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Compute protocol sequence numbers and fragment IDs using MD5.
  crypto: Move md5_transform to lib/md5.c
2011-08-06 22:12:37 -07:00
David S. Miller
6e5714eaf7 net: Compute protocol sequence numbers and fragment IDs using MD5.
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky <dan@doxpara.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-06 18:33:19 -07:00
David S. Miller
bc0b96b54a crypto: Move md5_transform to lib/md5.c
We are going to use this for TCP/IP sequence number and fragment ID
generation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-06 18:32:45 -07:00
Linus Torvalds
2560540b78 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (38 commits)
  acer-wmi: support Lenovo ideapad S205 wifi switch
  acerhdf.c: spaces in aliased changed to *
  platform-drivers-x86: ideapad-laptop: add missing ideapad_input_exit in ideapad_acpi_add error path
  x86 driver: fix typo in TDP override enabling
  Platform: fix samsung-laptop DMI identification for N150/N210/220/N230
  dell-wmi: Add keys for Dell XPS L502X
  platform-drivers-x86: samsung-q10: make dmi_check_callback return 1
  Platform: Samsung Q10 backlight driver
  platform-drivers-x86: intel_scu_ipc: convert to DEFINE_PCI_DEVICE_TABLE
  platform-drivers-x86: intel_rar_register: convert to DEFINE_PCI_DEVICE_TABLE
  platform-drivers-x86: intel_menlow: add missing return AE_OK for intel_menlow_register_sensor()
  platform-drivers-x86: intel_mid_thermal: fix memory leak
  platform-drivers-x86: msi-wmi: add missing sparse_keymap_free in msi_wmi_init error path
  Samsung Laptop platform driver: support N510
  asus-wmi: add uwb rfkill support
  asus-wmi: add gps rfkill support
  asus-wmi: add CWAP support and clarify the meaning of WAPF bits
  asus-wmi: return proper value in store_cpufv()
  asus-wmi: check for temp1 presence
  asus-wmi: add thermal sensor
  ...
2011-08-06 13:26:15 -07:00
Mandeep Singh Baines
1eb19a12bd lib/sha1: use the git implementation of SHA-1
For ChromiumOS, we use SHA-1 to verify the integrity of the root
filesystem.  The speed of the kernel sha-1 implementation has a major
impact on our boot performance.

To improve boot performance, we investigated using the heavily optimized
sha-1 implementation used in git.  With the git sha-1 implementation, we
see a 11.7% improvement in boot time.

10 reboots, remove slowest/fastest.

Before:

  Mean: 6.58 seconds Stdev: 0.14

After (with git sha-1, this patch):

  Mean: 5.89 seconds Stdev: 0.07

The other cool thing about the git SHA-1 implementation is that it only
needs 64 bytes of stack for the workspace while the original kernel
implementation needed 320 bytes.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-06 11:26:52 -07:00
Andy Lutomirski
33009557bd Add KEY_MICMUTE and enable it on Lenovo X220
I suspect that this works on T410.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-08-05 14:45:41 -04:00
Linus Torvalds
24f0eed266 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  RCUify freeing acls, let check_acl() go ahead in RCU mode if acl is cached
  get rid of boilerplate switches in posix_acl.h
  fix block device fallout from ->fsync() changes
2011-08-04 16:44:40 -10:00
Linus Torvalds
7f3bf7cd34 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmaengine: use DEFINE_IDR for static initialization
  ioat: fix xor_idx_to_desc
  Avoid section type conflict in dma/ioat/dma_v3.c
  ioat: Adding PCI IDs for IOAT devices on SandyBridge platforms
2011-08-04 16:43:43 -10:00
Boaz Harrosh
655b161284 nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
exofs file system wants to use pnfs_osd_xdr.h file instead of
redefining pnfs-objects types in it's private "pnfs.h" headr.

Before we do the switch we must make sure pnfs_osd_xdr.h is
compilable also under NFS versions smaller than 4.1. Since now
it is needed regardless of version, by the exofs code.

nfs4_string is not the only nfs4 type out in the global scope.

Ack-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-08-04 12:35:08 -07:00
Linus Torvalds
53d1e658df Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  Revert "dt: add of_alias_scan and of_alias_get_id"
  dt: remove of_alias_get_id() reference
2011-08-04 06:37:07 -10:00
Grant Likely
fe55c1844a Revert "dt: add of_alias_scan and of_alias_get_id"
This reverts commit 750f463a74.

of_alias_* still needs work to be generalized for 'promtree' dt
platforms, and to no implicitly create entries for available ids.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-08-04 11:26:24 +01:00
Linus Torvalds
35e51fe82d Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  cpuidle: stop depending on pm_idle
  x86 idle: move mwait_idle_with_hints() to where it is used
  cpuidle: replace xen access to x86 pm_idle and default_idle
  cpuidle: create bootparam "cpuidle.off=1"
  mrst_pmu: driver for Intel Moorestown Power Management Unit
2011-08-03 21:54:15 -10:00
Linus Torvalds
c0c770e610 Merge branch 'apei-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'apei-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI, APEI, EINJ Param support is disabled by default
  APEI GHES: 32-bit buildfix
  ACPI: APEI build fix
  ACPI, APEI, GHES: Add hardware memory error recovery support
  HWPoison: add memory_failure_queue()
  ACPI, APEI, GHES, Error records content based throttle
  ACPI, APEI, GHES, printk support for recoverable error via NMI
  lib, Make gen_pool memory allocator lockless
  lib, Add lock-less NULL terminated single list
  Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG
  ACPI, APEI, Add WHEA _OSC support
  ACPI, APEI, Add APEI bit support in generic _OSC call
  ACPI, APEI, GHES, Support disable GHES at boot time
  ACPI, APEI, GHES, Prevent GHES to be built as module
  ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST
  ACPI, APEI, Add apei_exec_run_optional
  ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic
  ACPI, APEI, ERST, Fix erst-dbg long record reading issue
  ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled
2011-08-03 21:53:27 -10:00
Linus Torvalds
27665ffa22 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6:
  dt: add of_alias_scan and of_alias_get_id
2011-08-03 15:10:30 -10:00
Hugh Dickins
e504f3fdd6 tmpfs radix_tree: locate_item to speed up swapoff
We have already acknowledged that swapoff of a tmpfs file is slower than
it was before conversion to the generic radix_tree: a little slower
there will be acceptable, if the hotter paths are faster.

But it was a shock to find swapoff of a 500MB file 20 times slower on my
laptop, taking 10 minutes; and at that rate it significantly slows down
my testing.

Now, most of that turned out to be overhead from PROVE_LOCKING and
PROVE_RCU: without those it was only 4 times slower than before; and
more realistic tests on other machines don't fare as badly.

I've tried a number of things to improve it, including tagging the swap
entries, then doing lookup by tag: I'd expected that to halve the time,
but in practice it's erratic, and often counter-productive.

The only change I've so far found to make a consistent improvement, is
to short-circuit the way we go back and forth, gang lookup packing
entries into the array supplied, then shmem scanning that array for the
target entry.  Scanning in place doubles the speed, so it's now only
twice as slow as before (or three times slower when the PROVEs are on).

So, add radix_tree_locate_item() as an expedient, once-off,
single-caller hack to do the lookup directly in place.  #ifdef it on
CONFIG_SHMEM and CONFIG_SWAP, as much to document its limited
applicability as save space in other configurations.  And, sadly,
#include sched.h for cond_resched().

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:24 -10:00
Hugh Dickins
69f07ec938 tmpfs: use kmemdup for short symlinks
But we've not yet removed the old swp_entry_t i_direct[16] from
shmem_inode_info.  That's because it was still being shared with the
inline symlink.  Remove it now (saving 64 or 128 bytes from shmem inode
size), and use kmemdup() for short symlinks, say, those up to 128 bytes.

I wonder why mpol_free_shared_policy() is done in shmem_destroy_inode()
rather than shmem_evict_inode(), where we usually do such freeing? I
guess it doesn't matter, and I'm not into NUMA mpol testing right now.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:24 -10:00
Hugh Dickins
aa3b189551 tmpfs: convert mem_cgroup shmem to radix-swap
Remove mem_cgroup_shmem_charge_fallback(): it was only required when we
had to move swappage to filecache with GFP_NOWAIT.

Remove the GFP_NOWAIT special case from mem_cgroup_cache_charge(), by
moving its call out from shmem_add_to_page_cache() to two of thats three
callers.  But leave it doing mem_cgroup_uncharge_cache_page() on error:
although asymmetrical, it's easier for all 3 callers to handle.

These two changes would also be appropriate if anyone were to start
using shmem_read_mapping_page_gfp() with GFP_NOWAIT.

Remove mem_cgroup_get_shmem_target(): mc_handle_file_pte() can test
radix_tree_exceptional_entry() to get what it needs for itself.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:24 -10:00
Hugh Dickins
41ffe5d5ce tmpfs: miscellaneous trivial cleanups
While it's at its least, make a number of boring nitpicky cleanups to
shmem.c, mostly for consistency of variable naming.  Things like "swap"
instead of "entry", "pgoff_t index" instead of "unsigned long idx".

And since everything else here is prefixed "shmem_", better change
init_tmpfs() to shmem_init().

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:23 -10:00
Hugh Dickins
285b2c4fdd tmpfs: demolish old swap vector support
The maximum size of a shmem/tmpfs file has been limited by the maximum
size of its triple-indirect swap vector.  With 4kB page size, maximum
filesize was just over 2TB on a 32-bit kernel, but sadly one eighth of
that on a 64-bit kernel.  (With 8kB page size, maximum filesize was just
over 4TB on a 64-bit kernel, but 16TB on a 32-bit kernel,
MAX_LFS_FILESIZE being then more restrictive than swap vector layout.)

It's a shame that tmpfs should be more restrictive than ramfs, and this
limitation has now been noticed.  Add another level to the swap vector?
No, it became obscure and hard to maintain, once I complicated it to
make use of highmem pages nine years ago: better choose another way.

Surely, if 2.4 had had the radix tree pagecache introduced in 2.5, then
tmpfs would never have invented its own peculiar radix tree: we would
have fitted swap entries into the common radix tree instead, in much the
same way as we fit swap entries into page tables.

And why should each file have a separate radix tree for its pages and
for its swap entries? The swap entries are required precisely where and
when the pages are not.  We want to put them together in a single radix
tree: which can then avoid much of the locking which was needed to
prevent them from being exchanged underneath us.

This also avoids the waste of memory devoted to swap vectors, first in
the shmem_inode itself, then at least two more pages once a file grew
beyond 16 data pages (pages accounted by df and du, but not by memcg).
Allocated upfront, to avoid allocation when under swapping pressure, but
pure waste when CONFIG_SWAP is not set - I have never spattered around
the ifdefs to prevent that, preferring this move to sharing the common
radix tree instead.

There are three downsides to sharing the radix tree.  One, that it binds
tmpfs more tightly to the rest of mm, either requiring knowledge of swap
entries in radix tree there, or duplication of its code here in shmem.c.
I believe that the simplications and memory savings (and probable higher
performance, not yet measured) justify that.

Two, that on HIGHMEM systems with SWAP enabled, it's the lowmem radix
nodes that cannot be freed under memory pressure - whereas before it was
the less precious highmem swap vector pages that could not be freed.
I'm hoping that 64-bit has now been accessible for long enough, that the
highmem argument has grown much less persuasive.

Three, that swapoff is slower than it used to be on tmpfs files, since
it's using a simple generic mechanism not tailored to it: I find this
noticeable, and shall want to improve, but maybe nobody else will
notice.

So...  now remove most of the old swap vector code from shmem.c.  But,
for the moment, keep the simple i_direct vector of 16 pages, with simple
accessors shmem_put_swap() and shmem_get_swap(), as a toy implementation
to help mark where swap needs to be handled in subsequent patches.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:23 -10:00
Hugh Dickins
a2c16d6cb0 mm: let swap use exceptional entries
If swap entries are to be stored along with struct page pointers in a
radix tree, they need to be distinguished as exceptional entries.

Most of the handling of swap entries in radix tree will be contained in
shmem.c, but a few functions in filemap.c's common code need to check
for their appearance: find_get_page(), find_lock_page(),
find_get_pages() and find_get_pages_contig().

So as not to slow their fast paths, tuck those checks inside the
existing checks for unlikely radix_tree_deref_slot(); except for
find_lock_page(), where it is an added test.  And make it a BUG in
find_get_pages_tag(), which is not applied to tmpfs files.

A part of the reason for eliminating shmem_readpage() earlier, was to
minimize the places where common code would need to allow for swap
entries.

The swp_entry_t known to swapfile.c must be massaged into a slightly
different form when stored in the radix tree, just as it gets massaged
into a pte_t when stored in page tables.

In an i386 kernel this limits its information (type and page offset) to
30 bits: given 32 "types" of swapfile and 4kB pagesize, that's a maximum
swapfile size of 128GB.  Which is less than the 512GB we previously
allowed with X86_PAE (where the swap entry can occupy the entire upper
32 bits of a pte_t), but not a new limitation on 32-bit without PAE; and
there's not a new limitation on 64-bit (where swap filesize is already
limited to 16TB by a 32-bit page offset).  Thirty areas of 128GB is
probably still enough swap for a 64GB 32-bit machine.

Provide swp_to_radix_entry() and radix_to_swp_entry() conversions, and
enforce filesize limit in read_swap_header(), just as for ptes.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:22 -10:00
Hugh Dickins
6328650bb4 radix_tree: exceptional entries and indices
A patchset to extend tmpfs to MAX_LFS_FILESIZE by abandoning its
peculiar swap vector, instead keeping a file's swap entries in the same
radix tree as its struct page pointers: thus saving memory, and
simplifying its code and locking.

This patch:

The radix_tree is used by several subsystems for different purposes.  A
major use is to store the struct page pointers of a file's pagecache for
memory management.  But what if mm wanted to store something other than
page pointers there too?

The low bit of a radix_tree entry is already used to denote an indirect
pointer, for internal use, and the unlikely radix_tree_deref_retry()
case.

Define the next bit as denoting an exceptional entry, and supply inline
functions radix_tree_exception() to return non-0 in either unlikely
case, and radix_tree_exceptional_entry() to return non-0 in the second
case.

If a subsystem already uses radix_tree with that bit set, no problem: it
does not affect internal workings at all, but is defined for the
convenience of those storing well-aligned pointers in the radix_tree.

The radix_tree_gang_lookups have an implicit assumption that the caller
can deduce the offset of each entry returned e.g.  by the page->index of
a struct page.  But that may not be feasible for some kinds of item to
be stored there.

radix_tree_gang_lookup_slot() allow for an optional indices argument,
output array in which to return those offsets.  The same could be added
to other radix_tree_gang_lookups, but for now keep it to the only one
for which we need it.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:22 -10:00
Axel Lin
5d6f921b42 drivers/video/backlight/aat2870_bl.c: fix setting max_current
- Current implementation tests wrong value for setting
   aat2870_bl->max_current.

 - In the current implementation, we cannot differentiate between 2 cases:

   a) if pdata->max_current is not set , or

   b) pdata->max_current is set to AAT2870_CURRENT_0_45 (which is also 0).

   Fix it by setting AAT2870_CURRENT_0_45 to be 1 and adjust the equation in
   aat2870_brightness() accordingly.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jin Park <jinyoungp@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:22 -10:00
Johannes Weiner
3dab1bce8e mm: page_alloc: increase __GFP_BITS_SHIFT to include __GFP_OTHER_NODE
__GFP_OTHER_NODE is used for NUMA allocations on behalf of other nodes.
It's supposed to be passed through from the page allocator to
zone_statistics(), but it never gets there as gfp_allowed_mask is not
wide enough and masks out the flag early in the allocation path.

The result is an accounting glitch where successful NUMA allocations
by-agent are not properly attributed as local.

Increase __GFP_BITS_SHIFT so that it includes __GFP_OTHER_NODE.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:21 -10:00
Rusty Russell
88eca0207c ida: simplified functions for id allocation
The current hyper-optimized functions are overkill if you simply want to
allocate an id for a device.  Create versions which use an internal
lock.

In followup patches, numerous drivers are converted to use this
interface.

Thanks to Tejun for feedback.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:20 -10:00
Akinobu Mita
dd48c085c1 fault-injection: add ability to export fault_attr in arbitrary directory
init_fault_attr_dentries() is used to export fault_attr via debugfs.
But it can only export it in debugfs root directory.

Per Forlin is working on mmc_fail_request which adds support to inject
data errors after a completed host transfer in MMC subsystem.

The fault_attr for mmc_fail_request should be defined per mmc host and
export it in debugfs directory per mmc host like
/sys/kernel/debug/mmc0/mmc_fail_request.

init_fault_attr_dentries() doesn't help for mmc_fail_request.  So this
introduces fault_create_debugfs_attr() which is able to create a
directory in the arbitrary directory and replace
init_fault_attr_dentries().

[akpm@linux-foundation.org: extraneous semicolon, per Randy]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Tested-by: Per Forlin <per.forlin@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 14:25:20 -10:00
Len Brown
a0bfa13738 cpuidle: stop depending on pm_idle
cpuidle users should call cpuidle_call_idle() directly
rather than via (pm_idle)() function pointer.

Architecture may choose to continue using (pm_idle)(),
but cpuidle need not depend on it:

  my_arch_cpu_idle()
	...
	if(cpuidle_call_idle())
		pm_idle();

cc: Kevin Hilman <khilman@deeprootsystems.com>
cc: Paul Mundt <lethal@linux-sh.org>
cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 19:06:37 -04:00
Len Brown
d91ee5863b cpuidle: replace xen access to x86 pm_idle and default_idle
When a Xen Dom0 kernel boots on a hypervisor, it gets access
to the raw-hardware ACPI tables.  While it parses the idle tables
for the hypervisor's beneift, it uses HLT for its own idle.

Rather than have xen scribble on pm_idle and access default_idle,
have it simply disable_cpuidle() so acpi_idle will not load and
architecture default HLT will be used.

cc: xen-devel@lists.xensource.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 19:06:36 -04:00
Len Brown
d0e323b470 Merge branch 'apei' into apei-release
Some trivial conflicts due to other various merges
adding to the end of common lists sooner than this one.

	arch/ia64/Kconfig
	arch/powerpc/Kconfig
	arch/x86/Kconfig
	lib/Kconfig
	lib/Makefile

Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:30:42 -04:00
Huang Ying
ea8f5fb8a7 HWPoison: add memory_failure_queue()
memory_failure() is the entry point for HWPoison memory error
recovery.  It must be called in process context.  But commonly
hardware memory errors are notified via MCE or NMI, so some delayed
execution mechanism must be used.  In MCE handler, a work queue + ring
buffer mechanism is used.

In addition to MCE, now APEI (ACPI Platform Error Interface) GHES
(Generic Hardware Error Source) can be used to report memory errors
too.  To add support to APEI GHES memory recovery, a mechanism similar
to that of MCE is implemented.  memory_failure_queue() is the new
entry point that can be called in IRQ context.  The next step is to
make MCE handler uses this interface too.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:58 -04:00
Huang Ying
7f184275aa lib, Make gen_pool memory allocator lockless
This version of the gen_pool memory allocator supports lockless
operation.

This makes it safe to use in NMI handlers and other special
unblockable contexts that could otherwise deadlock on locks.  This is
implemented by using atomic operations and retries on any conflicts.
The disadvantage is that there may be livelocks in extreme cases.  For
better scalability, one gen_pool allocator can be used for each CPU.

The lockless operation only works if there is enough memory available.
If new memory is added to the pool a lock has to be still taken.  So
any user relying on locklessness has to ensure that sufficient memory
is preallocated.

The basic atomic operation of this allocator is cmpxchg on long.  On
architectures that don't have NMI-safe cmpxchg implementation, the
allocator can NOT be used in NMI handler.  So code uses the allocator
in NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:57 -04:00
Huang Ying
f49f23abf3 lib, Add lock-less NULL terminated single list
Cmpxchg is used to implement adding new entry to the list, deleting
all entries from the list, deleting first entry of the list and some
other operations.

Because this is a single list, so the tail can not be accessed in O(1).

If there are multiple producers and multiple consumers, llist_add can
be used in producers and llist_del_all can be used in consumers.  They
can work simultaneously without lock.  But llist_del_first can not be
used here.  Because llist_del_first depends on list->first->next does
not changed if list->first is not changed during its operation, but
llist_del_first, llist_add, llist_add (or llist_del_all, llist_add,
llist_add) sequence in another consumer may violate that.

If there are multiple producers and one consumer, llist_add can be
used in producers and llist_del_all or llist_del_first can be used in
the consumer.

This can be summarized as follow:

           |   add    | del_first |  del_all
 add       |    -     |     -     |     -
 del_first |          |     L     |     L
 del_all   |          |           |     -

Where "-" stands for no lock is needed, while "L" stands for lock is
needed.

The list entries deleted via llist_del_all can be traversed with
traversing function such as llist_for_each etc.  But the list entries
can not be traversed safely before deleted from the list.  The order
of deleted entries is from the newest to the oldest added one.  If you
want to traverse from the oldest to the newest, you must reverse the
order by yourself before traversing.

The basic atomic operation of this list is cmpxchg on long.  On
architectures that don't have NMI-safe cmpxchg implementation, the
list can NOT be used in NMI handler.  So code uses the list in NMI
handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:56 -04:00
Shawn Guo
750f463a74 dt: add of_alias_scan and of_alias_get_id
The patch adds function of_alias_scan to populate a global lookup
table with the properties of 'aliases' node and function
of_alias_get_id for drivers to find alias id from the lookup table.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[grant.likely: add locking and rework parse loop]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-08-03 13:02:31 +01:00
Linus Torvalds
c299eba3c5 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
  ACPI:  delete stale reference in kernel-parameters.txt
  ACPI: add missing _OSI strings
  ACPI: remove NID_INVAL
  thermal: make THERMAL_HWMON implementation fully internal
  thermal: split hwmon lookup to a separate function
  thermal: hide CONFIG_THERMAL_HWMON
  ACPI print OSI(Linux) warning only once
  ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
  ACPI / Battery: propagate sysfs error in acpi_battery_add()
  ACPI / Battery: avoid acpi_battery_add() use-after-free
  ACPI: introduce "acpi_rsdp=" parameter for kdump
  ACPI: constify ops structs
  ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
  ACPI: fix 80 char overflow
  ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
  ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
  ACPI / Battery: Add the hibernation process in the battery_notify()
  ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
  ACPI / Battery: Change 16-bit signed negative battery current into correct value
  ACPI / Battery: Add the power unit macro
  ...
2011-08-02 21:17:02 -10:00
Linus Torvalds
a6b11f5338 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6:
  MAINTAINERS: Add keyword match for of_match_table to device tree section
  of: constify property name parameters for helper functions
  input: xilinx_ps2: Add missing of_address.h header
  of: address: use resource_size helper
2011-08-02 20:49:57 -10:00
Linus Torvalds
f3406816bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (34 commits)
  dm table: set flush capability based on underlying devices
  dm crypt: optionally support discard requests
  dm raid: add md raid1 support
  dm raid: support metadata devices
  dm raid: add write_mostly parameter
  dm raid: add region_size parameter
  dm raid: improve table parameters documentation
  dm ioctl: forbid multiple device specifiers
  dm ioctl: introduce __get_dev_cell
  dm ioctl: fill in device parameters in more ioctls
  dm flakey: add corrupt_bio_byte feature
  dm flakey: add drop_writes
  dm flakey: support feature args
  dm flakey: use dm_target_offset and support discards
  dm table: share target argument parsing functions
  dm snapshot: skip reading origin when overwriting complete chunk
  dm: ignore merge_bvec for snapshots when safe
  dm table: clean dm_get_device and move exports
  dm raid: tidy includes
  dm ioctl: prevent empty message
  ...
2011-08-02 20:49:21 -10:00
Al Viro
3567866bf2 RCUify freeing acls, let check_acl() go ahead in RCU mode if acl is cached
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-03 00:58:42 -04:00
Al Viro
951c0d660a get rid of boilerplate switches in posix_acl.h
the only potentially subtle thing here: get_cached_acl()
is never called with the second argument other than
ACL_TYPE_{ACCESS,DEFAULT}.  IOW, that return ERR_PTR(-EINVAL)
in there might as well be BUG().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-03 00:47:21 -04:00