Commit Graph

15731 Commits

Author SHA1 Message Date
Steven Whitehouse
af5df56688 vfs: Further changes from macro to inline function in fs.h
There is a second set of macros for when CONFIG_FILE_LOCKING is
not set. This patch updates those to become inline functions
as well.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27 14:43:59 -04:00
Steven Whitehouse
c2aca5e529 vfs: Update fs.h to use inline functions when no file locking set
This avoids various issues which might give rise to compiler warnings
about missing functions and/or unused variable with the previous
macros. This also fixes a bug where one of the macros was returning
0, but it should have been void.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27 14:43:58 -04:00
Cheng Renquan
10f303ae1e do_pipe cleanup: drop its last user in arch/alpha/
The last user of do_pipe is in arch/alpha/, after replacing it with
do_pipe_flags, the do_pipe can be totally dropped.

Signed-off-by: Cheng Renquan <crquan@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27 14:43:58 -04:00
Christoph Hellwig
2b1c6bd77d generic compat_sys_ustat
Due to a different size of ino_t ustat needs a compat handler, but
currently only x86 and mips provide one.  Add a generic compat_sys_ustat
and switch all architectures over to it.  Instead of doing various
user copy hacks compat_sys_ustat just reimplements sys_ustat as
it's trivial.  This was suggested by Arnd Bergmann.

Found by Eric Sandeen when running xfstests/017 on ppc64, which causes
stack smashing warnings on RHEL/Fedora due to the too large amount of
data writen by the syscall.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27 14:43:57 -04:00
Ingo Molnar
6e15cf0486 Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2
Conflicts:
	arch/parisc/kernel/irq.c
	arch/x86/include/asm/fixmap_64.h
	arch/x86/include/asm/setup.h
	kernel/irq/handle.c

Semantic merge:
        arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-27 17:28:43 +01:00
Harald Jenny
1cae710321 sony-laptop: VGN-A317M hotkey support
This laptop has 5 SPIC managed buttons above the keyboard:
sound + and - as well as brightness, zoom and S1.
Possibly the entire VGN-A serie behaves the same.

Signed-off-by: Harald Jenny <harald@a-little-linux-box.at>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-27 12:18:56 -04:00
Matthew Garrett
45c7942ba8 sony-laptop: Add support for extended hotkeys
Recent Sony SR-series machines have an additional set of buttons accessed
via the 0x127 method rather than the 0x100 method. Add support for these.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-27 12:18:30 -04:00
Matthew Garrett
9b57896e62 sony-laptop: Add support for extra keyboard events
The current sony-laptop code assumes that the keyboard event method is
always located at slot 2 in the platform code. Remove this assumption and
add support for some additional hotkeys.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-27 12:18:07 -04:00
Sascha Hauer
05fd8e73e1 clkdev: add possibility to get a clock based on the device name
This adds clk_get_sys to get a clock without the associated struct
device.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-03-27 14:51:13 +01:00
Bartlomiej Zolnierkiewicz
bf717c0a2e ide: keep track of number of bytes instead of sectors in struct ide_cmd
* Pass number of bytes instead of sectors to ide_init_sg_cmd().

* Pass number of bytes to process to ide_pio_sector() and rename
  it to ide_pio_bytes().

* Rename ->nsect field to ->nbytes in struct ide_cmd and use
  ->nbytes, ->nleft and ->cursg_ofs to keep track of number of
  bytes instead of sectors.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:47 +01:00
Bartlomiej Zolnierkiewicz
35b5d0be3d ide: remove ide_execute_pkt_cmd() (v2)
* Pass command structure to ide_execute_command() and skip
  __ide_set_handler() for ATAPI protocols on non-DRQ devices.

* Convert ide_issue_pc() to always use ide_execute_command()
  and remove no longer needed ide_execute_pkt_cmd().

v2:
* Fix for non-DRQ devices (based on report from Borislav).

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:47 +01:00
Bartlomiej Zolnierkiewicz
22117d6eaa ide: add ->dma_timer_expiry method and remove ->dma_exec_cmd one (v2)
* Rename dma_timer_expiry() to ide_dma_sff_timer_expiry() and export it.

* Add ->dma_timer_expiry method and use it to set hwif->expiry for
  ATA_PROT_DMA protocol in do_rw_taskfile().

* Initialize ->dma_timer_expiry to ide_dma_sff_timer_expiry() for SFF hosts.

* Move setting hwif->expiry from ide_execute_command() to its users and drop
  'expiry' argument.

* Use ide_execute_command() instead of ->dma_exec_cmd in do_rw_taskfile().

* Remove ->dma_exec_cmd method and its implementations.

* Unexport ide_execute_command() and ide_dma_intr().

v2:
* Fix CONFIG_BLK_DEV_IDEDMA=n build (noticed by Randy Dunlap).

* Fix *dma_expiry naming (suggested by Sergei Shtylyov).

There should be no functional changes caused by this patch.

Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:47 +01:00
Bartlomiej Zolnierkiewicz
60c0cd02b2 ide: set hwif->expiry prior to calling [__]ide_set_handler()
* Set hwif->expiry prior to calling [__]ide_set_handler()
  and drop 'expiry' argument.

* Set hwif->expiry to NULL in ide_{timer_expiry,intr}()
  and remove 'hwif->expiry = NULL' assignments.

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:46 +01:00
Bartlomiej Zolnierkiewicz
b788ee9c65 ide: use do_rw_taskfile() for ATA_CMD_PACKET commands
* Pass command to ide_issue_pc() and update ->do_request methods
  in ide-{cd,floppy,tape}.c accordingly.

* Convert ide_pktcmd_tf_load() to ide_init_packet_cmd() which
  just initializes command structure and use do_rw_taskfile()
  to load ATA_CMD_PACKET commands.

While at it:

* Rename ide{floppy,tape}_issue_pc() to ide_{floppy,tape}_issue_pc().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:46 +01:00
Bartlomiej Zolnierkiewicz
2298169418 ide: pass command to ide_map_sg()
* Set IDE_TFLAG_WRITE flag and ->rq also for ATA_CMD_PACKET
  commands.

* Pass command to ->dma_setup method and update all its
  implementations accordingly.

* Pass command instead of request to ide_build_sglist(),
  *_build_dmatable() and ide_map_sg().

While at it:

* Fix scc_dma_setup() documentation + use ATA_DMA_WR define.

* Rename sgiioc4_build_dma_table() to sgiioc4_build_dmatable(),
  change return value type to 'int' and drop unused 'ddir'
  argument.

* Do some minor cleanups in [tx4939]ide_dma_setup().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:46 +01:00
Bartlomiej Zolnierkiewicz
130e886708 ide: remove ide_end_request()
* Add ide_rq_bytes() helper.

* Add blk_noretry_request() quirk to ide_complete_rq() (currently only fs
  requests can be marked as "noretry" so there is no change in behavior).

* Switch current ide_end_request() users to use ide_complete_rq().

  [ No need to check for rq->nr_sectors == 0 in {ide_dma,task_pio}_intr(),
    nsectors == 0 in cdrom_end_request() and err == 0 in ide_do_devset(). ]

* Remove no longer needed ide_end_request().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:45 +01:00
Bartlomiej Zolnierkiewicz
f974b196f5 ide: pass number of bytes to complete to ide_complete_rq()
There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:44 +01:00
Bartlomiej Zolnierkiewicz
a9587fd8c4 ide: remove BUG() from ide_complete_rq()
It is no longer needed so remove it, also while at it dequeue the request
only on blk_end_request() success and make ide_complete_rq() return an error
value.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:44 +01:00
Bartlomiej Zolnierkiewicz
6902a53312 ide: pass error value to ide_complete_rq()
Set rq->errors at ide_complete_rq() call sites and then pass
error value to ide_complete_rq().

[ Some rq->errors assignments look really wrong but this patch
  leaves them alone to not introduce too many changes at once. ]

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:43 +01:00
Bartlomiej Zolnierkiewicz
1caf236daf ide: add ide_end_rq() (v2)
* Move request dequeuing from __ide_end_request() to ide_end_request().

* Rename __ide_end_request() to ide_end_rq() and export it.

* Fix ide_end_rq() to pass original blk_end_request() return value.

* ide_end_dequeued_request() is used only in cdrom_end_request()
  so inline it there and then remove the function.

v2:
* Remove needless BUG_ON() while at it (start_request()'s one is enough).

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:42 +01:00
Bartlomiej Zolnierkiewicz
0dfb991c69 ide: use ata_tf_protocols enums
* Add IDE_TFLAG_MULTI_PIO taskfile flag and set it for commands
  using multi-PIO protocol.

* Use ata_tf_protocols enums instead of TASKFILE_* defines to
  denote command's protocol and then rename ->data_phase field
  to ->protocol.

* Remove no longer needed <linux/hdreg.h> includes.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:39 +01:00
Bartlomiej Zolnierkiewicz
b6308ee0c5 ide: move command related fields from ide_hwif_t to struct ide_cmd
* Move command related fields from ide_hwif_t to struct ide_cmd.

* Make ide_init_sg_cmd() take command and sectors number as arguments.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:38 +01:00
Bartlomiej Zolnierkiewicz
adb1af9803 ide: pass command instead of request to ide_pio_datablock()
* Add IDE_TFLAG_FS taskfile flag and set it for REQ_TYPE_FS requests.

* Convert ->{in,out}put_data methods to take command instead of request
  as an argument.  Then convert pre_task_out_intr(), task_end_request(),
  task_error(), task_in_unexpected(), ide_pio_sector(), ide_pio_multi()
  and ide_pio_datablock() in similar way.

* Rename task_end_request() to ide_finish_cmd().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:38 +01:00
Bartlomiej Zolnierkiewicz
22aa4b32a1 ide: remove ide_task_t typedef
While at it:
- rename struct ide_task_s to struct ide_cmd
- remove stale comments from idedisk_{read_native,set}_max_address()
- drop unused 'cmd' argument from ide_{cmd,task}_ioctl()
- drop unused 'task' argument from tx4939ide_tf_load_fixup()
- rename ide_complete_task() to ide_complete_cmd()
- use consistent naming for struct ide_cmd variables

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:37 +01:00
Bartlomiej Zolnierkiewicz
e6830a86c2 ide: call ide_build_sglist() prior to ->dma_setup (v2)
* Re-map sg table if needed in ide_build_sglist().

* Move ide_build_sglist() call from ->dma_setup to its users.

* Un-export ide_build_sglist().

v2:
* Build fix for CONFIG_BLK_DEV_IDEDMA=n (noticed by Randy Dunlap).

There should be no functional changes caused by this patch.

Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:37 +01:00
Bartlomiej Zolnierkiewicz
03a2faaea8 ide: return request status from ->pc_callback method
Make ->pc_callback method return request status and then move
the request completion from ->pc_callback to ide_pc_intr().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:36 +01:00
Bartlomiej Zolnierkiewicz
3ee38302ff ide: remove ->end_request method
* Handle completion of private driver requests explicitly
  for ide_floppy and ide_tape media in ide_kill_rq().

* Remove no longer needed ->end_request method.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:36 +01:00
Bartlomiej Zolnierkiewicz
c152cc1a90 ide: use ->end_request only for private device driver requests
* Move IDE{FLOPPY,TAPE}_ERROR_* defines to <linux/ide.h> and rename them
  to IDE_DRV_ERROR_*.

* Handle ->end_request special cases for floppy/tape media in ide_kill_rq().

* Call ->end_request only for private device driver requests.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:34 +01:00
Bartlomiej Zolnierkiewicz
5e2040fd0a ide: move ->failed_pc to ide_drive_t
Move ->failed_pc from struct ide_{disk,tape}_obj to ide_drive_t.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:34 +01:00
Bartlomiej Zolnierkiewicz
c7016e95a5 ide: remove no longer needed PC_FLAG_TIMEDOUT packet command flag
There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:33 +01:00
Bartlomiej Zolnierkiewicz
e3d9a73a83 ide: remove ->data_phase field from ide_hwif_t
* Always use hwif->task->data_phase and remove ->data_phase
  field from ide_hwif_t.

* Remove superfluous REQ_TYPE_ATA_TASKFILE check from
  ide_pio_datablock() while at it.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:32 +01:00
Bartlomiej Zolnierkiewicz
a09485df9c ide: move request type specific code from ide_end_drive_cmd() to callers (v3)
* Move request type specific code from ide_end_drive_cmd() to callers.

* Remove stale ide_end_drive_cmd() documentation and drop no longer
  used 'stat' argument.  Then rename the function to ide_complete_rq().

v2:
* Fix handling of blk_pm_request() requests in task_no_data_intr().

v3:
* Some ide_no_data_taskfile() users (HPA code and HDIO_DRIVE_* ioctls
  handlers) access original command later so we need to update it in
  ide_complete_task().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:31 +01:00
Bartlomiej Zolnierkiewicz
3616b6536a ide: complete power step in ide_complete_pm_request()
* Complete power step in ide_complete_pm_request().

* Rename ide_complete_pm_request() to ide_complete_pm_rq().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:29 +01:00
Bartlomiej Zolnierkiewicz
19710d25d5 ide: add "flagged" taskfile flags to struct ide_taskfile (v2)
* Add ->ftf_flags field to struct ide_taskfile
  and convert flags for TASKFILE ioctl to use it.

* Rename "flagged" taskfile flags:
  - IDE_TFLAG_FLAGGED -> IDE_FTFLAG_FLAGGED
  - IDE_TFLAG_FLAGGED_SET_IN_FLAGS -> IDE_FTFLAG_SET_IN_FLAGS
  - IDE_TFLAG_{OUT,IN}_DATA -> IDE_FTFLAG_{OUT,IN}_DATA

v2:
* Remember to fully update ide-h8300.c, scc_pata.c and tx493{8,9}ide.c
  (thanks to Stephen Rothwell for noticing).

There should be no functional changes caused by this patch.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:28 +01:00
Bartlomiej Zolnierkiewicz
c094ea0774 ide: add IDE_HFLAG_4DRIVES host flag
Add IDE_HFLAG_4DRIVES host flag and use it instead of ide_4drives
chipset type in ide_init_port().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:28 +01:00
Bartlomiej Zolnierkiewicz
2787cb8ae5 ide: add IDE_HFLAG_DTC2278 host flag
Add IDE_HFLAG_DTC2278 host flag and use it instead of ide_dtc2278
chipset type in ide_init_port().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:28 +01:00
Bartlomiej Zolnierkiewicz
255115fb35 ide: allow host drivers to specify IRQ flags
* Add ->irq_flags field to struct ide_port_info and struct ide_host.

* Update host drivers and IDE PCI code to use ->irq_flags field.

* Convert init_irq() and ide_intr() to use host->irq_flags.

This fixes handling of shared IRQs for non-PCI hosts
and removes ugly ifdeffery from core IDE code.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:27 +01:00
Bartlomiej Zolnierkiewicz
69197ad70e ide: fix memleak on failure in probe_for_drive()
Always free drive->id in probe_for_drive() if device is not present.

While at it:
- remove dead IDE_DFLAG_DEAD flag
- remove superfluous IDE_DFLAG_PRESENT check

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:26 +01:00
Bartlomiej Zolnierkiewicz
15a453a955 ide: include <asm/ide.h> only when needed
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:26 +01:00
Bartlomiej Zolnierkiewicz
e354c1d803 ide: remove IDE_ARCH_LOCK (v2)
* Add ->{get,release}_lock methods to struct ide_port_info
  and struct ide_host.

* Convert core IDE code, m68k IDE code and falconide support to use
  ->{get,release}_lock methods instead of ide_{get,release}_lock().

* Remove IDE_ARCH_LOCK.

v2:
* Build fix from Geert updating ide_{get,release}_lock() callers in
  falconide.c.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:22 +01:00
Bartlomiej Zolnierkiewicz
d15a613ba0 ide: remove IDE_ARCH_INTR (v2)
This micro-optimization is not worth it.  Just always check for
existence of ->ack_intr method in ide_intr() and ide_timer_expiry().

v2:
Fix brown-paper-bag bug spotted by David D. Kilzer.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michael Schmitz <schmitz@debian.org>
Cc: "David D. Kilzer" <ddkilzer@kilzer.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-27 12:46:21 +01:00
Borislav Petkov
088b1b8860 ide: improve debugging scheme
and more specifically, push __func__ into debug
macro thus making ide_debug_log() calls shorter and more readable.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
2009-03-27 12:46:19 +01:00
Stephen Hemminger
ac99533fb7 wan: convert sdla driver to net_device_ops
Also use internal net_device_stats

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-27 00:46:44 -07:00
David S. Miller
01e6de64d9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-03-26 22:45:23 -07:00
Alexander Clouter
3341323bb4 hwrng: timeriomem - Use phys address rather than virt
There is no ioremap'ing or anything in timeriomem-rng.c as I foolishly
used already remapped virtual addresses instead of passing the physical
address to be polled.

This patch fixes this flaw and lets developers do the Right Thing(tm).

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-03-27 12:59:54 +08:00
Linus Torvalds
be0ea69674 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slob: fix lockup in slob_free()
  slub: use get_track()
  slub: rename calculate_min_partial() to set_min_partial()
  slub: add min_partial sysfs tunable
  slub: move min_partial to struct kmem_cache
  SLUB: Fix default slab order for big object sizes
  SLUB: Do not pass 8k objects through to the page allocator
  SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants
  slob: clean up the code
  SLUB: Use ->objsize from struct kmem_cache_cpu in slab_free()
2009-03-26 16:18:17 -07:00
Linus Torvalds
8e9d208972 Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6
* 'bkl-removal' of git://git.lwn.net/linux-2.6:
  Rationalize fasync return values
  Move FASYNC bit handling to f_op->fasync()
  Use f_lock to protect f_flags
  Rename struct file->f_ep_lock
2009-03-26 16:14:02 -07:00
Linus Torvalds
ba1eb95cf3 Merge branch 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
  x86: headers cleanup - setup.h
  emu101k1.h: fix duplicate include of <linux/types.h>
  compiler-gcc4: conditionalize #error on __KERNEL__
  remove __KERNEL_STRICT_NAMES
  make netfilter use strict integer types
  make drm headers use strict integer types
  make MTD headers use strict integer types
  make most exported headers use strict integer types
  make exported headers use strict posix types
  unconditionally include asm/types.h from linux/types.h
  make linux/types.h as assembly safe
  Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h
  headers_check fix cleanup: linux/reiserfs_fs.h
  headers_check fix cleanup: linux/nubus.h
  headers_check fix cleanup: linux/coda_psdev.h
  headers_check fix: x86, setup.h
  headers_check fix: x86, prctl.h
  headers_check fix: linux/reinserfs_fs.h
  headers_check fix: linux/socket.h
  headers_check fix: linux/nubus.h
  ...

Manually fix trivial conflicts in:
	include/linux/netfilter/xt_limit.h
	include/linux/netfilter/xt_statistic.h
2009-03-26 16:11:41 -07:00
Linus Torvalds
a8416961d3 Merge branch 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: disable __do_IRQ support
  sparseirq, powerpc/cell: fix unused variable warning in interrupt.c
  genirq: deprecate obsolete typedefs and defines
  genirq: deprecate __do_IRQ
  genirq: add doc to struct irqaction
  genirq: use kzalloc instead of explicit zero initialization
  genirq: make irqreturn_t an enum
  genirq: remove redundant if condition
  genirq: remove unused hw_irq_controller typedef
  irq: export remove_irq() and setup_irq() symbols
  irq: match remove_irq() args with setup_irq()
  irq: add remove_irq() for freeing of setup_irq() irqs
  genirq: assert that irq handlers are indeed running in hardirq context
  irq: name 'p' variables a bit better
  irq: further clean up the free_irq() code flow
  irq: refactor and clean up the free_irq() code flow
  irq: clean up manage.c
  irq: use GFP_KERNEL for action allocation in request_irq()
  kernel/irq: fix sparse warning: make symbol static
  irq: optimize init_kstat_irqs/init_copy_kstat_irqs
  ...
2009-03-26 16:06:50 -07:00
Linus Torvalds
6671de344c Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  posix timers: fix RLIMIT_CPU && fork()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fix
  time: ntp: clean up second_overflow()
  time: ntp: simplify ntp_tick_adj calculations
  time: ntp: make 64-bit constants more robust
  time: ntp: refactor do_adjtimex() some more
  time: ntp: refactor do_adjtimex()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex()
  time: ntp: micro-optimize ntp_update_offset()
  time: ntp: simplify ntp_update_offset_fll()
  time: ntp: refactor and clean up ntp_update_offset()
  time: ntp: refactor up ntp_update_frequency()
  time: ntp: clean up ntp_update_frequency()
  time: ntp: simplify the MAX_TICKADJ_SCALED definition
  time: ntp: simplify the second_overflow() code flow
  time: ntp: clean up kernel/time/ntp.c
  x86: hpet: stop HPET_COUNTER when programming periodic mode
  x86: hpet: provide separate functions to stop and start the counter
  x86: hpet: print HPET registers during setup (if hpet=verbose is used)
  time: apply NTP frequency/tick changes immediately
  ...
2009-03-26 16:05:42 -07:00
Linus Torvalds
831576fe40 Merge branch 'sched-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (46 commits)
  sched: Add comments to find_busiest_group() function
  sched: Refactor the power savings balance code
  sched: Optimize the !power_savings_balance during fbg()
  sched: Create a helper function to calculate imbalance
  sched: Create helper to calculate small_imbalance in fbg()
  sched: Create a helper function to calculate sched_domain stats for fbg()
  sched: Define structure to store the sched_domain statistics for fbg()
  sched: Create a helper function to calculate sched_group stats for fbg()
  sched: Define structure to store the sched_group statistics for fbg()
  sched: Fix indentations in find_busiest_group() using gotos
  sched: Simple helper functions for find_busiest_group()
  sched: remove unused fields from struct rq
  sched: jiffies not printed per CPU
  sched: small optimisation of can_migrate_task()
  sched: fix typos in documentation
  sched: add avg_overlap decay
  x86, sched_clock(): mark variables read-mostly
  sched: optimize ttwu vs group scheduling
  sched: TIF_NEED_RESCHED -> need_reshed() cleanup
  sched: don't rebalance if attached on NULL domain
  ...
2009-03-26 16:05:01 -07:00
Linus Torvalds
86d9c07017 Merge branch 'for-2.6.30' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.30' of git://git.kernel.dk/linux-2.6-block:
  Get rid of pdflush_operation() in emergency sync and remount
  btrfs: get rid of current_is_pdflush() in btrfs_btree_balance_dirty
  Move the default_backing_dev_info out of readahead.c and into backing-dev.c
  block: Repeated lines in switching-sched.txt
  bsg: Remove bogus check against request_queue->max_sectors
  block: WARN in __blk_put_request() for potential bio leak
  loop: fix circular locking in loop_clr_fd()
  loop: support barrier writes
  bsg: add support for tail queuing
  cpqarray: enable bus mastering
  block: genhd.h cleanup patch
  block: add private bio_set for bio integrity allocations
  block: genhd.h comment needs updating
  block: get rid of unused blkdev_free_rq() define
  block: remove various blk_queue_*() setting functions in blk_init_queue_node()
  cciss: add BUILD_BUG_ON() for catching bad CommandList_struct alignment
  block: don't create bio_vec slabs of less than the inline number
  block: cleanup bio_alloc_bioset()
2009-03-26 16:03:04 -07:00
Yu Zhao
898585172f PCI: save and restore PCIe 2.0 registers
PCIe 2.0 defines several new registers (Device Control 2, Link Control 2,
and Slot Control 2). Save and retore them in pci_save_pcie_state() and
pci_restore_pcie_state().

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-26 16:02:30 -07:00
Linus Torvalds
13220a94d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
  ixgbe: Allow Priority Flow Control settings to survive a device reset
  net: core: remove unneeded include in net/core/utils.c.
  e1000e: update version number
  e1000e: fix close interrupt race
  e1000e: fix loss of multicast packets
  e1000e: commonize tx cleanup routine to match e1000 & igb
  netfilter: fix nf_logger name in ebt_ulog.
  netfilter: fix warning in ebt_ulog init function.
  netfilter: fix warning about invalid const usage
  e1000: fix close race with interrupt
  e1000: cleanup clean_tx_irq routine so that it completely cleans ring
  e1000: fix tx hang detect logic and address dma mapping issues
  bridge: bad error handling when adding invalid ether address
  bonding: select current active slave when enslaving device for mode tlb and alb
  gianfar: reallocate skb when headroom is not enough for fcb
  Bump release date to 25Mar2009 and version to 0.22
  r6040: Fix second PHY address
  qeth: fix wait_event_timeout handling
  qeth: check for completion of a running recovery
  qeth: unregister MAC addresses during recovery.
  ...

Manually fixed up conflicts in:
	drivers/infiniband/hw/cxgb3/cxio_hal.h
	drivers/infiniband/hw/nes/nes_nic.c
2009-03-26 15:54:36 -07:00
Linus Torvalds
d3f12d36f1 Merge branch 'kvm-updates/2.6.30' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.30' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (113 commits)
  KVM: VMX: Don't allow uninhibited access to EFER on i386
  KVM: Correct deassign device ioctl to IOW
  KVM: ppc: e500: Fix the bug that KVM is unstable in SMP
  KVM: ppc: e500: Fix the bug that mas0 update to wrong value when read TLB entry
  KVM: Fix missing smp tlb flush in invlpg
  KVM: Get support IRQ routing entry counts
  KVM: fix sparse warnings: Should it be static?
  KVM: fix sparse warnings: context imbalance
  KVM: is_long_mode() should check for EFER.LMA
  KVM: VMX: Update necessary state when guest enters long mode
  KVM: ia64: Fix the build errors due to lack of macros related to MSI.
  ia64: Move the macro definitions related to MSI to one header file.
  KVM: fix kvm_vm_ioctl_deassign_device
  KVM: define KVM_CAP_DEVICE_DEASSIGNMENT
  KVM: ppc: Add emulation of E500 register mmucsr0
  KVM: Report IRQ injection status for MSI delivered interrupts
  KVM: MMU: Fix another largepage memory leak
  KVM: SVM: set accessed bit for VMCB segment selectors
  KVM: Report IRQ injection status to userspace.
  KVM: MMU: remove assertion in kvm_mmu_alloc_page
  ...
2009-03-26 15:47:52 -07:00
Linus Torvalds
39b566eedb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
  RDMA/cxgb3: Enforce required firmware
  IB/mlx4: Unregister IB device prior to CLOSE PORT command
  mlx4_core: Add link type autosensing
  mlx4_core: Don't perform SET_PORT command for Ethernet ports
  RDMA/nes: Handle MPA Reject message properly
  RDMA/nes: Improve use of PBLs
  RDMA/nes: Remove LLTX
  RDMA/nes: Inform hardware that asynchronous event has been handled
  RDMA/nes: Fix tmp_addr compilation warning
  RDMA/nes: Report correct vendor_id and vendor_part_id
  RDMA/nes: Update copyright to new legal entity and year
  RDMA/nes: Account for freed PBL after HW operation
  IB: Remove useless ibdev_is_alive() tests from sysfs code
  IB/sa_query: Fix AH leak due to update_sm_ah() race
  IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
  IB/mad: initialize mad_agent_priv before putting on lists
  IB/mad: Fix null pointer dereference in local_completions()
  IB/mad: Fix RMPP header RRespTime manipulation
  IB/iser: Remove hard setting of path MTU
  mlx4_core: Add device IDs for MT25458 10GigE devices
  ...
2009-03-26 15:47:08 -07:00
David S. Miller
08abe18af1 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wimax/i2400m/usb-notif.c
2009-03-26 15:23:24 -07:00
Linus Torvalds
0384e29591 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (35 commits)
  [libata] Improve timeout handling
  [libata] Drain data on errors
  pata_sc1200: Activate secondary channel
  pata_artop: Serializing support
  [libata] ahci: correct enclosure LED state save
  [libata] More robust parsing for IDENTIFY DEVICE multi_count field
  sata_mv: fix LED blinking for SoC+NCQ
  sata_mv: optimize IRQ coalescing for 8-port chips
  sata_mv: implement IRQ coalescing (v2)
  sata_mv: cosmetic preparations for IRQ coalescing
  pata-rb532-cf: platform_get_irq() fix ignored failure
  pata_efar: fix *dma_mask
  pata_radisys: fix mwdma_mask to exclude mwdma0
  [libata] convert drivers to use ata.h mode mask defines
  include/linux/ata.h: add some more transfer masks
  ahci: Blacklist HP Compaq 6720s that spins off disks during ACPI power off
  [libata] sata_mv: Implement direct FIS transmission via mv_qc_issue_fis().
  [libata] Export ata_pio_queue_task() so that it can be used from sata_mv.
  [libata] sata_mv: Add a new mv_sff_check_status() function to sata_mv.
  [libata] sata_mv: Tighten up interrupt masking in mv_qc_issue()
  ...
2009-03-26 11:20:23 -07:00
Linus Torvalds
61a091827e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (97 commits)
  USB: qcserial: add device id for HP devices
  USB: isp1760: Add a delay before reading the SKIPMAP registers in isp1760-hcd.c
  USB: allow malformed LANGID descriptors
  USB: pxa27x_udc: typo fixes and code cleanups
  USB: gadget: gadget zero uses new suspend/resume hooks
  USB: gadget: composite device-level suspend/resume hooks
  USB: r8a66597-hcd: suspend/resume support
  USB: more u32 conversion after transfer_buffer_length and actual_length
  USB: Fix cp2101 USB serial device driver termios functions for console use
  USB: CP2101 New Device ID
  USB: ipaq: handle 4 endpoint devices
  USB: S3C: Move usb-control.h to platform include
  USB: ohci-hcd: Add ARCH_S3C24XX to the ohci-s3c2410.c glue
  USB: pedantic: spelling correction in comment for ch9.h
  USB: host: fix sparse warning: Using plain integer as NULL pointer
  USB: ohci-s3c2410: fix name of bus clock
  USB: ohci-s3c2410: remove <mach/hardware.h> include
  USB: serial: rename cp2101 driver to cp210x
  USB: CP2101 Reduce Error Logging
  USB: CP2101 Support AN205 baud rates
  ...
2009-03-26 11:17:39 -07:00
Linus Torvalds
0c93ea4064 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (61 commits)
  Dynamic debug: fix pr_fmt() build error
  Dynamic debug: allow simple quoting of words
  dynamic debug: update docs
  dynamic debug: combine dprintk and dynamic printk
  sysfs: fix some bin_vm_ops errors
  kobject: don't block for each kobject_uevent
  sysfs: only allow one scheduled removal callback per kobj
  Driver core: Fix device_move() vs. dpm list ordering, v2
  Driver core: some cleanup on drivers/base/sys.c
  Driver core: implement uevent suppress in kobject
  vcs: hook sysfs devices into object lifetime instead of "binding"
  driver core: fix passing platform_data
  driver core: move platform_data into platform_device
  sysfs: don't block indefinitely for unmapped files.
  driver core: move knode_bus into private structure
  driver core: move knode_driver into private structure
  driver core: move klist_children into private structure
  driver core: create a private portion of struct device
  driver core: remove polling for driver_probe_done(v5)
  sysfs: reference sysfs_dirent from sysfs inodes
  ...

Fixed conflicts in drivers/sh/maple/maple.c manually
2009-03-26 11:17:04 -07:00
Linus Torvalds
bc2fd381d8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (53 commits)
  ide: use try_to_identify() in ide_driveid_update()
  ide: clear drive IRQ after re-enabling local IRQs in ide_driveid_update()
  ide: sanitize SELECT_MASK() usage in ide_driveid_update()
  ide: classify device type in do_probe()
  ide: remove broken EXABYTENEST support
  ide: shorten timeout value in ide_driveid_update()
  ide: propagate AltStatus workarounds to ide_driveid_update()
  ide: fix kmalloc() failure handling in ide_driveid_update()
  mn10300: remove <asm/ide.h>
  frv: remove <asm/ide.h>
  ide: remove pciirq argument from ide_pci_setup_ports()
  ide: fix ->init_chipset method to return 'int' value
  ide: remove try_to_identify() wrapper
  ide: remove no longer needed IRQ auto-probing from try_to_identify() (v2)
  ide: remove no longer needed IRQ fallback code from hwif_init()
  amd74xx: remove no longer needed ->init_hwif method
  ide: remove no longer needed IDE_HFLAG[_FORCE]_LEGACY_IRQS
  ide: use ide_pci_is_in_compatibility_mode() in ide_pci_init_{one,two}()
  ide: use pci_get_legacy_ide_irq() in ide_pci_init_{one,two}()
  ide: handle IDE_HFLAG[_FORCE]_LEGACY_IRQS in ide_pci_init_{one,two}()
  ...
2009-03-26 11:13:06 -07:00
Linus Torvalds
928a726b0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (96 commits)
  sh: add support for SMSC Polaris platform
  sh: fix the HD64461 level-triggered interrupts handling
  sh: sh-rtc wakeup support
  sh: sh-rtc invalid time rework
  sh: sh-rtc carry interrupt rework
  sh: disallow kexec virtual entry
  sh: kexec jump: fix for ftrace.
  sh: kexec: Drop SR.BL bit toggling.
  sh: add kexec jump support
  sh: rework kexec segment code
  sh: simplify kexec vbr code
  sh: Flush only the needed range when unmapping a VMA.
  sh: Update debugfs ASID dumping for 16-bit ASID support.
  sh: tlb-pteaex: Kill off legacy PTEA updates.
  sh: Support for extended ASIDs on PTEAEX-capable SH-X3 cores.
  sh: sh7763rdp: Change IRQ number for sh_eth of sh7763rdp
  sh: espt-giga board support
  sh: dma: Make G2 DMA configurable.
  sh: dma: Make PVR2 DMA configurable.
  sh: Move IRQ multi definition of DMAC to defconfig
  ...
2009-03-26 11:11:23 -07:00
Linus Torvalds
8ff64b539b Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
  GFS2: Fix freeze issue
  Fix a minor bug in the previous patch
  GFS2: Clean up of glops.c
  GFS2: Fix locking bug in failed shared to exclusive conversion
  GFS2: Pagecache usage optimization on GFS2
  GFS2: fix sparse warning: Should it be static?
  GFS2: fix sparse warnings: constant is so big it is ...
  GFS2: Support quota/noquota mount arguments
  GFS2: Fix alignment issue and tidy gfs2_bitfit
  GFS2: Add a "demote a glock" interface to sysfs
  GFS2: Expose UUID via sysfs/uevent
  GFS2: Support generation of discard requests
  GFS2: Fix deadlock on journal flush
  GFS2: Fix error path ref counting for root inode
  GFS2: Remove unused field from glock
  GFS2: Merge lock_dlm module into GFS2
  GFS2: Remove "double" locking in quota
  GFS2: change gfs2_quota_scan into a shrinker
  GFS2: Bring back lvb-related stuff to lock_nolock to support quotas
  GFS2: Fix remount argument parsing
2009-03-26 11:08:47 -07:00
Linus Torvalds
502012534d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (430 commits)
  ALSA: hda - Add quirk for Acer Ferrari 5000
  ALSA: hda - Use cached calls to get widget caps and pin caps
  ALSA: hda - Don't create empty/single-item input source
  ALSA: hda - Fix the wrong pin-cap check in patch_realtek.c
  ALSA: hda - Cache pin-cap values
  ALSA: hda - Avoid output amp manipulation to digital mic pins
  ALSA: hda - Add function id to proc output
  ALSA: pcm - Safer boundary checks
  ALSA: hda - Detect digital-mic inputs on ALC663 / ALC272
  ALSA: sound/ali5451: typo: s/resouces/resources/
  ALSA: hda - Don't show the current connection for power widgets
  ALSA: Fix wrong pointer to dev_err() in arm/pxa2xx-ac97-lib.c
  ASoC: Declare Headset as Mic and Headphone widgets for SDP3430
  ASoC: OMAP: N810: Add more jack functions
  ASoC: OMAP: N810: Mark not connected input pins
  ASoC: Add FLL support for WM8400
  ALSA: hda - Don't reset stream at each prepare callback
  ALSA: hda - Don't reset BDL unnecessarily
  ALSA: pcm - Fix delta calculation at boundary overlap
  ALSA: pcm - Reset invalid position even without debug option
  ...
2009-03-26 11:05:17 -07:00
Linus Torvalds
562f477a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits)
  crypto: sha512-s390 - Add missing block size
  hwrng: timeriomem - Breaks an allyesconfig build on s390:
  nlattr: Fix build error with NET off
  crypto: testmgr - add zlib test
  crypto: zlib - New zlib crypto module, using pcomp
  crypto: testmgr - Add support for the pcomp interface
  crypto: compress - Add pcomp interface
  netlink: Move netlink attribute parsing support to lib
  crypto: Fix dead links
  hwrng: timeriomem - New driver
  crypto: chainiv - Use kcrypto_wq instead of keventd_wq
  crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq
  crypto: api - Use dedicated workqueue for crypto subsystem
  crypto: testmgr - Test skciphers with no IVs
  crypto: aead - Avoid infinite loop when nivaead fails selftest
  crypto: skcipher - Avoid infinite loop when cipher fails selftest
  crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention
  crypto: api - crypto_alg_mod_lookup either tested or untested
  crypto: amcc - Add crypt4xx driver
  crypto: ansi_cprng - Add maintainer
  ...
2009-03-26 11:04:34 -07:00
Linus Torvalds
8d80ce80e1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (71 commits)
  SELinux: inode_doinit_with_dentry drop no dentry printk
  SELinux: new permission between tty audit and audit socket
  SELinux: open perm for sock files
  smack: fixes for unlabeled host support
  keys: make procfiles per-user-namespace
  keys: skip keys from another user namespace
  keys: consider user namespace in key_permission
  keys: distinguish per-uid keys in different namespaces
  integrity: ima iint radix_tree_lookup locking fix
  TOMOYO: Do not call tomoyo_realpath_init unless registered.
  integrity: ima scatterlist bug fix
  smack: fix lots of kernel-doc notation
  TOMOYO: Don't create securityfs entries unless registered.
  TOMOYO: Fix exception policy read failure.
  SELinux: convert the avc cache hash list to an hlist
  SELinux: code readability with avc_cache
  SELinux: remove unused av.decided field
  SELinux: more careful use of avd in avc_has_perm_noaudit
  SELinux: remove the unused ae.used
  SELinux: check seqno when updating an avc_node
  ...
2009-03-26 11:03:39 -07:00
Matthew Garrett
d0adde574b Add a strictatime mount option
Add support for explicitly requesting full atime updates. This makes it
possible for kernels to default to relatime but still allow userspace to
override it.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-26 10:56:35 -07:00
Ingo Molnar
5a54bd1307 Merge commit 'v2.6.29' into core/header-fixes 2009-03-26 18:29:40 +01:00
H. Peter Anvin
8cd2c29dd5 compiler-gcc4: conditionalize #error on __KERNEL__
Impact: Fix for exported headers

We only want to error out on specific gcc versions if we are actually
building the kernel, so conditionalize the #if...#error on __KERNEL__.

Based on a patchset by Arnd Bergmann <arnd@arndb.de>.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:23 +01:00
Arnd Bergmann
3a471cbc08 remove __KERNEL_STRICT_NAMES
With the last used of non-strict names gone from the
exported header files, we can remove the old libc5
compatibility cruft from our headers and only export
strict types.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:21 +01:00
Arnd Bergmann
60c195c729 make netfilter use strict integer types
Netfilter traditionally uses BSD integer types in its
interface headers. This changes it to use the Linux
strict integer types, like everyone else.

Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:20 +01:00
Arnd Bergmann
ccef7ab534 make MTD headers use strict integer types
The MTD headers traditionally use stdint types rather than
the kernel integer types. This converts them to do the
same as all the others.

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:17 +01:00
Arnd Bergmann
9adfbfb611 make most exported headers use strict integer types
This takes care of all files that have only a small number
of non-strict integer type uses.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Cc: linux-ppp@vger.kernel.org
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:15 +01:00
Arnd Bergmann
85efde6f4e make exported headers use strict posix types
A number of standard posix types are used in exported headers, which
is not allowed if __STRICT_KERNEL_NAMES is defined. In order to
get rid of the non-__STRICT_KERNEL_NAMES part and to make sane headers
the default, we have to change them all to safe types.

There are also still some leftovers in reiserfs_fs.h, elfcore.h
and coda.h, but these files have not compiled in user space for
a long time.

This leaves out the various integer types ({u_,u,}int{8,16,32,64}_t),
which we take care of separately.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Cc: linux-ppp@vger.kernel.org
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:14 +01:00
Jaswinder Singh Rajput
9d50638bae unconditionally include asm/types.h from linux/types.h
Reported-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:12 +01:00
Atsushi Nemoto
0f571515c3 dmaengine: Add privatecnt to revert DMA_PRIVATE property
Currently dma_request_channel() set DMA_PRIVATE capability but never
clear it.  So if a public channel was once grabbed by
dma_request_channel(), the device stay PRIVATE forever.  Add
privatecnt member to dma_device to correctly revert it.

[lg@denx.de: fix bad usage of 'chan' in dma_async_device_register]
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-26 09:48:09 -07:00
Ingo Molnar
7c526e1fef Merge branches 'timers/new-apis', 'timers/ntp' and 'timers/urgent' into timers/core 2009-03-26 15:45:52 +01:00
Theodore Ts'o
7058548cd5 ext4: Use WRITE_SYNC for commits which are caused by fsync()
If a commit is triggered by fsync(), set a flag indicating the journal
blocks associated with the transaction should be flushed out using
WRITE_SYNC.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-03-25 23:35:46 -04:00
Jan Kara
bf84c82d00 quota: Remove uppercase aliases for quota functions.
Since all users have been converted, remove uppercase names of quota functions.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-03-26 02:18:37 +01:00
Jan Kara
dd6f3c6d5a quota: Remove NODQUOT macro
Remove this macro which is just a definition of NULL. Fix a few coding style
issues along the way.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-03-26 02:18:35 +01:00
Mingming Cao
9900ba3487 quota: Use inode->i_blkbits to get block bits
Andrew has suggested to use inode->i_blkbits to get the block bits info,
rather than use super block's blockbits. That should be faster and emit
less code.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-03-26 02:18:34 +01:00
Mingming Cao
740d9dcd94 quota: Add quota reservation claim and released operations
Reserved quota will be claimed at the block allocation time. Over-booked
quota could be returned back with the release callback function.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-03-26 02:18:24 +01:00
Mingming Cao
f18df22899 quota: Add quota reservation support
Delayed allocation defers the block allocation at the dirty pages
flush-out time, doing quota charge/check at that time is too late.
But we can't charge the quota blocks until blocks are really allocated,
otherwise users could get overcharged after reboot from system crash.

This patch adds quota reservation for delayed allocation. Quota blocks
are reserved in memory, inode and quota won't gets dirtied until later
block allocation time.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-03-26 02:15:50 +01:00
Oliver Neukum
0361a28d3f HID: autosuspend support for USB HID
This uses the USB busy mechanism for aggessive autosuspend of USB
HID devices. It autosuspends all opened devices supporting remote wakeup
after a timeout unless

- output is being done to the device
- a key is being held down (remote wakeup isn't triggered upon key release)
- LED(s) are lit
- hiddev is opened

As in the current driver closed devices will be autosuspended even if they
don't support remote wakeup.

The patch is quite large because output to devices is done in hard interrupt
context meaning a lot a queuing and locking had to be touched. The LED stuff
has been solved by means of a simple counter. Additions to the generic HID code
could be avoided. In addition it now covers hidraw. It contains an embryonic
version of an API to let the generic HID code tell the lower levels which
capabilities with respect to power management are needed.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-25 17:57:57 +01:00
Thomas Gleixner
de18836e44 genirq: fix devres.o build for GENERIC_HARDIRQS=n
kernel/irq/devres.c is built by sparc (32bit) and m68k via the obscure
../../../kernel/irq/devres.o reference in arch/[sparc/m68k]/kernel/Makefile

To avoid ifdeffery in devres.c provide request_threaded_irq as an
inline for these users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-25 17:33:38 +01:00
Eric Dumazet
b8dfe49877 netfilter: factorize ifname_compare()
We use same not trivial helper function in four places. We can factorize it.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-25 17:31:52 +01:00
Dan Williams
729b5d1b8e dmaengine: allow dma support for async_tx to be toggled
Provide a config option for blocking the allocation of dma channels to
the async_tx api.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-25 09:13:25 -07:00
Dan Williams
06164f3194 async_tx: provide __async_inline for HAS_DMA=n archs
To allow an async_tx routine to be compiled away on HAS_DMA=n arch it
needs to be declared __always_inline otherwise the compiler may emit
code and cause a link error.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-25 09:13:25 -07:00
Dan Williams
54aee6a5f5 dmaengine: kill some unused headers
The dmaengine redux left some unneeded headers in
include/linux/dmaengine.h, clean them up.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-03-25 09:13:24 -07:00
Ingo Molnar
b6d9842258 Merge branch 'sched/cleanups'; commit 'v2.6.29' into sched/core 2009-03-25 10:26:51 +01:00
Alan Cox
c96f1732e2 [libata] Improve timeout handling
On a timeout call a device specific handler early in the recovery so that
we can complete and process successful commands which timed out due to IRQ
loss or the like rather more elegantly.

[Revised to exclude the timeout handling on a few devices that inherit from
 SFF but are not SFF enough to use the default timeout handler]

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-24 22:52:39 -04:00
Alan Cox
3d47aa8e7e [libata] Drain data on errors
If the device is signalling that there is data to drain after an error we
should read the bytes out and throw them away. Without this some devices
and controllers get wedged and don't recover.

Based on earlier work by Mark Lord

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-24 22:48:26 -04:00
Erik Inge Bolsø
22ddbd1e03 include/linux/ata.h: add some more transfer masks
Signed-off-by: Erik Inge Bolsø <knan-lkml@anduin.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-24 22:12:00 -04:00
Mark Lord
1a660164c2 [libata] Export ata_pio_queue_task() so that it can be used from sata_mv.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-24 22:02:41 -04:00
Greg Banks
e6e66b02e1 Dynamic debug: fix pr_fmt() build error
When CONFIG_DYNAMIC_DEBUG is enabled, allow callers of pr_debug()
to provide their own definition of pr_fmt() even if that definition
uses tricks like

#define pr_fmt(fmt) "%s:" fmt, __func__

Signed-off-by: Greg Banks <gnb@sgi.com>
Cc: Jason Baron <jbaron@redhat.com>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:27 -07:00
Jason Baron
e9d376f0fa dynamic debug: combine dprintk and dynamic printk
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.

The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.

for example, enabled all debug messages in module 'nf_conntrack':

echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control

to disable them:

echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control

A further explanation can be found in the documentation patch.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:26 -07:00
Cornelia Huck
ffa6a7054d Driver core: Fix device_move() vs. dpm list ordering, v2
dpm_list currently relies on the fact that child devices will
be registered after their parents to get a correct suspend
order. Using device_move() however destroys this assumption, as
an already registered device may be moved under a newly registered
one.

This patch adds a new argument to device_move(), allowing callers
to specify how dpm_list should be adapted.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:26 -07:00
Ming Lei
f67f129e51 Driver core: implement uevent suppress in kobject
This patch implements uevent suppress in kobject and removes it
from struct device, based on the following ideas:

1,Uevent sending should be one attribute of kobject, so suppressing it
in kobject layer is more natural than in device layer. By this way,
we can do it for other objects embedded with kobject.

2,It may save several bytes for each instance of struct device.(On my
omap3(32bit ARM) based box, can save 8bytes per device object)

This patch also introduces dev_set|get_uevent_suppress() helpers to
set and query uevent_suppress attribute in case to help kobject
as private part of struct device in future.

[This version is against the latest driver-core patch set of Greg,please
ignore the last version.]

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:26 -07:00
Kay Sievers
4995f8ef9d vcs: hook sysfs devices into object lifetime instead of "binding"
During bootup performance tracing I noticed many occurrences of
vca* device creation and removal, leading to the usual userspace
uevent processing, which are, in this case, rather pointless.

A simple test showing the kernel timing (not including all the
work userspace has to do), gives us these numbers:
  $ time for i in `seq 1000`; do echo a > /dev/tty2; done
  real    0m1.142s
  user    0m0.015s
  sys     0m0.540s

If we move the hook for the vcs* driver core devices from the
tty "binding" to the vc allocation/deallocation, which is what
the vcs* devices represent, we get the following numbers:
  $ time for i in `seq 1000`; do echo a > /dev/tty2; done
  real    0m0.152s
  user    0m0.030s
  sys     0m0.072s

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:26 -07:00
Ming Lei
006f4571a1 driver core: move platform_data into platform_device
This patch moves platform_data from struct device into
struct platform_device, based on the two ideas:

1. Now all platform_driver is registered by platform_driver_register,
   which makes probe()/release()/... of platform_driver passed parameter
   of platform_device *, so platform driver can get platform_data from
   platform_device;

2. Other kind of devices do not need to use platform_data, we can
   decrease size of device if moving it to platform_device.

Taking into consideration of thousands of files to be fixed and they
can't be finished in one night(maybe it will take a long time), so we
keep platform_data in device to allow two kind of cases coexist until
all platform devices pass its platfrom data from
platform_device->platform_data.

All patches to do this kind of conversion are welcome.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:26 -07:00
Greg Kroah-Hartman
ae1b41715e driver core: move knode_bus into private structure
Nothing outside of the driver core should ever touch knode_bus, so
move it out of the public eye.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:25 -07:00
Greg Kroah-Hartman
8940b4f312 driver core: move knode_driver into private structure
Nothing outside of the driver core should ever touch knode_driver, so
move it out of the public eye.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:25 -07:00
Greg Kroah-Hartman
f791b8c836 driver core: move klist_children into private structure
Nothing outside of the driver core should ever touch klist_children, or
knode_parent, so move them out of the public eye.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:25 -07:00
Greg Kroah-Hartman
fb069a5d13 driver core: create a private portion of struct device
This is to be used to move things out of struct device that no code
outside of the driver core should ever touch.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:25 -07:00
Ming Lei
b23530ebc3 driver core: remove polling for driver_probe_done(v5)
This patch removes 100ms polling for driver_probe_done in
wait_for_device_probe(), and uses wait_event() instead.
Removing polling in fs initialization may lead to
a faster boot.

This patch also changes the return type of wait_for_device_done()
from int to void.

This patch is against Arjan's patch in linux-next tree.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:25 -07:00
Hans J. Koch
8205779114 UIO: Add name attributes for mappings and port regions
If a UIO device has several memory mappings, it can be difficult for userspace
to find the right one. The situation becomes even worse if the UIO driver can
handle different versions of a card that have different numbers of mappings.
Benedikt Spranger has such cards and pointed this out to me. Thanks, Bene!

To address this problem, this patch adds "name" sysfs attributes for each
mapping. Userspace can use these to clearly identify each mapping. The name
string is optional. If a driver doesn't set it, an empty string will be
returned, so this patch won't break existing drivers.

The same problem exists for port region information, so a "name" attribute is
added there, too.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:24 -07:00
Eric Miao
57fee4a58f platform: introduce module id table for platform devices
Now platform_device is being widely used on SoC processors where the
peripherals are attached to the system bus, which is simple enough.

However, silicon IPs for these SoCs are usually shared heavily across
a family of processors, even products from different companies.  This
makes the original simple driver name based matching insufficient, or
simply not straight-forward.

Introduce a module id table for platform devices, and makes it clear
that a platform driver is able to support some shared IP and handle
slight differences across different platforms (by 'driver_data').
Module alias is handled automatically when a MODULE_DEVICE_TABLE()
is defined.

To not disturb the current platform drivers too much, the matched id
entry is recorded and can be retrieved by platform_get_device_id().

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:24 -07:00
Kay Sievers
1fa5ae857b driver core: get rid of struct device's bus_id string array
Now that all users of bus_id is gone, we can remove it from struct
device.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:23 -07:00
Kay Sievers
2c0f3e96f3 wimax: struct device - replace bus_id with dev_name(), dev_set_name()
Cc: inaky.perez-gonzalez@intel.com
Cc: linux-wimax@intel.com
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
2009-03-24 16:38:23 -07:00
Pablo Neira Ayuso
38938bfe34 netlink: add NETLINK_NO_ENOBUFS socket flag
This patch adds the NETLINK_NO_ENOBUFS socket flag. This flag can
be used by unicast and broadcast listeners to avoid receiving
ENOBUFS errors.

Generally speaking, ENOBUFS errors are useful to notify two things
to the listener:

a) You may increase the receiver buffer size via setsockopt().
b) You have lost messages, you may be out of sync.

In some cases, ignoring ENOBUFS errors can be useful. For example:

a) nfnetlink_queue: this subsystem does not have any sort of resync
method and you can decide to ignore ENOBUFS once you have set a
given buffer size.

b) ctnetlink: you can use this together with the socket flag
NETLINK_BROADCAST_SEND_ERROR to stop getting ENOBUFS errors as
you do not need to resync (packets whose event are not delivered
are drop to provide reliable logging and state-synchronization).

Moreover, the use of NETLINK_NO_ENOBUFS also reduces a "go up, go down"
effect in terms of performance which is due to the netlink congestion
control when the listener cannot back off. The effect is the following:

1) throughput rate goes up and netlink messages are inserted in the
receiver buffer.
2) Then, netlink buffer fills and overruns (set on nlk->state bit 0).
3) While the listener empties the receiver buffer, netlink keeps
dropping messages. Thus, throughput goes dramatically down.
4) Then, once the listener has emptied the buffer (nlk->state
bit 0 is set off), goto step 1.

This effect is easy to trigger with netlink broadcast under heavy
load, and it is more noticeable when using a big receiver buffer.
You can find some results in [1] that show this problem.

[1] http://1984.lsi.us.es/linux/netlink/

This patch also includes the use of sk_drop to account the number of
netlink messages drop due to overrun. This value is shown in
/proc/net/netlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-24 16:37:55 -07:00
David Brownell
8942939a6c USB: gadget: composite device-level suspend/resume hooks
Address one open question in the composite gadget framework:
Yes, we should have device-level suspend/resume callbacks
in addition to the function-level ones.  We have at least one
scenario (with gadget zero in OTG test mode) that's awkward
to handle without it.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Felipe Balbi <felipe.balbi@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:45 -07:00
D.J. Capelis
64a3a25f44 USB: pedantic: spelling correction in comment for ch9.h
Just noticed this during a grep, figured I might as well send it in.

From: D.J. Capelis <dev@capelis.dj>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:44 -07:00
Greg Kroah-Hartman
8c209e6782 USB: make actual_length in struct urb field u32
actual_length should also be a u32 and not a signed value.  This patch
changes this field to be 'u32' to prevent any potential negative
conversion and comparison errors.

This triggered a few compiler warning messages when these fields were
being used with the min macro, so they have also been fixed up in this
patch.

Cc: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:36 -07:00
Greg Kroah-Hartman
16e2e5f634 USB: make transfer_buffer_lengths in struct urb field u32
Roel Kluin pointed out that transfer_buffer_lengths in struct urb was
declared as an 'int'.  This patch changes this field to be 'u32' to
prevent any potential negative conversion and comparison errors.

This triggered a few compiler warning messages when these fields were
being used with the min macro, so they have also been fixed up in this
patch.

Cc: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:36 -07:00
David Vrabel
6da9c99059 USB: allow libusb to talk to unauthenticated WUSB devices
To permit a userspace application to associate with WUSB devices
using numeric association, control transfers to unauthenticated WUSB
devices must be allowed.

This requires that wusbcore correctly sets the device state to
UNAUTHENTICATED, DEFAULT and ADDRESS and that control transfers can be
performed to UNAUTHENTICATED devices.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:35 -07:00
Alan Stern
e6e244b6cb usb-storage: prepare for subdriver separation
This patch (as1206) is the first step in converting usb-storage's
subdrivers into separate modules.  It makes the following large-scale
changes:

	Remove a bunch of unnecessary #ifdef's from usb_usual.h.
	Not truly necessary, but it does clean things up.

	Move the USB device-ID table (which is duplicated between
	libusual and usb-storage) into its own source file,
	usual-tables.c, and arrange for this to be linked with
	either libusual or usb-storage according to whether
	USB_LIBUSUAL is configured.

	Add to usual-tables.c a new usb_usual_ignore_device()
	function to detect whether a particular device needs to be
	managed by a subdriver and not by the standard handlers
	in usb-storage.

	Export a whole bunch of functions in usb-storage, renaming
	some of them because their names don't already begin with
	"usb_stor_".  These functions will be needed by the new
	subdriver modules.

	Split usb-storage's probe routine into two functions.
	The subdrivers will call the probe1 routine, then fill in
	their transport and protocol settings, and then call the
	probe2 routine.

	Take the default cases and error checking out of
	get_transport() and get_protocol(), which run during
	probe1, and instead put a check for invalid transport
	or protocol values into the probe2 function.

	Add a new probe routine to be used for standard devices,
	i.e., those that don't need a subdriver.  This new routine
	checks whether the device should be ignored (because it
	should be handled by ub or by a subdriver), and if not,
	calls the probe1 and probe2 functions.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:34 -07:00
Ajay Kumar Gupta
f6d92a05c8 USB: otg: adding nop usb transceiver
NOP transceiver is used by all the usb transceiver which are mostly
autonomous and doesn't require any programming or which are built
into the usb ip itself.NOP transceiver only allocates the memory
for struct xceiv and calls otg_set_transceiver() so function call
to otg_get_transceiver() will return a valid transceiver.

NOP transceiver device should be registered by calling
usb_nop_xceiv_register() from platform files.

Signed-off-by: Ajay Kumar Gupta <ajay.gupta@ti.com>
Cc: Felipe Balbi <felipe.balbi@nokia.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:30 -07:00
Oliver Neukum
f8bece8d91 USB: serial: introduce a flag into the usb serial layer to tell drivers that their URBs are killed due to suspension
This patch introduces a flag into the usb serial layer to tell drivers
that their URBs are killed due to suspension. That is necessary to let
drivers know whether they should report an error back.

Signed-off-by: Oliver Neukum <oneukum@suse.de>

Hi Greg,

this is for 2.6.30. Patches to use this in drivers are under development.

	Regards
		Oliver
2009-03-24 16:20:29 -07:00
Julia Lawall
4d6914b729 USB: Move definitions from usb.h to usb/ch9.h
The functions:

usb_endpoint_dir_in(epd)
usb_endpoint_dir_out(epd)
usb_endpoint_is_bulk_in(epd)
usb_endpoint_is_bulk_out(epd)
usb_endpoint_is_int_in(epd)
usb_endpoint_is_int_out(epd)
usb_endpoint_is_isoc_in(epd)
usb_endpoint_is_isoc_out(epd)
usb_endpoint_num(epd)
usb_endpoint_type(epd)
usb_endpoint_xfer_bulk(epd)
usb_endpoint_xfer_control(epd)
usb_endpoint_xfer_int(epd)
usb_endpoint_xfer_isoc(epd)

are moved from include/linux/usb.h to include/linux/usb/ch9.h.
include/linux/usb/ch9.h makes more sense for these functions because they
only depend on constants that are defined in this file.

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:27 -07:00
Robert Jarzmik
c2344f13b5 USB: gpio_vbus: add delayed vbus_session calls
Call usb_gadget_vbus_connect() and ...disconnect() from a
workqueue rather than from an irq handler, allowing msleep()
calls in vbus_session.  Update kerneldoc to match.

[ dbrownell@users.sourceforge.net: more kerneldoc updates ]

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:26 -07:00
Alan Stern
1662e3a7f0 USB: add quirk to avoid config and interface strings
Apparently the Configuration and Interface strings aren't used as
often as the Vendor, Product, and Serial strings.  In at least one
device (a Saitek Cyborg Gold 3D joystick), attempts to read the
Configuration string cause the device to stop responding to Control
requests.

This patch (as1226) adds a quirks flag, telling the kernel not to
read a device's Configuration or Interface strings, together with a
new quirk for the offending joystick.

Reported-by: Melchior FRANZ <melchior.franz@gmail.com>
Tested-by: Melchior FRANZ <melchior.franz@gmail.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>  [2.6.28 and 2.6.29, nothing earlier]
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:25 -07:00
Bartlomiej Zolnierkiewicz
2ebe1d9efe ide: use try_to_identify() in ide_driveid_update()
* Pass pointer to buffer for IDENTIFY data to do_identify()
  and try_to_identify().

* Un-static try_to_identify() and use it in ide_driveid_update().

* Rename try_to_identify() to ide_dev_read_id().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:59 +01:00
Bartlomiej Zolnierkiewicz
552d3a99bd ide: remove broken EXABYTENEST support
do_identify() marks EXABYTENEST device as non-present and frees
drive->id so enable_nest() has absolutely no chance of working.

The code was like this since at least 2.6.12-rc2 and nobody
has noticed so just remove broken EXABYTENEST support.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:58 +01:00
Bartlomiej Zolnierkiewicz
d45b70ab9b mn10300: remove <asm/ide.h>
* Remove superfluous <asm/intctl-regs.h> include.

* Remove no longer used SUPPORT_SLOW_DATA_PORTS define.

* Move defining SUPPORT_VLB_SYNC to <linux/ide.h>.

* Use __ide_mm_*() macros from <asm-generic/ide_iops.h>
  (MN10300 uses only memory-mapped I/O).

* Remove <asm/ide.h>.

While at it:

* Remove superfluous SPARC64 #ifdef from <linux/ide.h>.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:54 +01:00
Bartlomiej Zolnierkiewicz
662641d98b frv: remove <asm/ide.h>
* Remove superfluous <asm/{setup,io,irq}.h> includes.

* Remove <asm/ide.h>.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:54 +01:00
Bartlomiej Zolnierkiewicz
86ccf37c6a ide: remove pciirq argument from ide_pci_setup_ports()
* Set ->irq explicitly in cs5520.c.

* Remove irq argument from ide_hw_configure().

* Remove pciirq argument from ide_pci_setup_ports().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:53 +01:00
Bartlomiej Zolnierkiewicz
2ed0ef543a ide: fix ->init_chipset method to return 'int' value
* Return 0 instead of dev->irq in ->init_chipset implementations.

* Fix ->init_chipset method to return 'int' value instead of
  'unsigned int' one.

This fixes ->init_chipset handling for host drivers (cs5530, hpt366
and pdc202xx_new) for which it is possible for this method to fail.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:53 +01:00
Bartlomiej Zolnierkiewicz
8b07ed26f8 ide: remove no longer needed IRQ fallback code from hwif_init()
Then remove no longer used __ide_default_irq().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:52 +01:00
Bartlomiej Zolnierkiewicz
2467922a56 ide: remove no longer needed IDE_HFLAG[_FORCE]_LEGACY_IRQS
There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:52 +01:00
Bartlomiej Zolnierkiewicz
327fa1c294 ide: move error handling code to ide-eh.c (v2)
Do some CodingStyle fixups in <linux/ide.h> while at it.

v2:
Add missing <linux/delay.h> include (reported by Stephen Rothwell).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:47 +01:00
Bartlomiej Zolnierkiewicz
7eeaaaa522 ide: move xfer mode tuning code to ide-xfer-mode.c
* Move xfer mode tuning code to ide-xfer-mode.c.

* Add CONFIG_IDE_XFER_MODE config option to be selected by host drivers
  that support xfer mode tuning.

* Add CONFIG_IDE_XFER_MODE=n static inline versions of ide_set_pio()
  and ide_set_xfer_rate().

* Make IDE_TIMINGS and BLK_DEV_IDEDMA config options select IDE_XFER_MODE,
  also add explicit selects for few host drivers that need it.

* Build/link ide-xfer-mode.o and ide-pio-blacklist.o (it is needed only
  by ide-xfer-mode.o) only if CONFIG_IDE_XFER_MODE=y.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:46 +01:00
Bartlomiej Zolnierkiewicz
11938c9290 ide: move device settings code to ide-devsets.c
Remove stale comment from ide.c while at it.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:44 +01:00
Bartlomiej Zolnierkiewicz
c4e66c36cc ide: move ide_do_park_unpark() to ide-park.c
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:44 +01:00
Bartlomiej Zolnierkiewicz
1866082339 ide: remove ide_do_drive_cmd()
* Use elv_add_request() instead of __elv_add_request() in ide_do_drive_cmd().

* ide_do_drive_cmd() is used only in ide-{atapi,cd}.c so inline it there.

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:44 +01:00
Bartlomiej Zolnierkiewicz
65ca537732 ide: move ide_dma_timeout_retry() to ide-dma.c
Move ide_dma_timeout_retry() to ide-dma.c and add static inline
version for CONFIG_BLK_DEV_IDEDMA=n.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:43 +01:00
Bartlomiej Zolnierkiewicz
b6a45a0b1e ide: move drive_is_ready() to ide-io.c
Move drive_is_ready() to ide-io.c, then make it static.

Also make some minor CodingStyle fixups while at it.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:43 +01:00
Bartlomiej Zolnierkiewicz
8b803bd184 ide: sanitize ACPI initialization
* ide_acpi_init() -> ide_acpi_init_port()

* ide_acpi_blacklist() -> ide_acpi_init()

* Call ide_acpi_init() only once (do it during IDE core
  initialization) and cleanup the function accordingly.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:41 +01:00
Bartlomiej Zolnierkiewicz
7ed5b157d9 ide: add ide_for_each_present_dev() iterator
* Add ide_for_each_present_dev() iterator and convert IDE code to use it.

* Do some drive-by CodingStyle fixups in ide-acpi.c while at it.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:41 +01:00
Bartlomiej Zolnierkiewicz
7a254df007 ide: move ide_pktcmd_tf_load() to ide-atapi.c
Then make it static and remove 'dma' argument.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-24 23:22:39 +01:00
David S. Miller
b5bb14386e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-03-24 13:24:36 -07:00
Stefan Richter
18e9b10fcd firewire: cdev: add closure to async stream ioctl
This changes the as yet unreleased FW_CDEV_IOC_SEND_STREAM_PACKET ioctl
to generate an fw_cdev_event_response event just like the other two
ioctls for asynchronous request transmission do.  This way, clients get
feedback on successful or unsuccessful transmission.

This also adds input validation for length, tag, channel, sy, speed.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:50 +01:00
Stefan Richter
de487da8ca firewire: cdev: secure add_descriptor ioctl
The access permissions and ownership or ACL of /dev/fw* character device
files will typically be set based on the device type of the respective
nodes, as obtained by firewire-core from descriptors in the device's
configuration ROM.  An example policy is to deny write permission by
default but grant write permission to files of AV/C video and audio
devices and IIDC video devices.

The FW_CDEV_IOC_ADD_DESCRIPTOR ioctl could be used to partly subvert
such a policy:  Find a device file with relaxed permissions, use the
ioctl to add a descriptor with AV/C marker to the local node's ROM, thus
gain access to the local node's character device file.  (This is only
possible if there are udev scripts installed which actively relax
permissions for known device types and if there is a device of such a
type connected.)

Accessibility of the local node's device file is relevant to host
security if the host contains two or more IEEE 1394 link layer
controllers which are plugged into a single bus.

Therefore change the ABI to deny FW_CDEV_IOC_ADD_DESCRIPTOR if the file
belongs to a remote node.  (This change has no impact on known
implementers of the ABI:  None of them uses the ioctl yet.)

Also clarify the documentation:  The ioctl affects all local nodes, not
just one local node.

Cc: stable@kernel.org
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:50 +01:00
Stefan Richter
c8a25900f3 firewire: cdev: amendment to "add ioctl to query maximum transmission speed"
The as yet unreleased FW_CDEV_IOC_GET_SPEED ioctl puts only a single
integer into the parameter buffer.  We can use ioctl()'s return value
instead.

(Also: Some whitespace change in firewire-cdev.h.)

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:49 +01:00
Jay Fenlason
f8c2287c65 firewire: implement asynchronous stream transmission
Allow userspace and other firewire drivers (fw-ipv4 I'm looking at
you!) to send Asynchronous Transmit Streams as described in 7.8.3 of
release 1.1 of the 1394 Open Host Controller Interface Specification.

Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (tweaks)
2009-03-24 20:56:49 +01:00
Stefan Richter
5d9cb7d276 firewire: cdev: add ioctls for iso resource management, amendment
Some fixes:
  - Remove stale documentation.
  - Fix a != vs. == thinko that got in the way of channel management.
  - Try bandwidth deallocation even if channel deallocation failed.

A simplification:
  - fw_cdev_allocate_iso_resource.channels is now ordered like
    libdc1394's dc1394_iso_allocate_channel() channels_allowed
    argument.

By the way, I looked closer at cards from NEC, TI, and VIA, and noticed
that they all don't implement IEEE 1394a behaviour which is meant to
deviate from IEEE 1212's notion of lock compare-swap.  This means that
we have to do two lock transactions instead of one in many cases where
one transaction would already succeed on a fully 1394a compliant IRM.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:46 +01:00
Stefan Richter
77258da403 firewire: cdev: increment fw_cdev_version, update documentation
Necessary due to
    Date: Tue, 22 Jul 2008 23:23:40 -0700
    From: David Moore <dcm@acm.org>
    Subject: firewire: Include iso timestamp in headers when header_size > 4

Side note:  The lack of upwards compatibility sounds worse than it is.
All existing client implementations, libraw1394 and libdc1394, set
header_size = 4.  And since the ABI v1 behaviour does not offer any
advantages over the new behaviour, we deliberately do not provide the
old behaviour anymore.

Also add documentation about the format of fw_cdev_get_cycle_timer which
may be used in conjunction with the timestamp of iso packets but has a
different format.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:46 +01:00
Jay Fenlason, Stefan Richter
acfe833357 firewire: cdev: add ioctl for broadcast write requests
Write transactions to the broadcast node ID are a convenient way to
trigger functions of multiple nodes at once.  IIDC is a protocol which
can make use of this if multiple cameras with same command_regs_base are
connected at the same bus.

Based on
    Date: Wed, 10 Sep 2008 11:32:16 -0400
    From: Jay Fenlason <fenlason@redhat.com>
    Subject: [patch] SEND_BROADCAST_REQUEST
Changes:  ioctl_send_request() and ioctl_send_broadcast_request() now
share code.  Broadcast speed corrected to S100.  Check for proper tcode.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:44 +01:00
Stefan Richter
33580a3ef5 firewire: cdev: add ioctl to query maximum transmission speed
While the speed of asynchronous transactions is automatically chosen by
the kernel, the speed of isochronous streams has to be chosen by the
initiating client.

In case of 1394a bus topologies, the maximum possible speed could be
figured out with some effort by evaluation of the remote node's link
speed field in the config ROM, the local node's link speed field, and
the PHY speeds and topologic information in the local node's or IRM's
topology map CSR.  However, this does not work in case of 1394b buses.

Hence add an ioctl to export the maximum speed which the kernel already
determined.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:44 +01:00
Stefan Richter
1ec3c0269d firewire: cdev: add ioctls for manual iso resource management
This adds ioctls for allocation and deallocation of a channel or/and
bandwidth without auto-reallocation and without auto-deallocation.

The benefit of these ioctls is that libraw1394-style isochronous
resource management can be implemented without write access to the IRM's
character device file.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:44 +01:00
Jay Fenlason, Stefan Richter
b1bda4cdc2 firewire: cdev: add ioctls for isochronous resource management
Based on
    Date: Tue, 18 Nov 2008 11:41:27 -0500
    From: Jay Fenlason <fenlason@redhat.com>
    Subject: [Patch V4] Add ISO resource management support
with several changes to the ABI and implementation.  Only the part of
the ABI which enables auto-reallocation and auto-deallocation is
included here.

This implements ioctls for kernel-assisted allocation of isochronous
channels and isochronous bandwidth.  The benefits are:
  - The client does not have to have write access to the /dev/fw* device
    corresponding to the IRM.
  - The client does not have to perform reallocation after bus resets.
  - Channel and bandwidth are deallocated by the kernel if the file is
    closed before the client deallocated the resources.  Thus resources
    are released even if the client crashes.

It is anticipated that future in-kernel code (firewire-core IRM code;
the firewire port of firedtv), will use the fw-iso.c portions of this
code too.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: David Moore <dcm@acm.org>
2009-03-24 20:56:43 +01:00
Stefan Richter
632321ecd9 firewire: cdev: fix documentation of FW_CDEV_IOC_GET_INFO
The FW_CDEV_IOC_GET_INFO ioctl looks at client->device->config_rom, not
at the local node's config ROM.

We could fix the implementation or the documentation.  I believe the way
how it is currently implemented is more useful than the way how it is
currently documented.  In fact, libdc1394 uses the ABI already as
implemented, not as documented.  Hence let's change the documentation.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:41 +01:00
Stefan Richter
bf8e3355ec firewire: cdev: documentation fixlet
Reported-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-03-24 20:56:37 +01:00
Thomas Gleixner
3a38148f04 genirq: provide old request_irq() for CONFIG_GENERIC_HARDIRQ=n
Impact: Undo compile breakage for archs with CONFIG_GENERIC_HARDIRQ=n

The threaded interrupt handler patches changed request_irq from extern
to inline. Architectures which do not use the generic irq code still
have request_irq() as a global function and therefor fail to compile.

Keep the extern declaration for CONFIG_GENERIC_HARDIRQ=n

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-24 20:34:24 +01:00
Lai Jiangshan
ee000b7f9f tracing: use union for multi-usages field
Impact: cleanup

struct dyn_ftrace::ip has different usages in his lifecycle,
we use union for it. And also for struct dyn_ftrace::flags.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 16:43:12 +01:00
Ingo Molnar
29219683c4 Merge branches 'x86/apic', 'x86/cleanups', 'x86/mm', 'x86/pat', 'x86/setup' and 'x86/signal'; commit 'v2.6.29' into x86/core 2009-03-24 15:19:45 +01:00
Steven Rostedt
8aef2d2856 function-graph: ignore times across schedule
Impact: more accurate timings

The current method of function graph tracing does not take into
account the time spent when a task is not running. This shows functions
that call schedule have increased costs:

 3) + 18.664 us   |      }
 ------------------------------------------
 3)    <idle>-0    =>  kblockd-123
 ------------------------------------------

 3)               |      finish_task_switch() {
 3)   1.441 us    |        _spin_unlock_irq();
 3)   3.966 us    |      }
 3) ! 2959.433 us |    }
 3) ! 2961.465 us |  }

This patch uses the tracepoint in the scheduling context switch to
account for time that has elapsed while a task is scheduled out.
Now we see:

 ------------------------------------------
 3)    <idle>-0    =>  edac-po-1067
 ------------------------------------------

 3)               |      finish_task_switch() {
 3)   0.685 us    |        _spin_unlock_irq();
 3)   2.331 us    |      }
 3) + 41.439 us   |    }
 3) + 42.663 us   |  }

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 09:33:30 -04:00
Steven Rostedt
5d1a03dc54 function-graph: moved the timestamp from arch to generic code
This patch move the timestamp from happening in the arch specific
code into the general code. This allows for better control by the tracer
to time manipulation.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 09:31:34 -04:00
Boaz Harrosh
05378940ca bsg: add support for tail queuing
Currently inherited from sg.c bsg will submit asynchronous request
 at the head-of-the-queue, (using "at_head" set in the call to
 blk_execute_rq_nowait()). This is bad in situation where the queues
 are full, requests will execute out of order, and can cause
 starvation of the first submitted requests.

The sg_io_v4->flags member is used and a bit is allocated to denote the
Q_AT_TAIL. Zero is to queue at_head as before, to be compatible with old
code at the write/read path. SG_IO code path behavior was changed so to
be the same as write/read behavior. SG_IO was very rarely used and breaking
compatibility with it is OK at this stage.

sg_io_hdr at sg.h also has a flags member and uses 3 bits from the first
nibble and one bit from the last nibble. Even though none of these bits
are supported by bsg, The second nibble is allocated for use by bsg. Just
in case.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:17 +01:00
Petros Koutoupis
d399228646 block: genhd.h cleanup patch
In include/linux/genhd.h: Line 335 has a comment that needs to be updated from: /* drivers/block/ll_rw_blk.c */ to /* block/blk-core.c */. Also as of kernel 2.6.16, the function definition for get_blkdev_list was removed from block/genhd.c but the function declaration is still present on line 339. This patch addresses both those fixes, by updating the comment and removing the declaration.

Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:17 +01:00
Martin K. Petersen
6d2a78e783 block: add private bio_set for bio integrity allocations
The integrity bio allocation needs its own bio_set to avoid violating
the mempool allocation rules and risking deadlocks.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:17 +01:00
Petros Koutoupis
32ca163c9c block: genhd.h comment needs updating
The include/linux/genhd.h file, on line 338-352 declares some function
prototypes in which the comment on line 338 states that the definition of
these prototypes are to be found at drivers/block/genhd.c. The problem is
that genhd.c has been relocated to block/genhd.c. See attached patch to
correct this minor cosmetic typo.

Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:17 +01:00
Steven Whitehouse
f057f6cdf6 GFS2: Merge lock_dlm module into GFS2
This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplifcation of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.

This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:14 +00:00
Thomas Gleixner
f48fe81e5b genirq: threaded irq handlers review fixups
Delta patch to address the review comments.

      - Implement warning when IRQ_WAKE_THREAD is requested and no
        thread handler installed
      - coding style fixes

Pointed-out-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-24 12:15:23 +01:00
Arjan van de Ven
935bd5b971 genirq: add support for threaded interrupts to devres
Some devices use devres_request_irq() for to install their interrupt
handler. Add support for threaded interrupts to devres as well.

[tglx - simplified and adapted to latest threadirq version]

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-24 12:15:23 +01:00
Thomas Gleixner
3aa551c9b4 genirq: add threaded interrupt handler support
Add support for threaded interrupt handlers:

A device driver can request that its main interrupt handler runs in a
thread. To achive this the device driver requests the interrupt with
request_threaded_irq() and provides additionally to the handler a
thread function. The handler function is called in hard interrupt
context and needs to check whether the interrupt originated from the
device. If the interrupt originated from the device then the handler
can either return IRQ_HANDLED or IRQ_WAKE_THREAD. IRQ_HANDLED is
returned when no further action is required. IRQ_WAKE_THREAD causes
the genirq code to invoke the threaded (main) handler. When
IRQ_WAKE_THREAD is returned handler must have disabled the interrupt
on the device level. This is mandatory for shared interrupt handlers,
but we need to do it as well for obscure x86 hardware where disabling
an interrupt on the IO_APIC level redirects the interrupt to the
legacy PIC interrupt lines.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 12:15:23 +01:00
Sheng Yang
9cf0669746 intel-iommu: VT-d page table to support snooping control bit
The user can request to enable snooping control through VT-d page table.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-24 09:42:54 +00:00
Sheng Yang
dbb9fd8630 iommu: Add domain_has_cap iommu_ops
This iommu_op can tell if domain have a specific capability, like snooping
control for Intel IOMMU, which can be used by other components of kernel to
adjust the behaviour.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-24 09:42:51 +00:00
Sheng Yang
58c610bd1a intel-iommu: Snooping control support
Snooping control enabled IOMMU to guarantee DMA cache coherency and thus reduce
software effort (VMM) in maintaining effective memory type.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-24 09:42:43 +00:00
Sheng Yang
bc7a8660df KVM: Correct deassign device ioctl to IOW
It's IOR by mistake, so fix it before release.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:15 +02:00
Weidong Han
2df8a40bcc KVM: define KVM_CAP_DEVICE_DEASSIGNMENT
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.

the ioctl has been already implemented in the
commit: 0a92035674

Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:12 +02:00
Gleb Natapov
4925663a07 KVM: Report IRQ injection status to userspace.
IRQ injection status is either -1 (if there was no CPU found
that should except the interrupt because IRQ was masked or
ioapic was misconfigured or ...) or >= 0 in that case the
number indicates to how many CPUs interrupt was injected.
If the value is 0 it means that the interrupt was coalesced
and probably should be reinjected.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:11 +02:00
Gerd Hoffmann
c807660407 KVM: Fix kvmclock on !constant_tsc boxes
kvmclock currently falls apart on machines without constant tsc.
This patch fixes it.  Changes:

  * keep tsc frequency in a per-cpu variable.
  * handle kvmclock update using a new request flag, thus checking
    whenever we need an update each time we enter guest context.
  * use a cpufreq notifier to track frequency changes and force
    kvmclock updates.
  * send ipis to kick cpu out of guest context if needed to make
    sure the guest doesn't see stale values.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:09 +02:00
Sheng Yang
79950e1073 KVM: Use irq routing API for MSI
Merge MSI userspace interface with IRQ routing table. Notice the API have been
changed, and using IRQ routing table would be the only interface kvm-userspace
supported.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:09 +02:00
Marcelo Tosatti
44882eed2e KVM: make irq ack notifications aware of routing table
IRQ ack notifications assume an identity mapping between pin->gsi,
which might not be the case with, for example, HPET.

Translate before acking.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
2009-03-24 11:03:08 +02:00
Avi Kivity
91b2ae773d KVM: Avoid using CONFIG_ in userspace visible headers
Kconfig symbols are not available in userspace, and are not stripped by
headers-install.  Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:06 +02:00
Avi Kivity
399ec807dd KVM: Userspace controlled irq routing
Currently KVM has a static routing from GSI numbers to interrupts (namely,
0-15 are mapped 1:1 to both PIC and IOAPIC, and 16:23 are mapped 1:1 to
the IOAPIC).  This is insufficient for several reasons:

- HPET requires non 1:1 mapping for the timer interrupt
- MSIs need a new method to assign interrupt numbers and dispatch them
- ACPI APIC mode needs to be able to reassign the PCI LINK interrupts to the
  ioapics

This patch implements an interrupt routing table (as a linked list, but this
can be easily changed) and a userspace interface to replace the table.  The
routing table is initialized according to the current hardwired mapping.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:06 +02:00
Avi Kivity
75858a84a6 KVM: Interrupt mask notifiers for ioapic
Allow clients to request notifications when the guest masks or unmasks a
particular irq line.  This complements irq ack notifications, as the guest
will not ack an irq line that is masked.

Currently implemented for the ioapic only.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:03 +02:00
Sheng Yang
17071fe74f KVM: Add support to disable MSI for assigned device
MSI is always enabled by default for msi2intx=1. But if msi2intx=0, we
have to disable MSI if guest require to do so.

The patch also discard unnecessary msi2intx judgment if guest want to update
MSI state.

Notice KVM_DEV_IRQ_ASSIGN_MSI_ACTION is a mask which should cover all MSI
related operations, though we only got one for now.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:02 +02:00
Sheng Yang
67346440e8 KVM: Remove duplicated prototype of kvm_arch_destroy_vm
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:03:01 +02:00
Avi Kivity
1c08364c35 KVM: Move struct kvm_pio_request into x86 kvm_host.h
This is an x86 specific stucture and has no business living in common code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:55 +02:00
Marcelo Tosatti
52d939a0bf KVM: PIT: provide an option to disable interrupt reinjection
Certain clocks (such as TSC) in older 2.6 guests overaccount for lost
ticks, causing severe time drift. Interrupt reinjection magnifies the
problem.

Provide an option to disable it.

[avi: allow room for expansion in case we want to disable reinjection
      of other timers]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:55 +02:00
Izik Eidus
0f34607440 KVM: remove the vmap usage
vmap() on guest pages hides those pages from the Linux mm for an extended
(userspace determined) amount of time.  Get rid of it.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:54 +02:00
Jan Kiszka
971cc3dcbc KVM: Advertise guest debug capability per-arch
Limit KVM_CAP_SET_GUEST_DEBUG only to those archs (currently x86) that
support it. This simplifies user space stub implementations.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:52 +02:00
Jes Sorensen
e9a999fe1f KVM: ia64: stack get/restore patch
Implement KVM_IA64_VCPU_[GS]ET_STACK ioctl calls. This is required
for live migrations.

Patch is based on previous implementation that was part of old
GET/SET_REGS ioctl calls.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:50 +02:00
Jan Kiszka
d0bfb940ec KVM: New guest debug interface
This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
part, controlling the "main switch" and the single-step feature. The
arch specific part adds an x86 interface for intercepting both types of
debug exceptions separately and re-injecting them when the host was not
interested. Moveover, the foundation for guest debugging via debug
registers is layed.

To signal breakpoint events properly back to userland, an arch-specific
data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
contains the PC, the debug exception, and relevant debug registers to
tell debug events properly apart.

The availability of this new interface is signaled by
KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
provided.

Note that both SVM and VTX are supported, but only the latter was tested
yet. Based on the experience with all those VTX corner case, I would be
fairly surprised if SVM will work out of the box.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:49 +02:00
David Howells
402d326519 NOMMU: Present backing device capabilities for MTD chardevs
Present backing device capabilities for MTD character device files to allow
NOMMU mmap to do direct mapping where possible.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Bernd Schmidt <bernd.schmidt@analog.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-24 09:00:19 +00:00
Pekka Enberg
15a5b0a491 Merge branches 'topic/slob/cleanups', 'topic/slob/fixes', 'topic/slub/core', 'topic/slub/cleanups' and 'topic/slub/perf' into for-linus 2009-03-24 10:25:21 +02:00
Benjamin Herrenschmidt
9e41d9597e Merge commit 'origin/master' into next 2009-03-24 13:38:30 +11:00
James Morris
703a3cd728 Merge branch 'master' into next 2009-03-24 10:52:46 +11:00
Takashi Iwai
e7bfbb0215 Merge branch 'topic/hda' into for-linus 2009-03-24 00:36:09 +01:00
Takashi Iwai
b5c784894c Merge branch 'topic/asoc' into for-linus 2009-03-24 00:35:53 +01:00
Takashi Iwai
5b56eec774 Merge branch 'topic/jack' into for-linus 2009-03-24 00:35:43 +01:00
David S. Miller
8be7cdccac Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/ucc_geth.c
2009-03-23 13:35:04 -07:00
Thomas Gleixner
80c5520811 Merge branch 'cpus4096' into irq/threaded
Conflicts:
	arch/parisc/kernel/irq.c
	kernel/irq/handle.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-23 21:20:20 +01:00
Linus Torvalds
d56ffd38a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
  ucc_geth: Fix oops when using fixed-link support
  dm9000: locking bugfix
  net: update dnet.c for bus_id removal
  dnet: DNET should depend on HAS_IOMEM
  dca: add missing copyright/license headers
  nl80211: Check that function pointer != NULL before using it
  sungem: missing net_device_ops
  be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle
  be2net: replenish when posting to rx-queue is starved in out of mem conditions
  bas_gigaset: correctly allocate USB interrupt transfer buffer
  smsc911x: reset last known duplex and carrier on open
  sh_eth: Fix mistake of the address of SH7763
  sh_eth: Change handling of IRQ
  netns: oops in ip[6]_frag_reasm incrementing stats
  net: kfree(napi->skb) => kfree_skb
  net: fix sctp breakage
  ipv6: fix display of local and remote sit endpoints
  net: Document /proc/sys/net/core/netdev_budget
  tulip: fix crash on iface up with shirq debug
  virtio_net: Make virtio_net support carrier detection
  ...
2009-03-23 09:25:58 -07:00
Ingo Molnar
efd247fa34 Merge branches 'sched/debug' and 'linus' into sched/core 2009-03-23 16:53:20 +01:00
Frederic Weisbecker
c0f92ba99b debugfs: function to know if debugfs is initialized
Impact: add new debugfs API

With ftrace, some tracers are registered in early initcalls
and attempt to create files on the debugfs filesystem.
Depending on when they are activated, they can try to create their
file at any time. Some checks can be done on the tracing area
but providing a helper to know if debugfs is registered make it
really more easy.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1237759847-21025-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-23 16:25:46 +01:00
Pablo Neira Ayuso
dd5b6ce6fd nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink
This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-23 13:21:06 +01:00
Ingo Molnar
b3e3b302cf Merge branches 'irq/sparseirq' and 'linus' into irq/core 2009-03-23 10:07:49 +01:00
Tom Zanussi
2d622719f1 tracing: add ring_buffer_event_discard() to ring buffer
This patch overloads RINGBUF_TYPE_PADDING to provide a way to discard
events from the ring buffer, for the event-filtering mechanism
introduced in a subsequent patch.

I did the initial version but thanks to Steven Rostedt for adding
the parts that actually made it work. ;-)

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-22 18:38:25 +01:00
Stephen Hemminger
777baa4711 usbnet: support net_device_ops
Use net_device_ops for usbnet device, and export for use
by other derived drivers.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:41:01 -07:00
Stephen Hemminger
9247744e5e skb: expose and constify hash primitives
Some minor changes to queue hashing:
 1. Use const on accessor functions
 2. Export skb_tx_hash for use in drivers (see ixgbe)

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 13:39:26 -07:00
Maciej Sosnowski
e2fc4d1929 dca: add missing copyright/license headers
In two dca files copyright and license headers are missing.
This patch adds them there.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 13:31:23 -07:00
Alex Chiang
3ed4fd96b3 PCI: Introduce pci_rescan_bus()
This API is used by the PCI core to rescan a bus and rediscover
newly added devices.

Over time, it is expected that the various PCI hotplug drivers
will migrate to this interface and away from the old
pci_do_scan_bus() interface.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:44 -07:00
Kenji Kaneshige
79af72d716 PCI: pci_is_root_bus helper
Introduce pci_is_root_bus helper function. This will help make code
more consistent, as well as prevent incorrect assumptions (such as
pci_bus->self == NULL on a root bus, which is not always true).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:56:36 -07:00
Yu Zhao
74bb1bcc7d PCI: handle SR-IOV Virtual Function Migration
Add or remove a Virtual Function after receiving a Migrate In or Out
Request.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:28 -07:00
Yu Zhao
dd7cc44d0b PCI: add SR-IOV API for Physical Function driver
Add or remove the Virtual Function when the SR-IOV is enabled or
disabled by the device driver. This can happen anytime rather than
only at the device probe stage.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:26 -07:00
Yu Zhao
d1b054da8f PCI: initialize and release SR-IOV capability
If a device has the SR-IOV capability, initialize it (set the ARI
Capable Hierarchy in the lowest numbered PF if necessary; calculate
the System Page Size for the VF MMIO, probe the VF Offset, Stride
and BARs). A lock for the VF bus allocation is also initialized if
a PF is the lowest numbered PF.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:22 -07:00
David O'Shea
8293b0f629 PCI: Compaq Evo D510 SMBus quirk using USB instead of VGA
On the Compaq Evo D510 SFF/CMT, a PCI quirk activated the SMBus device
based on detection of the on-board VGA controller, but the on-board
VGA is disabled if an AGP card is inserted, so look for one of the USB
controllers instead.

Signed-off-by: David O'Shea <dcoshea@hotmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:19 -07:00
Matthew Wilcox
1c8d7b0a56 PCI MSI: Add support for multiple MSI
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block.  Ensure that the architecture back ends don't
have to know about multiple MSI.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:14 -07:00
Matthew Wilcox
f2440d9acb PCI MSI: Refactor interrupt masking code
Since most of the callers already know whether they have an MSI or
an MSI-X capability, split msi_set_mask_bits() into msi_mask_irq()
and msix_mask_irq().  The only callers which don't (mask_msi_irq()
and unmask_msi_irq()) can share code in msi_set_mask_bit().  This then
becomes the only caller of msix_flush_writes(), so we can inline it.
The flushing read can be to any address that belongs to the device,
so we can eliminate the calculation too.

We can also get rid of maskbits_mask from struct msi_desc and simply
recalculate it on the rare occasion that we need it.  The single-bit
'masked' element is replaced by a copy of the 32-bit 'masked' register,
so this patch does not affect the size of msi_desc.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox
264d9caaa1 PCI MSI: Use mask_pos instead of mask_base when appropriate
MSI interrupts have a mask_pos where MSI-X have a mask_base.  Use a
transparent union to get rid of some ugly casts.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox
24d2755339 PCI MSI: Replace 'type' with 'is_msix'
By changing from a 5-bit field to a 1-bit field, we free up some bits
that can be used by a later patch.  Also rearrange the fields for better
packing on 64-bit platforms (reducing the size of msi_desc from 72 bytes
to 64 bytes).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:12 -07:00
Yu Zhao
998dd7c719 PCI: fix incorrect mask of PM No_Soft_Reset bit
Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:08 -07:00
Kenji Kaneshige
d18690af62 PCI/ACPI: fix wrong assumption in acpi_find_root_bridge_handle
Current acpi_find_root_bridge_handle() has a assumption that
pci_bus->self is NULL on the root pci bus. But it might not be true on
some platforms. Because of this wrong assumption, current
acpi_find_root_bridge_handle() might cause endless loop. We must check
pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:47:57 -07:00
Kenji Kaneshige
0747aaf42d PCI/ACPI: fix wrong assumption in acpi_pci_get_bridge_handle
Current acpi_pci_get_bridge_handle() has an assumption that
pci_bus->self is NULL on the root pci bus. But it might not true on
some platforms. Because of this wrong assumption, current
acpi_pci_get_bridge_handle() might return improper ACPI handle. We
must check pci_bus->parent instead.

This bug is the root cause of the following kernel panic reported by
James Bottomley. This problem was introduced by the commit
e8c331e963. The immediate cause was
acpi_pci_get_bridge_handle() returned NULL unexpectedly and it was
passed as the second argument of acpi_walk_namespace().

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff8039646f>] acpi_ns_get_next_node+0xb/0x3c
PGD 0
Oops: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.28 #1
RIP: 0010:[<ffffffff8039646f>]  [<ffffffff8039646f>] acpi_ns_get_next_node+0xb/0x3c
RSP: 0018:ffff88007f87fd30  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff8037d260 R09: ffff88007f87fdfc
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff80742040(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000000201000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88007f87e000, task ffff88007f875040)
Stack:
 0000000000000000 ffffffff803964f5 ffff88007f81b728 0000000000001001
 ffff88007f87fdfc ffffffff8037d260 0000000600000001 0000000000000000
 ffffffff8037d260 0000000000000000 0000000000000001 ffff88007f87fdfc
Call Trace:
 [<ffffffff803964f5>] acpi_ns_walk_namespace+0x55/0x138
 [<ffffffff8037d260>] is_pci_dock_device+0x0/0x20
 [<ffffffff8037d260>] is_pci_dock_device+0x0/0x20
 [<ffffffff80394a9e>] acpi_walk_namespace+0x5f/0x83
 [<ffffffff8037dd33>] detect_ejectable_slots+0x53/0x70
 [<ffffffff8037de38>] add_bridge+0xe8/0x200
 [<ffffffff80394aaa>] acpi_walk_namespace+0x6b/0x83
 [<ffffffff803a4ad1>] acpi_pci_register_driver+0x48/0x61
 [<ffffffff806fc5df>] acpiphp_init+0x0/0x58
 [<ffffffff806fc732>] acpiphp_glue_init+0x4c/0x5a
 [<ffffffff806fc616>] acpiphp_init+0x37/0x58
 [<ffffffff8020903b>] _stext+0x3b/0x180
 [<ffffffff80312598>] create_proc_entry+0x58/0xa0
 [<ffffffff802815d1>] register_irq_proc+0xc1/0xe0
 [<ffffffff806db64b>] kernel_init+0x152/0x1ac
 [<ffffffff8023d970>] finish_task_switch+0x0/0x110
 [<ffffffff8020ca7a>] child_rip+0xa/0x20
 [<ffffffff8020c47c>] restore_args+0x0/0x30
 [<ffffffff806db4f9>] kernel_init+0x0/0x1ac
 [<ffffffff8020ca70>] child_rip+0x0/0x20
Code: 89 c2 48 8b 00 48 85 c0 75 f5 48 8b 45 00 48 89 02 44 88 65 09 48 89 5d 00 31 c0 5b 5d 41 5c c3 53 48 85 d2 89 fb 48 89 d7 75 06 <48> 8b 56 10 eb 08 e8 73 f1 ff ff 48 89 c2 85 db 74 1a eb 13 0f
RIP  [<ffffffff8039646f>] acpi_ns_get_next_node+0xb/0x3c
 RSP <ffff88007f87fd30>
CR2: 0000000000000010
---[ end trace a7919e7f17c0a725 ]---

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:47:56 -07:00
Rafael J. Wysocki
3a3c244c9a PCI: PCIe portdrv: Implement pm object
Implement pm object for the PCI Express port driver in order to use
the new power management framework and reduce the code size.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:47:49 -07:00
David Brownell
a4b6d516a6 [MTD] partitioning utility predicates
Move mtd_has_partitions() and mtd_has_cmdlinepart() inlines from a
DaVinci-specific driver to the <linux/mtd/partitions.h> header.

Use those to eliminate #ifdefs in two drivers which had their own
definitions of mtd_has_partitions().

Quite a lot of other MTD drivers could benefit from using use one or both
of these to remove #ifdeffery.  Maybe some Janitors would like to help.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-20 13:16:44 +00:00
Ingo Molnar
7f00a2495b Merge branches 'x86/cleanups', 'x86/mm', 'x86/setup' and 'linus' into x86/core 2009-03-20 10:34:22 +01:00
Ingo Molnar
22de89b371 Merge branches 'tracing/ftrace', 'tracing/kprobes', 'tracing/tasks' and 'linus' into tracing/core 2009-03-20 10:14:53 +01:00
Eric Dumazet
5e140dfc1f net: reorder struct Qdisc for better SMP performance
dev_queue_xmit() needs to dirty fields "state", "q", "bstats" and "qstats"

On x86_64 arch, they currently span three cache lines, involving more
cache line ping pongs than necessary, making longer holding of queue spinlock.

We can reduce this to one cache line, by grouping all read-mostly fields
at the beginning of structure. (Or should I say, all highly modified fields
at the end :) )

Before patch :

offsetof(struct Qdisc, state)=0x38
offsetof(struct Qdisc, q)=0x48
offsetof(struct Qdisc, bstats)=0x80
offsetof(struct Qdisc, qstats)=0x90
sizeof(struct Qdisc)=0xc8

After patch :

offsetof(struct Qdisc, state)=0x80
offsetof(struct Qdisc, q)=0x88
offsetof(struct Qdisc, bstats)=0xa0
offsetof(struct Qdisc, qstats)=0xac
sizeof(struct Qdisc)=0xc0

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-20 01:33:32 -07:00
Stephen Hemminger
2e1ab634bf rtnetlink: add new value for DHCP added routes
To improve manageability, it would be good to be able to disambiguate routes
added by administrator from those added by DHCP client.  The only necessary
kernel change is to add value to rtnetlink include file so iproute2 utility
can use it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-19 23:49:41 -07:00
Andrew Morton
ea7415512a PCI: constify pci_bus_assign_resources()
drivers/pci/hotplug/fakephp.c: In function 'pci_rescan_bus':
drivers/pci/hotplug/fakephp.c:271: warning: passing argument 1 of 'pci_bus_assign_resources' discards qualifiers from pointer target type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:35 -07:00
akpm@linux-foundation.org
c48f1670f4 PCI: constify pci_bus_add_devices()
drivers/pci/hotplug/fakephp.c:283: warning: passing argument 1 of 'pci_bus_add_devices' discards qualifiers from pointer target type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:35 -07:00
Kenji Kaneshige
9f5404d8ea PCI/ACPI: rename pci_osc_control_set()
- Rename pci_osc_control_set() to acpi_pci_osc_control_set() according
  to the other API names in drivers/acpi/pci_root.c.

- Move _OSC related definitions to include/linux/acpi.h because _OSC
  related API is implemented in drivers/acpi/pci_root.c now.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:33 -07:00
Kenji Kaneshige
63f10f0f6d PCI/ACPI: move _OSC code to pci_root.c
Move PCI _OSC management code from drivers/pci/pci-acpi.c to
drivers/acpi/pci_root.c. The benefits are

- We no longer need struct osc_data and its management code (contents
  are moved to struct acpi_pci_root). This simplify the code, and we
  no longer care about kmalloc() failure.

- We can make pci_acpi_osc_support() be a static function, which is
  called only from drivers/acpi/pci_root.c.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:32 -07:00
Rafael J. Wysocki
b43d451385 PCI/PCIe portdrv: Fix allocation of interrupts
If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for the right services.  Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them.  Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() in the current code will be used for PME and
hotplug and which of them will be used for AER if all of these
services are configured.

For these reasons, rework the allocation of interrupts for PCI
Express ports so that if MSI-X are enabled, the right vectors will be
used for the right purposes.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:25 -07:00
Rafael J. Wysocki
a52e2e3513 PCI/MSI: Introduce pci_msix_table_size()
Introduce new function pci_msix_table_size() returning the size of
the MSI-X table of given PCI device or 0 if the device doesn't
support MSI-X.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:25 -07:00
Rafael J. Wysocki
22106368c9 PCI: PCIe portdrv: Remove struct pcie_port_service_id
The PCI Express port driver uses 'struct pcie_port_service_id' for
matching port service devices and drivers, but this structure
contains fields that duplicate information from the port device
itself (vendor, device, subvendor, subdevice) and fields that are not
used by any existing port service driver (class, class_mask,
drvier_data).  Also, both existing port service drivers (AER and
PCIe HP) don't even use the vendor and device fields for device
matching.  Therefore 'struct pcie_port_service_id' can be removed
altogether and the only useful members of it (port_type, service) can
be introduced directly into the port service device and port service
driver structures.  That simplifies the code quite a bit and reduces
its size.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:23 -07:00
Rafael J. Wysocki
0516c8bcd2 PCI: PCIe portdrv: Simplily probe callback of service drivers
The second argument of the ->probe() callback in
struct pcie_port_service_driver is unnecessary and never used.
Remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:23 -07:00
Rafael J. Wysocki
90e9cd50f7 PCI: PCIe portdrv: Aviod using service devices with wrong interrupts
The PCI Express port driver should not attempt to register service
devices that require the ability to generate interrupts if generating
interrupts is not possible.  Namely, if the port has no interrupt pin
configured and we cannot set up MSI or MSI-X for it, there is no way
it can generate interrupts and in such a case the port services that
rely on interrupts (PME, PCIe HP, AER) should not be enabled for it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:21 -07:00
Rafael J. Wysocki
1bf83e558c PCI: PCIe portdrv: Use driver data to simplify code
PCI Express port driver extension, as defined by struct
pcie_port_device_ext in portdrv.h, is allocated and initialized, but
never used (it also is never freed).  Extend it to hold the PCI Express
port type as well as the port interrupt mode, change its name and use it
to simplify the code in portdrv_core.c .

Additionally, remove the redundant interrupt_mode member of struct
pcie_device defined in include/linux/pcieport_if.h .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:20 -07:00
Trond Myklebust
7fe5c398fc NFS: Optimise NFS close()
Close-to-open cache consistency rules really only require us to flush out
writes on calls to close(), and require us to revalidate attributes on the
very last close of the file.

Currently we appear to be doing a lot of extra attribute revalidation
and cache flushes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19 15:35:50 -04:00
Trond Myklebust
7d1e8255cf SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets
This fixes a regression against FreeBSD servers as reported by Tomas
Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
don't ever react to the client closing the socket, and so commit
e06799f958 (SUNRPC: Use shutdown() instead of
close() when disconnecting a TCP socket) causes the setup to hang forever
whenever the client attempts to close and then reconnect.

We break the deadlock by adding a 'linger2' style timeout to the socket,
after which, the client will abort the connection using a TCP 'RST'.

The default timeout is set to 15 seconds. A subsequent patch will put it
under user control by means of a systctl.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19 15:17:34 -04:00
Yevgeny Petrilin
27bf91d6a0 mlx4_core: Add link type autosensing
When a port's link is down (except to driver restart) and the port is
configured for auto sensing, we try to sense port link type (Ethernet
or InfiniBand) in order to determine how to initialize the port.  If
the port type needs to be changed, all mlx4 for the device interfaces
are unregistered and then registered again with the new port
types.  Sensing is done with intervals of 3 seconds.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-18 19:45:11 -07:00
Chuck Lever
2795e53b4e SUNRPC: Clean up static inline functions in svc_xprt.h
Clean up:  Enable the use of const arguments in higher level svc_ APIs
by adding const to the arguments of the helper functions in svc_xprt.h

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 18:13:28 -04:00
Greg Banks
03cf6c9f49 knfsd: add file to export stats about nfsd pools
Add /proc/fs/nfsd/pool_stats to export to userspace various
statistics about the operation of rpc server thread pools.

This patch is based on a forward-ported version of
knfsd-add-pool-thread-stats which has been shipping in the SGI
"Enhanced NFS" product since 2006 and which was previously
posted:

http://article.gmane.org/gmane.linux.nfs/10375

It has also been updated thus:

 * moved EXPORT_SYMBOL() to near the function it exports
 * made the new struct struct seq_operations const
 * used SEQ_START_TOKEN instead of ((void *)1)
 * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
   printk in svc_pool_stats_*()" by Harshula Jayasuriya.
 * merged fix from SGI PV 964001 "Crash reading pool_stats before
   nfsds are started".

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:42 -04:00
Greg Banks
59a252ff8c knfsd: avoid overloading the CPU scheduler with enormous load averages
Avoid overloading the CPU scheduler with enormous load averages
when handling high call-rate NFS loads.  When the knfsd bottom half
is made aware of an incoming call by the socket layer, it tries to
choose an nfsd thread and wake it up.  As long as there are idle
threads, one will be woken up.

If there are lot of nfsd threads (a sensible configuration when
the server is disk-bound or is running an HSM), there will be many
more nfsd threads than CPUs to run them.  Under a high call-rate
low service-time workload, the result is that almost every nfsd is
runnable, but only a handful are actually able to run.  This situation
causes two significant problems:

1. The CPU scheduler takes over 10% of each CPU, which is robbing
   the nfsd threads of valuable CPU time.

2. At a high enough load, the nfsd threads starve userspace threads
   of CPU time, to the point where daemons like portmap and rpc.mountd
   do not schedule for tens of seconds at a time.  Clients attempting
   to mount an NFS filesystem timeout at the very first step (opening
   a TCP connection to portmap) because portmap cannot wake up from
   select() and call accept() in time.

Disclaimer: these effects were observed on a SLES9 kernel, modern
kernels' schedulers may behave more gracefully.

The solution is simple: keep in each svc_pool a counter of the number
of threads which have been woken but have not yet run, and do not wake
any more if that count reaches an arbitrary small threshold.

Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480
regular files in 10841 directories) on the server.  That tree is small
enough to fill in the server's RAM so no disk traffic was involved.
This setup gives a sustained call rate in excess of 60000 calls/sec
before being CPU-bound on the server.  The server was running 128 nfsds.

Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
Load average was over 120 before the patch, and 20.9 after.

This patch is a forward-ported version of knfsd-avoid-nfsd-overload
which has been shipping in the SGI "Enhanced NFS" product since 2006.
It has been posted before:

http://article.gmane.org/gmane.linux.nfs/10374

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:41 -04:00
David Shaw
31dec2538e Short write in nfsd becomes a full write to the client
If a filesystem being written to via NFS returns a short write count
(as opposed to an error) to nfsd, nfsd treats that as a success for
the entire write, rather than the short count that actually succeeded.

For example, given a 8192 byte write, if the underlying filesystem
only writes 4096 bytes, nfsd will ack back to the nfs client that all
8192 bytes were written.  The nfs client does have retry logic for
short writes, but this is never called as the client is told the
complete write succeeded.

There are probably other ways it could happen, but in my case it
happened with a fuse (filesystem in userspace) filesystem which can
rather easily have a partial write.

Here is a patch to properly return the short write count to the
client.

Signed-off-by: David Shaw <dshaw@jabberwocky.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:40 -04:00
J. Bruce Fields
8b671b8070 nfsd4: remove use of mutex for file_hashtable
As part of reducing the scope of the client_mutex, and in order to
remove the need for mutexes from the callback code (so that callbacks
can be done as asynchronous rpc calls), move manipulations of the
file_hashtable under the recall_lock.

Update the relevant comments while we're here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Alexandros Batsakis <batsakis@netapp.com>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
2009-03-18 17:38:38 -04:00
J. Bruce Fields
6150ef0dc7 nfsd4: remove unused CHECK_FH flag
All users now pass this, so it's meaningless.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:37 -04:00
J. Bruce Fields
203a8c8e66 nfsd4: separate delegreturn case from preprocess_stateid_op
Delegreturn is enough a special case for preprocess_stateid_op to
warrant just open-coding it in delegreturn.

There should be no change in behavior here; we're just reshuffling code.

Thanks to Yang Hongyang for catching a critical typo.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:18 -04:00
Harvey Harrison
77f18f5e4e nfs: replace uses of __constant_{endian}
The base versions handle constant folding now, none of these headers
are exported to userspace, so the __ prefixed versions are not
necessary.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:51 -04:00
J. Bruce Fields
6c02eaa1d1 nfsd4: use helper for copying delegation filehandle
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:49 -04:00
J. Bruce Fields
a4773c08f2 nfsd4: use helper for copying filehandles for replay
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:49 -04:00
Ingo Molnar
705bb9dc72 Merge branches 'x86/cleanups', 'x86/cpu', 'x86/debug', 'x86/mce2', 'x86/mm', 'x86/mtrr', 'x86/setup', 'x86/setup-memory', 'x86/urgent', 'x86/uv', 'x86/x2apic' and 'linus' into x86/core
Conflicts:
	arch/parisc/kernel/irq.c
2009-03-18 13:19:49 +01:00
Ingo Molnar
84be58d460 dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
Impact: build fix

Fix:

 arch/x86/kvm/x86.o: In function `dma_debug_add_bus':
 (.text+0x0): multiple definition of `dma_debug_add_bus'

dma_debug_add_bus() should be a static inline function.

Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20090317120112.GP6159@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 11:53:48 +01:00
Ingo Molnar
95f3c4ebff Merge branch 'dma-api/debug' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu 2009-03-18 10:37:48 +01:00
Ingo Molnar
04dfcfcb54 Merge branch 'linus' into core/iommu 2009-03-18 10:37:43 +01:00
Ingo Molnar
37ba317c9e Merge branches 'sched/cleanups' and 'linus' into sched/core 2009-03-18 09:57:02 +01:00
Ingo Molnar
327019b01e Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-18 06:59:56 +01:00
Witold Baryluk
97e7e4f391 tracing: optimization of branch tracer
Impact: better performance for if branch tracer

Use an array to count the hit and misses of a conditional instead
of using another conditional. This cuts down on saturation of branch
predictions and increases performance of modern pipelined architectures.

Signed-off-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-17 23:10:43 -04:00
Steven Rostedt
37886f6a9f ring-buffer: add api to allow a tracer to change clock source
This patch adds a new function called ring_buffer_set_clock that
allows a tracer to assign its own clock source to the buffer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-17 23:06:31 -04:00
Suresh Siddha
29b61be65a x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code
Impact: cleanup

Clean up #ifdefs and replace them with helper functions.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:45:07 -07:00
Suresh Siddha
1531a6a6b8 x86, dmar: start with sane state while enabling dma and interrupt-remapping
Impact: cleanup/sanitization

Start from a sane state while enabling dma and interrupt-remapping, by
clearing the previous recorded faults and disabling previously
enabled queued invalidation and interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:39:58 -07:00
Suresh Siddha
eba67e5da6 x86, dmar: routines for disabling queued invalidation and intr remapping
Impact: new interfaces (not yet used)

Routines for disabling queued invalidation and interrupt remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:39:20 -07:00
Suresh Siddha
9d783ba042 x86, x2apic: enable fault handling for intr-remapping
Impact: interface augmentation (not yet used)

Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:38:59 -07:00
David S. Miller
af4330631c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-03-17 15:04:31 -07:00
J. Bruce Fields
76a67ec6fb nfsd: nfsd should drop CAP_MKNOD for non-root
Since creating a device node is normally an operation requiring special
privilege, Igor Zhbanov points out that it is surprising (to say the
least) that a client can, for example, create a device node on a
filesystem exported with root_squash.

So, make sure CAP_MKNOD is among the capabilities dropped when an nfsd
thread handles a request from a non-root user.

Reported-by: Igor Zhbanov <izh1979@gmail.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-17 14:55:55 -04:00
Ingo Molnar
47239561e3 Merge branch 'linus' into core/printk 2009-03-17 16:21:20 +01:00
Joerg Roedel
41531c8f5f dma-debug: add a check dma memory leaks
Impact: allow architectures to monitor busses for dma mem leakage

This patch adds checking code to detect if a device has pending DMA
operations when it is about to be unbound from its device driver.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-17 12:56:49 +01:00
David Woodhouse
ac26c18bd3 dma-debug: add function to dump dma mappings
This adds a function to dump the DMA mappings that the debugging code is
aware of -- either for a single device, or for _all_ devices.

This can be useful for debugging -- sticking a call to it in the DMA
page fault handler, for example, to see if the faulting address _should_
be mapped or not, and hence work out whether it's IOMMU bugs we're
seeing, or driver bugs.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-17 12:56:39 +01:00
Luis R. Rodriguez
73d54c9e74 cfg80211: add regulatory netlink multicast group
This allows us to send to userspace "regulatory" events.
For now we just send an event when we change regulatory domains.
We also notify userspace when devices are using their own custom
world roaming regulatory domains.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16 18:09:40 -04:00
Luis R. Rodriguez
7db90f4a25 cfg80211: move enum reg_set_by to nl80211.h
We do this so we can later inform userspace who set the
regulatory domain and provide details of the request.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16 18:09:40 -04:00
Herbert Xu
d1c76af9e2 GRO: Move netpoll checks to correct location
As my netpoll fix for net doesn't really work for net-next, we
need this update to move the checks into the right place.  As it
stands we may pass freed skbs to netpoll_receive_skb.

This patch also introduces a netpoll_rx_on function to avoid GRO
completely if we're invoked through netpoll.  This might seem
paranoid but as netpoll may have an external receive hook it's
better to be safe than sorry.  I don't think we need this for
2.6.29 though since there's nothing immediately broken by it.

This patch also moves the GRO_* return values to netdevice.h since
VLAN needs them too (I tried to avoid this originally but alas
this seems to be the easiest way out).  This fixes a bug in VLAN
where it continued to use the old return value 2 instead of the
correct GRO_DROP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-16 10:50:02 -07:00
Pablo Neira Ayuso
0269ea4937 netfilter: xtables: add cluster match
This patch adds the iptables cluster match. This match can be used
to deploy gateway and back-end load-sharing clusters. The cluster
can be composed of 32 nodes maximum (although I have only tested
this with two nodes, so I cannot tell what is the real scalability
limit of this solution in terms of cluster nodes).

Assuming that all the nodes see all packets (see below for an
example on how to do that if your switch does not allow this), the
cluster match decides if this node has to handle a packet given:

	(jhash(source IP) % total_nodes) & node_mask

For related connections, the master conntrack is used. The following
is an example of its use to deploy a gateway cluster composed of two
nodes (where this is the node 1):

iptables -I PREROUTING -t mangle -i eth1 -m cluster \
	--cluster-total-nodes 2 --cluster-local-node 1 \
	--cluster-proc-name eth1 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth1 \
	-m mark ! --mark 0xffff -j DROP
iptables -A PREROUTING -t mangle -i eth2 -m cluster \
	--cluster-total-nodes 2 --cluster-local-node 1 \
	--cluster-proc-name eth2 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth2 \
	-m mark ! --mark 0xffff -j DROP

And the following commands to make all nodes see the same packets:

ip maddr add 01:00:5e:00:01:01 dev eth1
ip maddr add 01:00:5e:00:01:02 dev eth2
arptables -I OUTPUT -o eth1 --h-length 6 \
	-j mangle --mangle-mac-s 01:00:5e:00:01:01
arptables -I INPUT -i eth1 --h-length 6 \
	--destination-mac 01:00:5e:00:01:01 \
	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
arptables -I OUTPUT -o eth2 --h-length 6 \
	-j mangle --mangle-mac-s 01:00:5e:00:01:02
arptables -I INPUT -i eth2 --h-length 6 \
	--destination-mac 01:00:5e:00:01:02 \
	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27

In the case of TCP connections, pickup facility has to be disabled
to avoid marking TCP ACK packets coming in the reply direction as
valid.

echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose

BTW, some final notes:

 * This match mangles the skbuff pkt_type in case that it detects
PACKET_MULTICAST for a non-multicast address. This may be done in
a PKTTYPE target for this sole purpose.
 * This match supersedes the CLUSTERIP target.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16 17:10:36 +01:00
Jan Engelhardt
acc738fec0 netfilter: xtables: avoid pointer to self
Commit 784544739a (netfilter: iptables:
lock free counters) broke a number of modules whose rule data referenced
itself. A reallocation would not reestablish the correct references, so
it is best to use a separate struct that does not fall under RCU.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16 15:35:29 +01:00
Jonathan Corbet
db1dd4d376 Use f_lock to protect f_flags
Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all.  This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock.  This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Jonathan Corbet
6849991490 Rename struct file->f_ep_lock
This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock.  For now,
epoll remains the only user, but a future patch will use it to protect
f_flags as well.

Cc: Davide Libenzi <davidel@xmailserver.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Ingo Molnar
edb35028e4 Merge branches 'irq/genirq' and 'linus' into irq/core 2009-03-16 09:20:13 +01:00
Ingo Molnar
7243f2145a Merge branches 'tracing/ftrace', 'tracing/syscalls' and 'linus' into tracing/core
Conflicts:
	arch/parisc/kernel/irq.c
2009-03-16 09:12:42 +01:00
Ilpo Järvinen
2a3a041c4e tcp: cache result of earlier divides when mss-aligning things
The results is very unlikely change every so often so we
hardly need to divide again after doing that once for a
connection. Yet, if divide still becomes necessary we
detect that and do the right thing and again settle for
non-divide state. Takes the u16 space which was previously
taken by the plain xmit_size_goal.

This should take care part of the tso vs non-tso difference
we found earlier.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-15 20:09:55 -07:00
Ilpo Järvinen
0c54b85f28 tcp: simplify tcp_current_mss
There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That will cost us divide as is,
so slice the function in two. Also, the only users of the
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable into tcp_sock
at all! The drop of xmit_goal_size currently leaves 16-bit
hole and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with u16
xmit_goal_size_segs to cache the results of the remaining
divide to get that tso on regression).

Bring xmit_goal_size parts into tcp.c

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-15 20:09:54 -07:00
Eric Dumazet
8bdd663aba net: reorder fields of struct socket
On x86_64, its rather unfortunate that "wait_queue_head_t wait"
field of "struct socket" spans two cache lines (assuming a 64
bytes cache line in current cpus)

offsetof(struct socket, wait)=0x30
sizeof(wait_queue_head_t)=0x18

This might explain why Kenny Chang noticed that his multicast workload
was performing bad with 64 bit kernels, since more cache lines ping pongs
were involved.

This litle patch moves "wait" field next "fasync_list" so that both
fields share a single cache line, to speedup sock_def_readable()

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-15 19:59:13 -07:00
Linus Torvalds
fbd8104c2e Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (23 commits)
  [ARM] Fix virtual to physical translation macro corner cases
  [ARM] update mach-types
  [ARM] 5421/1: ftrace: fix crash due to tracing of __naked functions
  MX1 fix include
  [ARM] 5419/1: ep93xx: fix build warnings about struct i2c_board_info
  [ARM] 5418/1: restore lr before leaving mcount
  ARM: OMAP: board-omap3beagle: set i2c-3 to 100kHz
  ARM: OMAP: Allow I2C bus driver to be compiled as a module
  ARM: OMAP: sched_clock() corrected
  ARM: OMAP: Fix compile error if pm.h is included
  [ARM] orion5x: pass dram mbus data to xor driver
  [ARM] S3C64XX: Fix s3c64xx_setrate_clksrc
  [ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/irq.c
  [ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/s3c6400-clock.c
  [ARM] S3C64XX: Fix USB host clock mux list
  [ARM] S3C64XX: Fix name of USB host clock.
  [ARM] S3C64XX: Rename IRQ_UHOST to IRQ_USBH
  [ARM] S3C64XX: Do gpiolib configuration earlier
  [ARM] S3C64XX: Staticise s3c64xx_init_irq_eint()
  [ARM] SMDK6410: Declare iodesc table static
  ...
2009-03-15 13:34:56 -07:00
Linus Torvalds
18553c38bc Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Fix Xilinx SystemACE driver to handle empty CF slot
  block: fix memory leak in bio_clone()
  block: Add gfp_mask parameter to bio_integrity_clone()
2009-03-14 13:43:18 -07:00
un'ichi Nomura
87092698c6 block: Add gfp_mask parameter to bio_integrity_clone()
Stricter gfp_mask might be required for clone allocation.
For example, request-based dm may clone bio in interrupt context
so it has to use GFP_ATOMIC.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-14 21:06:51 +01:00
Linus Torvalds
f1823acfbc Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...
  SUNRPC: xprt_connect() don't abort the task if the transport isn't bound
  SUNRPC: Fix an Oops due to socket not set up yet...
  Bug 11061, NFS mounts dropped
  NFS: Handle -ESTALE error in access()
  NLM: Fix GRANT callback address comparison when IPv6 is enabled
  NLM: Shrink the IPv4-only version of nlm_cmp_addr()
  NFSv3: Fix posix ACL code
  NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)
  SUNRPC: Tighten up the task locking rules in __rpc_execute()
2009-03-14 12:00:18 -07:00
Ingo Molnar
0ca0f16fd1 Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug', 'x86/kconfig', 'x86/mm', 'x86/ptrace', 'x86/setup' and 'x86/urgent'; commit 'v2.6.29-rc8' into x86/core 2009-03-14 16:25:40 +01:00
Pallipadi, Venkatesh
895791dac6 VM, x86, PAT: add a new vm flag to track full pfnmap at mmap
Impact: cleanup

Add a new vm flag VM_PFN_AT_MMAP to identify a PFNMAP that is
fully mapped with remap_pfn_range. Patch removes the overloading
of VM_INSERTPAGE from the earlier patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Nick Piggin <npiggin@suse.de>
LKML-Reference: <20090313233543.GA19909@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 09:47:44 +01:00
Gabriele Paoloni
9c705260fe ppp: ppp_mp_explode() redesign
I found the PPP subsystem to not work properly when connecting channels
with different speeds to the same bundle.

Problem Description:

As the "ppp_mp_explode" function fragments the sk_buff buffer evenly
among the PPP channels that are connected to a certain PPP unit to
make up a bundle, if we are transmitting using an upper layer protocol
that requires an Ack before sending the next packet (like TCP/IP for
example), we will have a bandwidth bottleneck on the slowest channel
of the bundle.

Let's clarify by an example. Let's consider a scenario where we have
two PPP links making up a bundle: a slow link (10KB/sec) and a fast
link (1000KB/sec) working at the best (full bandwidth). On the top we
have a TCP/IP stack sending a 1000 Bytes sk_buff buffer down to the
PPP subsystem. The "ppp_mp_explode" function will divide the buffer in
two fragments of 500B each (we are neglecting all the headers, crc,
flags etc?.). Before the TCP/IP stack sends out the next buffer, it
will have to wait for the ACK response from the remote peer, so it
will have to wait for both fragments to have been sent over the two
PPP links, received by the remote peer and reconstructed. The
resulting behaviour is that, rather than having a bundle working
@1010KB/sec (the sum of the channels bandwidths), we'll have a bundle
working @20KB/sec (the double of the slowest channels bandwidth).


Problem Solution:

The problem has been solved by redesigning the "ppp_mp_explode"
function in such a way to make it split the sk_buff buffer according
to the speeds of the underlying PPP channels (the speeds of the serial
interfaces respectively attached to the PPP channels). Referring to
the above example, the redesigned "ppp_mp_explode" function will now
divide the 1000 Bytes buffer into two fragments whose sizes are set
according to the speeds of the channels where they are going to be
sent on (e.g .  10 Byets on 10KB/sec channel and 990 Bytes on
1000KB/sec channel).  The reworked function grants the same
performances of the original one in optimal working conditions (i.e. a
bundle made up of PPP links all working at the same speed), while
greatly improving performances on the bundles made up of channels
working at different speeds.

Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 16:09:12 -07:00
Marcin Slusarz
a390d1f379 phylib: convert state_queue work to delayed_work
It closes a race in phy_stop_machine when reprogramming of phy_timer
(from phy_state_machine) happens between del_timer_sync and cancel_work_sync.

Without this change it could lead to crash if phy_device would be freed after
phy_stop_machine (timer would fire and schedule freed work).

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 15:41:19 -07:00
Eric Moore
dec3f95959 [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:57:40 -05:00
Douglas Gilbert
4ab3b73f85 [SCSI] bsg: add linux/types.h include to bsg.h
Since bsg.h has recently been added to the list of kernel
headers that should be exported to the user space, this
attachment makes bsg.h more user space "friendly".
Specifically autotools dislike headers that don't compile
freestanding and bsg.h's use of __u32 types (and friends)
are not standard C (C90 or C99). The inclusion of
linux/types.h fixes that.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:33:25 -05:00
FUJITA Tomonori
5d82720a7f ide: save the returned value of dma_map_sg
dma_map_sg could return a value different to 'nents' argument of
dma_map_sg so the ide stack needs to save it for the later usage
(e.g. for_each_sg).

The ide stack also needs to save the original sg_nents value for
pci_unmap_sg.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
[bart: backport to Linus' tree]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-13 21:16:13 +01:00
Yi Zou
4d288d5767 [SCSI] net: add FCoE offload support through net_device
This adds support to provide Fiber Channel over Ethernet (FCoE) offload
through net_device's net_device_ops struct. The offload through net_device
for FCoE is enabled in kernel as built-in or module driver.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:12:12 -05:00
Chris Leech
01d5b2fca1 [SCSI] net: define feature flags for FCoE offloads
Define feature flags for FCoE offloads.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:11:07 -05:00
Chris Leech
43eb99c5b3 [SCSI] net: reclaim 8 upper bits of the netdev->features from GSO
Reclaim 8 upper bits of netdev->features from GSO.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:10:23 -05:00
Yi Zou
211c738d86 [SCSI] net, fcoe: add ETH_P_FCOE for Fibre Channel over Ethernet (FCoE)
This adds eth type ETH_P_FCOE for Fibre Channel over Ethernet (FCoE),
consequently, the ETH_P_FCOE from fc_fcoe.h and fcoe skb->protocol
is not set as ETH_P_FCOE.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:09:00 -05:00
Neil Horman
273ae44b9c Network Drop Monitor: Adding Build changes to enable drop monitor
Network Drop Monitor: Adding Build changes to enable drop monitor

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 include/linux/Kbuild |    1 +
 net/Kconfig          |   11 +++++++++++
 net/core/Makefile    |    1 +
 3 files changed, 13 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 12:09:29 -07:00
Neil Horman
9a8afc8d39 Network Drop Monitor: Adding drop monitor implementation & Netlink protocol
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 include/linux/net_dropmon.h |   56 +++++++++
 net/core/drop_monitor.c     |  263 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 319 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 12:09:29 -07:00
Neil Horman
ead2ceb0ec Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line points for skbs
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 include/linux/skbuff.h |    4 +++-
 net/core/datagram.c    |    2 +-
 net/core/skbuff.c      |   22 ++++++++++++++++++++++
 net/ipv4/arp.c         |    2 +-
 net/ipv4/udp.c         |    2 +-
 net/packet/af_packet.c |    2 +-
 6 files changed, 29 insertions(+), 5 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 12:09:28 -07:00
Ingo Molnar
0634023562 Merge branch 'x86/core' into x86/kconfig 2009-03-13 17:08:30 +01:00
Frederic Weisbecker
bed1ffca02 tracing/syscalls: core infrastructure for syscalls tracing, enhancements
Impact: new feature

This adds the generic support for syscalls tracing. This is
currently exploited through a devoted tracer but other tracing
engines can use it. (They just have to play with
{start,stop}_ftrace_syscalls() and use the display callbacks
unless they want to override them.)

The syscalls prototypes definitions are abused here to steal
some metadata informations:

- syscall name, param types, param names, number of params

The syscall addr is not directly saved during this definition
because we don't know if its prototype is available in the
namespace. But we don't really need it. The arch has just to
build a function able to resolve the syscall number to its
metadata struct.

The current tracer prints the syscall names, parameters names
and values (and their types optionally). Currently the value is
a raw hex but higher level values diplaying is on my TODO list.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 16:57:42 +01:00
Rusty Russell
082edb7bf4 numa, cpumask: move numa_node_id default implementation to topology.h
Impact: cleanup, potential bugfix

Not sure what changed to expose this, but clearly that numa_node_id()
doesn't belong in mmzone.h (the inline in gfp.h is probably overkill, too).

In file included from include/linux/topology.h:34,
                 from arch/x86/mm/numa.c:2:
/home/rusty/patches-cpumask/linux-2.6/arch/x86/include/asm/topology.h:64:1: warning: "numa_node_id" redefined
In file included from include/linux/topology.h:32,
                 from arch/x86/mm/numa.c:2:
include/linux/mmzone.h:770:1: warning: this is the location of the previous definition

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Mike Travis <travis@sgi.com>
LKML-Reference: <200903132343.37661.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 14:35:31 +01:00
Thomas Gleixner
a9d0a1a383 genirq: add doc to struct irqaction
Impact: documentation

struct irqaction is not documented. Add kernel doc comments and add
interrupt.h to the genirq docbook.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-13 14:32:29 +01:00
Thomas Gleixner
bedd30d986 genirq: make irqreturn_t an enum
Impact: cleanup

Remove the 2.4 compabiliy cruft

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
2009-03-13 14:32:29 +01:00
Thomas Gleixner
3dd3d46b78 genirq: remove unused hw_irq_controller typedef
hw_irq_controller is unused. Remove the typedef

Impact: cleanup

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-13 14:32:28 +01:00
Lai Jiangshan
e94142a67f ftrace: remove struct list_head from struct dyn_ftrace
Impact: save memory

The struct dyn_ftrace table is very large, this patch will save
about 50%.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2C9F.8020009@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 11:36:20 +01:00
Ingo Molnar
d1dedb52ac panic, smp: provide smp_send_stop() wrapper on UP too
Impact: cleanup, no code changed

Remove an ugly #ifdef CONFIG_SMP from panic(), by providing
an smp_send_stop() wrapper on UP too.

LKML-Reference: <49B91A7E.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 11:24:31 +01:00
Ingo Molnar
cd80a8142e Merge branch 'x86/core' into core/ipi 2009-03-13 11:05:58 +01:00
Ingo Molnar
62a394eb77 Merge branches 'tracing/ftrace' and 'tracing/syscalls'; commit 'v2.6.29-rc8' into tracing/core 2009-03-13 10:23:39 +01:00
Frederic Weisbecker
ee08c6eccb tracing/ftrace: syscall tracing infrastructure, basics
Provide basic callbacks to do syscall tracing.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-2-git-send-email-fweisbec@gmail.com>
[ simplified it to a trace_printk() for now. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 06:25:43 +01:00
Ingo Molnar
238a5b4bff Merge branch 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-x86 into cpus4096 2009-03-13 05:54:55 +01:00
Ingo Molnar
17d85bc756 Merge commit 'v2.6.29-rc8' into cpus4096 2009-03-13 05:54:43 +01:00
Rusty Russell
a70f730282 cpumask: replace node_to_cpumask with cpumask_of_node.
Impact: cleanup

node_to_cpumask (and the blecherous node_to_cpumask_ptr which
contained a declaration) are replaced now everyone implements
cpumask_of_node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:46 +10:30
Ingo Molnar
f6411fe7e0 Merge branches 'sched/clock', 'sched/urgent' and 'linus' into sched/core 2009-03-13 04:50:44 +01:00
Pallipadi, Venkatesh
4bb9c5c021 VM, x86, PAT: Change is_linear_pfn_mapping to not use vm_pgoff
Impact: fix false positive PAT warnings - also fix VirtalBox hang

Use of vma->vm_pgoff to identify the pfnmaps that are fully
mapped at mmap time is broken. vm_pgoff is set by generic mmap
code even for cases where drivers are setting up the mappings
at the fault time.

The problem was originally reported here:

 http://marc.info/?l=linux-kernel&m=123383810628583&w=2

Change is_linear_pfn_mapping logic to overload VM_INSERTPAGE
flag along with VM_PFNMAP to mean full PFNMAP setup at mmap
time.

Problem also tracked at:

 http://bugzilla.kernel.org/show_bug.cgi?id=12800

Reported-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha>@intel.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: "ebiederm@xmission.com" <ebiederm@xmission.com>
Cc: <stable@kernel.org> # only for 2.6.29.1, not .28
LKML-Reference: <20090313004527.GA7176@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 04:28:50 +01:00
Jason Baron
5d592b44b2 tracing: tracepoints for softirq entry/exit - add softirq-to-name array
Create a 'softirq_to_name' array, which is indexed by softirq #, so
that we can easily convert between the softirq index # and its name, in
order to get more meaningful output messages.

LKML-Reference: <20090312183336.GB3352@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-12 21:15:02 -04:00
Frederic Weisbecker
48ead02030 tracing/core: bring back raw trace_printk for dynamic formats strings
Impact: fix callsites with dynamic format strings

Since its new binary implementation, trace_printk() internally uses static
containers for the format strings on each callsites. But the value is
assigned once at build time, which means that it can't take dynamic
formats.

So this patch unearthes the raw trace_printk implementation for the callers
that will need trace_printk to be able to carry these dynamic format
strings. The trace_printk() macro will use the appropriate implementation
for each callsite. Most of the time however, the binary implementation will
still be used.

The other impact of this patch is that mmiotrace_printk() will use the old
implementation because it calls the low level trace_vprintk and we can't
guess here whether the format passed in it is dynamic or not.

Some parts of this patch have been written by Steven Rostedt (most notably
the part that chooses the appropriate implementation for each callsites).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-12 21:15:00 -04:00
Ingo Molnar
25d500067d Merge branch 'linus' into core/ipi 2009-03-13 02:14:25 +01:00
Ingo Molnar
480c93df5b Merge branch 'core/locking' into tracing/ftrace 2009-03-13 01:33:21 +01:00
Ingo Molnar
d820ac4c2f locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]
Impact: cleanup

The naming clashes with upcoming softirq tracepoints, so rename the
APIs to lockdep_*().

Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 01:32:36 +01:00
Ingo Molnar
3c1f67d60e Merge branch 'linus' into core/locking 2009-03-13 01:29:17 +01:00
Uwe Kleine-König
446c92b290 [ARM] 5421/1: ftrace: fix crash due to tracing of __naked functions
This is a fix for the following crash observed in 2.6.29-rc3:
http://lkml.org/lkml/2009/1/29/150

On ARM it doesn't make sense to trace a naked function because then
mcount is called without stack and frame pointer being set up and there
is no chance to restore the lr register to the value before mcount was
called.

Reported-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Tested-by: Matthias Kaehlcke <matthias@kaehlcke.net>

Cc: Abhishek Sagar <sagar.abhishek@gmail.com>
Cc: Steven Rostedt <rostedt@home.goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-03-12 21:33:03 +00:00
Boaz Harrosh
71969fd9e2 [SCSI] major.h: char-major number for OSD device driver
Allocate major 260 for osd.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Torben Mathiasen <device@lanana.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:05 -05:00
Magnus Damm
cbf94f0682 irq: match remove_irq() args with setup_irq()
Modify remove_irq() to match setup_irq().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120551.2926.43942.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-12 13:16:33 +01:00
Magnus Damm
f21cfb258d irq: add remove_irq() for freeing of setup_irq() irqs
Impact: add new API

This patch adds a remove_irq() function for releasing
interrupts requested with setup_irq().

Without this patch we have no way of releasing such
interrupts since free_irq() today tries to kfree()
the irqaction passed with setup_irq().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090312120542.2926.56609.sendpatchset@rx1.opensource.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-12 13:16:32 +01:00
Ingo Molnar
f8cb22cbb8 Merge branch 'linus' into irq/genirq 2009-03-12 13:16:18 +01:00
Rusty Russell
45e575ab9b cpumask: mm_cpumask for accessing the struct mm_struct's cpu_vm_mask.
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
mm_cpumask() now and won't see a difference as the changes roll into
linux-next.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-12 14:35:44 +10:30
Rusty Russell
76e6eee033 cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
tsk_cpumask() now and won't see a difference as the changes roll into
linux-next.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-12 14:35:44 +10:30
Tom Talpey
441e3e2429 SUNRPC: dynamically load RPC transport modules on-demand
Provide an api to attempt to load any necessary kernel RPC
client transport module automatically. By convention, the
desired module name is "xprt"+"transport name". For example,
when NFS mounting with "-o proto=rdma", attempt to load the
"xprtrdma" module.

Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:37:56 -04:00
Trond Myklebust
72cb77f4a5 NFS: Throttle page dirtying while we're flushing to disk
The following patch is a combination of a patch by myself and Peter
Staubach.

Trond: If we allow other processes to dirty pages while a process is doing
a consistency sync to disk, we can end up never making progress.

Peter: Attached is a patch which addresses a continuing problem with
the NFS client generating out of order WRITE requests.  While
this is compliant with all of the current protocol
specifications, there are servers in the market which can not
handle out of order WRITE requests very well.  Also, this may
lead to sub-optimal block allocations in the underlying file
system on the server.  This may cause the read throughputs to
be reduced when reading the file from the server.

Peter: There has been a lot of work recently done to address out of
order issues on a systemic level.  However, the NFS client is
still susceptible to the problem.  Out of order WRITE
requests can occur when pdflush is in the middle of writing
out pages while the process dirtying the pages calls
generic_file_buffered_write which calls
generic_perform_write which calls
balance_dirty_pages_rate_limited which ends up calling
writeback_inodes which ends up calling back into the NFS
client to writes out dirty pages for the same file that
pdflush happens to be working with.

Signed-off-by: Peter Staubach <staubach@redhat.com>
[modification by Trond to merge the two similar patches]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:30 -04:00
Trond Myklebust
fb8a1f11b6 NFS: cleanup - remove struct nfs_inode->ncommit
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:29 -04:00
Trond Myklebust
a65318bf3a NFSv4: Simplify some cache consistency post-op GETATTRs
Certain asynchronous operations such as write() do not expect
(or care) that other metadata such as the file owner, mode, acls, ...
change. All they want to do is update and/or check the change attribute,
ctime, and mtime.
By skipping the file owner and group update, we also avoid having to do a
potential idmapper upcall for these asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:28 -04:00
Trond Myklebust
bca794785c NFS: Fix the type of struct nfs_fattr->mode
There is no point in using anything other than umode_t, since we copy the
content pretty much directly into inode->i_mode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:26 -04:00
Trond Myklebust
1ca277d88d NFS: Shrink the struct nfs_fattr
We don't need the bitmap[] field anymore, since the 'valid' field tells us
all we need to know about which attributes were filled in...
Also move the pre-op attributes in order to improve the structure packing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:25 -04:00
Trond Myklebust
9e6e70f8d8 NFSv4: Support NFSv4 optional attributes in the struct nfs_fattr
Currently, filling struct nfs_fattr is more or less an all or nothing
operation, since NFSv2 and NFSv3 have only mandatory attributes.
In NFSv4, some attributes are optional, and so we may simply not be able to
fill in those fields. Furthermore, NFSv4 allows you to specify which
attributes you are interested in retrieving, thus permitting you to
optimise away retrieval of attributes that you know will no change...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:24 -04:00
Mark Brown
aaf1e176fa ASoC: Add initial driver for the WM8400 CODEC
The WM8400 is a highly integrated audio CODEC and power management unit
intended for mobile multimedia application.  This driver supports the
primary audio CODEC features, including:

 - 1W speaker driver
 - Fully differential headphone output
 - Up to 4 differential microphone inputs

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2009-03-11 13:49:46 +00:00
Ingo Molnar
78b020d035 Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core 2009-03-11 10:49:15 +01:00
Ingo Molnar
65a37b29a8 Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu 2009-03-11 10:30:23 +01:00
Ingo Molnar
1d8ce7bc4d Merge branch 'linus' into core/percpu
Conflicts:
	arch/x86/include/asm/fixmap_64.h
2009-03-11 10:29:28 +01:00
Benjamin Herrenschmidt
e14eee56c2 Merge commit 'origin/master' into next 2009-03-11 17:10:07 +11:00
Chuck Lever
78851e1aa4 NLM: Shrink the IPv4-only version of nlm_cmp_addr()
Clean up/micro-optimatization:  Make the AF_INET-only version of
nlm_cmp_addr() smaller.  This matches the style of
nlm_privileged_requester(), and makes the AF_INET-only version of
nlm_cmp_addr() nearly the same size as it was before IPv6 support.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-10 20:33:19 -04:00
Trond Myklebust
ae46141ff0 NFSv3: Fix posix ACL code
Fix a memory leak due to allocation in the XDR layer. In cases where the
RPC call needs to be retransmitted, we end up allocating new pages without
clearing the old ones. Fix this by moving the allocation into
nfs3_proc_setacls().

Also fix an issue discovered by Kevin Rudd, whereby the amount of memory
reserved for the acls in the xdr_buf->head was miscalculated, and causing
corruption.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-10 20:33:18 -04:00
Ingo Molnar
e2b8b28085 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-10 22:55:31 +01:00
Ingo Molnar
4dd163a051 Merge branches 'tracing/ftrace', 'tracing/textedit' and 'linus' into tracing/core 2009-03-10 22:54:23 +01:00
Steven Rostedt
ef18012b24 tracing: remove funky whitespace in the trace code
Impact: clean up

There existed a lot of <space><tab>'s in the tracing code. This
patch removes them.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 14:13:14 -04:00
Steven Rostedt
823f9124fb tracing: document TRACE_EVENT macro in tracepoint.h
Impact: clean up / comments

Kosaki Motohiro asked about an explanation to the TRACE_EVENT macro.
Ingo Molnar replied with a nice description.

This patch takes the description that Ingo wrote (with some slight
modifications) and adds it to the tracepoint.h file.

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 12:58:51 -04:00
Steven Rostedt
30a8fecc2d tracing: flip the TP_printk and TP_fast_assign in the TRACE_EVENT macro
Impact: clean up

In trying to stay consistant with the C style format in the TRACE_EVENT
macro, it makes more sense to do the printk after the assigning of
the variables.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 12:41:38 -04:00
Ingo Molnar
8c54436ae9 Merge branches 'sched/cleanups' and 'linus' into sched/core 2009-03-10 16:34:43 +01:00
Ingo Molnar
8293dd6f86 Merge branch 'x86/core' into tracing/ftrace
Semantic merge:

  kernel/trace/trace_functions_graph.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-10 10:17:48 +01:00
Ingo Molnar
9a1043d19c Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-10 09:57:16 +01:00
Ingo Molnar
12e87e36e0 Merge branches 'tracing/doc', 'tracing/ftrace', 'tracing/printk' and 'linus' into tracing/core 2009-03-10 09:56:25 +01:00
Ingo Molnar
467c88fee5 Merge branches 'x86/apic', 'x86/asm', 'x86/fixmap', 'x86/memtest', 'x86/mm', 'x86/urgent', 'linus' and 'core/percpu' into x86/core 2009-03-10 09:26:38 +01:00
Tejun Heo
66c3a75772 percpu: generalize embedding first chunk setup helper
Impact: code reorganization

Separate out embedding first chunk setup helper from x86 embedding
first chunk allocator and put it in mm/percpu.c.  This will be used by
the default percpu first chunk allocator and possibly by other archs.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-03-10 16:27:48 +09:00
Tejun Heo
6074d5b0a3 percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk()
Impact: cleanup, more flexibility for first chunk init

Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
This restriction stemmed from implementation detail and made things a
bit less intuitive.  This patch allows @dyn_size to be specified
regardless of @unit_size and swaps the positions of @dyn_size and
@unit_size so that the parameter order makes more sense (static,
reserved and dyn sizes followed by enclosing unit_size).

While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-03-10 16:27:48 +09:00
Paul Mundt
e161183ba6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-03-10 15:17:26 +09:00
Steven Rostedt
157587d7ac tracing: remove obsolete TRACE_EVENT_FORMAT macro
Impact: clean up

The TRACE_EVENT_FORMAT macro is no longer used by trace points
and only the DECLARE_TRACE, TRACE_FORMAT or TRACE_EVENT macros should
be used by them. Although the TRACE_EVENT_FORMAT macro is still used
by the internal tracing utility, it should not be used in core
kernel code.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 00:35:12 -04:00
Steven Rostedt
da4d03020c tracing: new format for specialized trace points
Impact: clean up and enhancement

The TRACE_EVENT_FORMAT macro looks quite ugly and is limited in its
ability to save data as well as to print the record out. Working with
Ingo Molnar, we came up with a new format that is much more pleasing to
the eye of C developers. This new macro is more C style than the old
macro, and is more obvious to what it does.

Here's the example. The only updated macro in this patch is the
sched_switch trace point.

The old method looked like this:

 TRACE_EVENT_FORMAT(sched_switch,
        TP_PROTO(struct rq *rq, struct task_struct *prev,
                struct task_struct *next),
        TP_ARGS(rq, prev, next),
        TP_FMT("task %s:%d ==> %s:%d",
              prev->comm, prev->pid, next->comm, next->pid),
        TRACE_STRUCT(
                TRACE_FIELD(pid_t, prev_pid, prev->pid)
                TRACE_FIELD(int, prev_prio, prev->prio)
                TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
                                    next_comm,
                                    TP_CMD(memcpy(TRACE_ENTRY->next_comm,
                                                 next->comm,
                                                 TASK_COMM_LEN)))
                TRACE_FIELD(pid_t, next_pid, next->pid)
                TRACE_FIELD(int, next_prio, next->prio)
        ),
        TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
        );

The above method is hard to read and requires two format fields.

The new method:

 /*
  * Tracepoint for task switches, performed by the scheduler:
  *
  * (NOTE: the 'rq' argument is not used by generic trace events,
  *        but used by the latency tracer plugin. )
  */
 TRACE_EVENT(sched_switch,

	TP_PROTO(struct rq *rq, struct task_struct *prev,
		 struct task_struct *next),

	TP_ARGS(rq, prev, next),

	TP_STRUCT__entry(
		__array(	char,	prev_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	prev_pid			)
		__field(	int,	prev_prio			)
		__array(	char,	next_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	next_pid			)
		__field(	int,	next_prio			)
	),

	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
		__entry->next_comm, __entry->next_pid, __entry->next_prio),

	TP_fast_assign(
		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
		__entry->prev_pid	= prev->pid;
		__entry->prev_prio	= prev->prio;
		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
		__entry->next_pid	= next->pid;
		__entry->next_prio	= next->prio;
	)
 );

This macro is called TRACE_EVENT, it is broken up into 5 parts:

 TP_PROTO:        the proto type of the trace point
 TP_ARGS:         the arguments of the trace point
 TP_STRUCT_entry: the structure layout of the entry in the ring buffer
 TP_printk:       the printk format
 TP_fast_assign:  the method used to write the entry into the ring buffer

The structure is the definition of how the event will be saved in the
ring buffer. The printk is used by the internal tracing in case of
an oops, and the kernel needs to print out the format of the record
to the console. This the TP_printk gives a means to show the records
in a human readable format. It is also used to print out the data
from the trace file.

The TP_fast_assign is executed directly. It is basically like a C function,
where the __entry is the handle to the record.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 00:35:07 -04:00
Steven Rostedt
2939b0469d tracing: replace TP<var> with TP_<var>
Impact: clean up

The macros TPPROTO, TPARGS, TPFMT, TPRAWFMT, and TPCMD all look a bit
ugly. This patch adds an underscore to their names.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-10 00:35:04 -04:00
Michael Hennerich
b4be468cc1 Input: add AD7879 Touchscreen driver
[randy.dunlap@oracle.com: don't use bus_id]
[dtor@mail.ru: locking and other fixups]
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-03-09 20:14:10 -07:00
Linus Torvalds
99adcd9d67 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
  Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
2009-03-09 13:23:59 -07:00
Dave Jones
129f8ae9b1 Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
This reverts commit e088e4c9cd.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
 - Find out the remaining causes of overheating, and fix them
   if possible. ACPI should be doing the right thing automatically.
   If it isn't, we need to fix that.
 - mark p4-clockmod ui as deprecated
 - try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-03-09 15:07:33 -04:00
Linus Torvalds
df0b4a5080 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  p54: fix race condition in memory management
  cfg80211: test before subtraction on unsigned
  iwlwifi: fix error flow in iwl*_pci_probe
  rt2x00 : more devices to rt73usb.c
  rt2x00 : more devices to rt2500usb.c
  bonding: Fix device passed into ->ndo_neigh_setup().
  vlan: Fix vlan-in-vlan crashes.
  net: Fix missing dev->neigh_setup in register_netdevice().
  tmspci: fix request_irq race
  pkt_sched: act_police: Fix a rate estimator test.
  tg3: Fix 5906 link problems
  SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
  IPv6: add "disable" module parameter support to ipv6.ko
  sungem: another error printed one too early
  aoe: error printed 1 too early
  net pcmcia: worklimit reaches -1
  net: more timeouts that reach -1
  net: fix tokenring license
  dm9601: new vendor/product IDs
  netlink: invert error code in netlink_set_err()
  ...
2009-03-09 09:15:40 -07:00
Ingo Molnar
7bffc23e56 tracing: optimize trace_printk()
Impact: micro-optimization

trace_printk() does this unconditionally:

	trace_printk_fmt = fmt;

Where trace_printk_fmt is an entry into a global array. This is
very SMP-unfriendly.

So only write it once per bootup.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-09 10:11:36 +01:00
Daniel Mack
73969ff0ed Input: generic driver for rotary encoders on GPIOs
This patch adds a generic driver for rotary encoders connected to GPIO
pins of a system. It relies on gpiolib and generic hardware irqs. The
documentation that also comes with this patch explains the concept and
how to use the driver.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-03-08 16:35:53 -07:00
Linus Torvalds
5dc18f51a2 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmatest: fix use after free in dmatest_exit
  ipu_idmac: fix spinlock type
  iop-adma, mv_xor: fix mem leak on self-test setup failure
  fsldma: fix off by one in dma_halt
  I/OAT: fail self-test if callback test reaches timeout
  I/OAT: update driver version and copyright dates
  I/OAT: list usage cleanup
  I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
  I/OAT: cancel watchdog before dma remove
  I/OAT: fail initialization on zero channels detection
  I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
  I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
  dmaengine: update kerneldoc
2009-03-08 10:23:05 -07:00
Linus Torvalds
fd6ec5f3ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ata: add CFA specific identify data words
  remove stale comment from <linux/hdreg.h>
  AT91: initialize Compact Flash on AT91SAM9263 cpu
  ide: add at91_ide driver
  ide: allow to wrap interrupt handler
  ide-iops: fix odd-length ATAPI PIO transfers
  ide: NULL noise: drivers/ide/ide-*.c
  ide: expiry() returns int, negative expiry() return values won't be noticed
2009-03-08 10:22:22 -07:00
Linus Torvalds
83d5a32510 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Don't trust current capacity values in identify words 57-58
  libata: make sure port is thawed when skipping resets
  sata_nv: fix module parameter description
  ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
  libata: don't use on-stack sense buffer
  libata: align ap->sector_buf
  libata: fix dma_unmap_sg misuse
  libata: change drive ready wait after hard reset to 5s
2009-03-08 10:22:01 -07:00
Linus Torvalds
2a50b2560e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: serio - fix protocol number for TouchIT213
2009-03-08 10:14:19 -07:00
Ingo Molnar
dba58e39ce Merge branches 'tracing/doc', 'tracing/ftrace', 'tracing/printk' and 'tracing/textedit' into tracing/core 2009-03-08 16:48:51 +01:00
Dmitry Torokhov
ab96ddec72 Input: serio - fix protocol number for TouchIT213
Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.

Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-03-07 18:41:38 -08:00
Frederic Weisbecker
769b0441f4 tracing/core: drop the old trace_printk() implementation in favour of trace_bprintk()
Impact: faster and lighter tracing

Now that we have trace_bprintk() which is faster and consume lesser
memory than trace_printk() and has the same purpose, we can now drop
the old implementation in favour of the binary one from trace_bprintk(),
which means we move all the implementation of trace_bprintk() to
trace_printk(), so the Api doesn't change except that we must now use
trace_seq_bprintk() to print the TRACE_PRINT entries.

Some changes result of this:

- Previously, trace_bprintk depended of a single tracer and couldn't
  work without. This tracer has been dropped and the whole implementation
  of trace_printk() (like the module formats management) is now integrated
  in the tracing core (comes with CONFIG_TRACING), though we keep the file
  trace_printk (previously trace_bprintk.c) where we can find the module
  management. Thus we don't overflow trace.c

- changes some parts to use trace_seq_bprintk() to print TRACE_PRINT entries.

- change a bit trace_printk/trace_vprintk macros to support non-builtin formats
  constants, and fix 'const' qualifiers warnings. But this is all transparent for
  developers.

- etc...

V2:

- Rebase against last changes
- Fix mispell on the changelog

V3:

- Rebase against last changes (moving trace_printk() to kernel.h)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 17:59:12 +01:00
Lai Jiangshan
1ba28e02a1 tracing: add trace_bprintk()
Impact: add a generic printk() for tracing, like trace_printk()

trace_bprintk() uses the infrastructure to record events on ring_buffer.

[ fweisbec@gmail.com: ported to latest -tip, made it work if
  !CONFIG_MODULES, never free the format strings from modules
  because we can't keep track of them and conditionnaly create
  the ftrace format strings section (reported by Steven Rostedt) ]

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 17:59:11 +01:00
Lai Jiangshan
1427cdf059 tracing: infrastructure for supporting binary record
Impact: save on memory for tracing

Current tracers are typically using a struct(like struct ftrace_entry,
struct ctx_switch_entry, struct special_entr etc...)to record a binary
event. These structs can only record a their own kind of events.
A new kind of tracer need a new struct and a lot of code too handle it.

So we need a generic binary record for events. This infrastructure
is for this purpose.

[fweisbec@gmail.com: rebase against latest -tip, make it safe while sched
tracing as reported by Steven Rostedt]

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 17:59:11 +01:00
Ingo Molnar
546e5354a6 Merge branch 'core/printk' into tracing/ftrace 2009-03-06 17:45:42 +01:00
Lai Jiangshan
4370aa4aa7 vsprintf: add binary printf
Impact: add new APIs for binary trace printk infrastructure

vbin_printf(): write args to binary buffer, string is copied
when "%s" is occurred.

bstr_printf(): read from binary buffer for args and format a string

[fweisbec@gmail.com: rebase]

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1236356510-8381-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 17:39:04 +01:00
Mathieu Desnoyers
0e39ac4446 tracing, Text Edit Lock - Architecture Independent Code
This is an architecture independant synchronization around kernel text
modifications through use of a global mutex.

A mutex has been chosen so that kprobes, the main user of this, can sleep
during memory allocation between the memory read of the instructions it
must replace and the memory write of the breakpoint.

Other user of this interface: immediate values.

Paravirt and alternatives are always done when SMP is inactive, so there
is no need to use locks.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <49B142D8.7020601@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 16:48:59 +01:00
Ingo Molnar
f0ef039851 Merge branch 'x86/core' into tracing/textedit
Conflicts:
	arch/x86/Kconfig
	block/blktrace.c
	kernel/irq/handle.c

Semantic conflict:
	kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 16:45:01 +01:00
Ingo Molnar
bc722f508a Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-06 11:40:37 +01:00
Ingo Molnar
1609743970 Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core 2009-03-06 11:39:18 +01:00
Tejun Heo
6b19b0c240 x86, percpu: setup reserved percpu area for x86_64
Impact: fix relocation overflow during module load

x86_64 uses 32bit relocations for symbol access and static percpu
symbols whether in core or modules must be inside 2GB of the percpu
segement base which the dynamic percpu allocator doesn't guarantee.
This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
first chunk so that module percpu areas are always allocated from the
first chunk which is always inside the relocatable range.

This problem exists for any percpu allocator but is easily triggered
when using the embedding allocator because the second chunk is located
beyond 2GB on it.

This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
that it only indicates the size of the area to reserve for dynamic
allocation as static and dynamic areas can be separate.  New
PERCPU_DYNAMIC_RESERVED is increased by 4k for both 32 and 64bits as
the reserved area separation eats away some allocatable space and
having slightly more headroom (currently between 4 and 8k after
minimal boot sans module area) makes sense for common case
performance.

x86_32 can address anywhere from anywhere and doesn't need reserving.

Mike Galbraith first reported the problem first and bisected it to the
embedding percpu allocator commit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
2009-03-06 14:33:59 +09:00
Tejun Heo
edcb463997 percpu, module: implement reserved allocation and use it for module percpu variables
Impact: add reserved allocation functionality and use it for module
	percpu variables

This patch implements reserved allocation from the first chunk.  When
setting up the first chunk, arch can ask to set aside certain number
of bytes right after the core static area which is available only
through a separate reserved allocator.  This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module perpcu symbols are inside
the relocatable range.

If reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation.  If the first chunk also
includes piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation.  The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.

If arch doesn't setup reserved area, reserved allocation is handled
like any other allocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-03-06 14:33:59 +09:00
Tejun Heo
cafe8816b2 percpu: use negative for auto for pcpu_setup_first_chunk() arguments
Impact: argument semantic cleanup

In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant
auto-sizing.  It's okay for @unit_size as 0 doesn't make sense but 0
dynamic reserve size is valid.  Alos, if arch @dyn_size is calculated
from other parameters, it might end up passing in 0 @dyn_size and
malfunction when the size is automatically adjusted.

This patch makes both @unit_size and @dyn_size ssize_t and use -1 for
auto sizing.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-03-06 14:33:59 +09:00
Tejun Heo
2441d15c97 percpu: cosmetic renames in pcpu_setup_first_chunk()
Impact: cosmetic, preparation for future changes

Make the following renames in pcpur_setup_first_chunk() in preparation
for future changes.

* s/free_size/dyn_size/
* s/static_vm/first_vm/
* s/static_chunk/schunk/

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-03-06 14:33:59 +09:00
Tejun Heo
6a242909b0 percpu: clean up percpu constants
Impact: cleaup

Make the following cleanups.

* There isn't much arch-specific about PERCPU_MODULE_RESERVE.  Always
  define it whether arch overrides PERCPU_ENOUGH_ROOM or not.

* blackfin overrides PERCPU_ENOUGH_ROOM to align static area size.  Do
  it by default.

* percpu allocation sizes doesn't have much to do with the page size.
  Don't use PAGE_SHIFT in their definition.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bryan Wu <cooloney@kernel.org>
2009-03-06 14:33:58 +09:00
Ingo Molnar
a1413c89ae Merge branch 'x86/urgent' into x86/core
Conflicts:
	arch/x86/include/asm/fixmap_64.h
Semantic merge:
	arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 21:48:50 +01:00
Michael Buesch
e79c1ba84c ssb: Add SPROM fallback support
This adds SSB functionality to register a fallback SPROM image from the
architecture setup code.

Weird architectures exist that have half-assed SSB devices without SPROM attached to
their PCI busses. The architecture can register a fallback SPROM image that is
used if no SPROM is found on the SSB device.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Florian Fainelli <florian@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-05 14:39:32 -05:00
Joerg Roedel
a31fba5d68 dma-debug: add checks for sync_single_sg_*
Impact: add debug callbacks for dma_sync_sg_* functions

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:21 +01:00
Joerg Roedel
948408ba3e dma-debug: add checks for sync_single_range_*
Impact: add debug callbacks for dma_sync_single_range_for_* functions

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:21 +01:00
Joerg Roedel
b9d2317e0c dma-debug: add checks for sync_single_*
Impact: add debug callbacks for dma_sync_single_for_* functions

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:20 +01:00
Joerg Roedel
6bfd449876 dma-debug: add checking for [alloc|free]_coherent
Impact: add debug callbacks for dma_[alloc|free]_coherent

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:19 +01:00
Joerg Roedel
972aa45cea dma-debug: add add checking for map/unmap_sg
Impact: add debug callbacks for dma_{un}map_sg

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:18 +01:00
Joerg Roedel
f62bc980e6 dma-debug: add checking for map/unmap_page/single
Impact: add debug callbacks for dma_{un}map_[page|single]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:18 +01:00
Joerg Roedel
6bf078715c dma-debug: add initialization code
Impact: add code to initialize dma-debug core data structures

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 20:35:15 +01:00
Sergei Shtylyov
d42ad15b75 ata: add CFA specific identify data words
Declare CFA specific identify data words 162 and 163 for future use.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
[bart: update patch summary/description]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-05 17:20:55 +01:00
Steven Rostedt
2002c258fa tracing: add tracing_on/tracing_off to kernel.h
Impact: cleanup

The functions tracing_start/tracing_stop have been moved to kernel.h.
These are not the functions a developer most likely wants to use
when they want to insert a place to stop tracing and restart it from
user space.

tracing_start/tracing_stop was created to work with things like
suspend to ram, where even calling smp_processor_id() can crash the
system. The tracing_start/tracing_stop was used to stop the tracer from
doing anything. These are still light weight functions, but add a bit
more overhead to be able to stop the tracers. They also have no interface
back to userland. That is, if the kernel calls tracing_stop, userland
can not start tracing.

What a developer most likely wants to use is tracing_on/tracing_off.
These are very light weight functions (simply sets or clears a bit).
These functions just stop recording into the ring buffer. The tracers
don't even know that this happens except that they would receive NULL
from the ring_buffer_lock_reserve function.

Also, there's a way for the user land to enable or disable this bit.
In debugfs/tracing/tracing_on, a user may echo "0" (same as tracing_off())
or echo "1" (same as tracing_on()) into this file. This becomes handy when
a kernel developer is debugging and wants tracing to turn off when it
hits an anomaly. Then the developer can examine the trace, and restart
tracing if they want to try again (echo 1 > tracing_on).

This patch moves the prototypes for tracing_on/tracing_off to kernel.h
and comments their use, so that a kernel developer will know how
to use them.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-05 10:35:56 -05:00
Bartlomiej Zolnierkiewicz
ebcad5aaea remove stale comment from <linux/hdreg.h>
HDIO_GET_IDENTITY returns 256 words currently.

Noticed by Norman Diamond.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-05 16:10:59 +01:00
Stanislaw Gruszka
849d713000 ide: allow to wrap interrupt handler
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Andrew Victor <linux@maxim.org.za>
[bart: minor checkpatch.pl / CodingStyle fixups]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-03-05 16:10:57 +01:00
Joerg Roedel
f2f45e5f3c dma-debug: add header file and core data structures
Impact: add groundwork for DMA-API debugging

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-05 15:11:12 +01:00
Tejun Heo
84bda12af3 libata: align ap->sector_buf
ap->sector_buf is used as DMA target and should at least be aligned on
cacheline.  This caused problems on some embedded machines.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-05 07:25:02 -05:00
FUJITA Tomonori
5825627c94 libata: fix dma_unmap_sg misuse
libata passes the returned value of dma_map_sg() to
dma_unmap_sg(),which is the misuse of dma_unmap_sg().

DMA-mapping.txt says:

To unmap a scatterlist, just call:

	pci_unmap_sg(pdev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

PLEASE NOTE:  The 'nents' argument to the pci_unmap_sg call must be
              the _same_ one you passed into the pci_map_sg call,
	      it should _NOT_ be the 'count' value _returned_ from the
              pci_map_sg call.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-05 07:24:57 -05:00
Stuart Hayes
e7d3ef13d5 libata: change drive ready wait after hard reset to 5s
This fixes problems during resume with drives that take longer than 1s to
be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to be
ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which also
times out, then an IDENTIFY command is sent to the drive which times out,
and finally the error handler will try to send another hard reset which
will finally get things working.

Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-03-05 07:24:42 -05:00
Ingo Molnar
7df4edb07c Merge branch 'linus' into core/iommu 2009-03-05 12:47:28 +01:00
Frederic Weisbecker
0012693ad4 tracing/function-graph-tracer: use the more lightweight local clock
Impact: decrease hangs risks with the graph tracer on slow systems

Since the function graph tracer can spend too much time on timer
interrupts, it's better now to use the more lightweight local
clock. Anyway, the function graph traces are more reliable on a
per cpu trace.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49af243d.06e9300a.53ad.ffff840c@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 12:14:41 +01:00
Ingo Molnar
49d2d266ad Merge commit 'v2.6.29-rc7' into sched/core 2009-03-05 11:59:10 +01:00
Ingo Molnar
a140feab42 Merge commit 'v2.6.29-rc7' into core/locking 2009-03-05 11:45:22 +01:00
David S. Miller
508827ff0a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/tokenring/tmspci.c
	drivers/net/ucc_geth_mii.c
2009-03-05 02:06:47 -08:00
Ingo Molnar
526211bc58 tracing: move utility functions from ftrace.h to kernel.h
Make common utility functions such as trace_printk() and
tracing_start()/tracing_stop() generally available to kernel
code.

Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 10:28:45 +01:00
Ingo Molnar
5e1607a00b tracing: rename ftrace_printk() => trace_printk()
Impact: cleanup

Use a more generic name - this also allows the prototype to move
to kernel.h and be generally available to kernel developers who
want to do some quick tracing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 10:24:48 +01:00
David S. Miller
77827a7cf3 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2009-03-04 23:59:54 -08:00
David S. Miller
9d40bbda59 vlan: Fix vlan-in-vlan crashes.
As analyzed by Patrick McHardy, vlan needs to reset it's
netdev_ops pointer in it's ->init() function but this
leaves the compat method pointers stale.

Add a netdev_resync_ops() and call it from the vlan code.

Any other driver which changes ->netdev_ops after register_netdevice()
will need to call this new function after doing so too.

With help from Patrick McHardy.

Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04 23:46:25 -08:00
Ingo Molnar
28b1bd1cbc Merge branch 'core/locking' into tracing/ftrace 2009-03-04 18:49:19 +01:00
Ingo Molnar
2602c3ba45 Merge branch 'rfc/splice/tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-04 11:15:42 +01:00
Ingo Molnar
a1be621dfa Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc7' into tracing/core 2009-03-04 11:14:47 +01:00
Eric Biederman
0c5c2d3089 neigh: Allow for user space users of the neighbour table
Currently it is possible to do just about everything with the arp table
from user space except treat an entry like you are using it.  To that end
implement and a flag NTF_USE that when set in a netwlink update request
treats the neighbour table entry like the kernel does on the output path.

This allows user space applications to share the kernel's arp cache.

Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04 00:03:08 -08:00
Geert Uytterhoeven
a1d2f09544 crypto: compress - Add pcomp interface
The current "comp" crypto interface supports one-shot (de)compression only,
i.e. the whole data buffer to be (de)compressed must be passed at once, and
the whole (de)compressed data buffer will be received at once.
In several use-cases (e.g. compressed file systems that store files in big
compressed blocks), this workflow is not suitable.
Furthermore, the "comp" type doesn't provide for the configuration of
(de)compression parameters, and always allocates workspace memory for both
compression and decompression, which may waste memory.

To solve this, add a "pcomp" partial (de)compression interface that provides
the following operations:
  - crypto_compress_{init,update,final}() for compression,
  - crypto_decompress_{init,update,final}() for decompression,
  - crypto_{,de}compress_setup(), to configure (de)compression parameters
    (incl. allocating workspace memory).

The (de)compression methods take a struct comp_request, which was mimicked
after the z_stream object in zlib, and contains buffer pointer and length
pairs for input and output.

The setup methods take an opaque parameter pointer and length pair. Parameters
are supposed to be encoded using netlink attributes, whose meanings depend on
the actual (name of the) (de)compression algorithm.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-03-04 15:05:33 +08:00
Steven Rostedt
ef7a4a1614 ring-buffer: fix ring_buffer_read_page
The ring_buffer_read_page was broken if it were to only copy part
of the page. This patch fixes that up as well as adds a parameter
to allow a length field, in order to only copy part of the buffer page.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-03 20:51:24 -05:00
Ingo Molnar
91d75e209b Merge branch 'x86/core' into core/percpu 2009-03-04 02:29:19 +01:00
Ingo Molnar
8b0e5860cb Merge branches 'x86/apic', 'x86/cpu', 'x86/fixmap', 'x86/mm', 'x86/sched', 'x86/setup-lzma', 'x86/signal' and 'x86/urgent' into x86/core 2009-03-04 02:22:31 +01:00
Linus Torvalds
219f170a85 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: don't allow setuid to succeed if the user does not have rt bandwidth
  sched_rt: don't start timer when rt bandwidth disabled
2009-03-03 14:33:20 -08:00
Linus Torvalds
b24746c7be Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Teach RCU that idle task is not quiscent state at boot
2009-03-03 14:32:04 -08:00
Linus Torvalds
2d44947a56 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  fix warning in io_mapping_map_wc()
  x86: i915 needs pgprot_writecombine() and is_io_mapping_possible()
2009-03-02 15:47:01 -08:00
Ingo Molnar
654952bcb9 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-03-02 23:14:02 +01:00
Uwe Kleine-Koenig
c79a61f557 tracing: make CALLER_ADDRx overwriteable
The current definition of CALLER_ADDRx isn't suitable for all platforms.
E.g. for ARM __builtin_return_address(N) doesn't work for N > 0 and
AFAIK for powerpc there are no frame pointers needed to have a working
__builtin_return_address.  This patch allows defining the CALLER_ADDRx
macros in <asm/ftrace.h> and let these take precedence.

Because now <asm/ftrace.h> is included unconditionally in
<linux/ftrace.h> all archs that don't already had this include get an
empty one for free.

Signed-off-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-02 16:49:37 -05:00
Ingo Molnar
fdfa66ab45 Merge branches 'tracing/ftrace', 'tracing/mmiotrace' and 'linus' into tracing/core 2009-03-02 22:37:35 +01:00
Ingo Molnar
c02368a9d0 Merge branch 'linus' into irq/genirq 2009-03-02 22:08:56 +01:00
Randy Dunlap
d3a21be86c skbuff.h: fix timestamps kernel-doc
Fix skbuff.h kernel-doc for timestamps: must include "struct" keyword,
otherwise there are kernel-doc errors:

Error(linux-next-20090227//include/linux/skbuff.h:161): cannot understand prototype: 'struct skb_shared_hwtstamps '
Error(linux-next-20090227//include/linux/skbuff.h:177): cannot understand prototype: 'union skb_shared_tx '

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:15:58 -08:00
Inaky Perez-Gonzalez
c747583d19 wimax/i2400m: implement RX reorder support
Allow the device to give the driver RX data with reorder information.

When that is done, the device will indicate the driver if a packet has
to be held in a (sorted) queue. It will also tell the driver when held
packets have to be released to the OS.

This is done to improve the WiMAX-protocol level retransmission
support when missing frames are detected.

The code docs provide details about the implementation.

In general, this just hooks into the RX path in rx.c; if a packet with
the reorder bit in the RX header is detected, the reorder information
in the header is extracted and one of the four main reorder operations
are executed. In one case (queue) no packet will be delivered to the
networking stack, just queued, whereas in the others (reset, update_ws
and queue_update_ws), queued packet might be delivered depending on
the window start for the specific queue.

The modifications to files other than rx.c are:

- control.c: during device initialization, enable reordering support
  if the rx_reorder_disabled module parameter is not enabled

- driver.c: expose a rx_reorder_disable module parameter and call
  i2400m_rx_setup/release() to initialize/shutdown RX reorder
  support.

- i2400m.h: introduce members in 'struct i2400m' needed for
  implementing reorder support.

- linux/i2400m.h: introduce TLVs, commands and constant definitions
  related to RX reorder

Last but not least, the rx reorder code includes an small circular log
where the last N reorder operations are recorded to be displayed in
case of inconsistency. Otherwise diagnosing issues would be almost
impossible.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:10:28 -08:00
Inaky Perez-Gonzalez
fd5c565c0c wimax/i2400m: support extended data RX protocol (no need to reallocate skbs)
Newer i2400m firmwares (>= v1.4) extend the data RX protocol so that
each packet has a 16 byte header. This header is mainly used to
implement host reordeing (which is addressed in later commits).

However, this header also allows us to overwrite it (once data has
been extracted) with an Ethernet header and deliver to the networking
stack without having to reallocate the skb (as it happened in fw <=
v1.3) to make room for it.

- control.c: indicate the device [dev_initialize()] that the driver
  wants to use the extended data RX protocol. Also involves adding the
  definition of the needed data types in include/linux/wimax/i2400m.h.

- rx.c: handle the new payload type for the extended RX data
  protocol. Prepares the skb for delivery to
  netdev.c:i2400m_net_erx().

- netdev.c: Introduce i2400m_net_erx() that adds the fake ethernet
  address to a prepared skb and delivers it to the networking
  stack.

- cleanup: in most instances in rx.c, the variable 'single' was
  renamed to 'single_last' for it better conveys its meaning.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:10:26 -08:00
Kay Sievers
347707baa7 wimax: struct device - replace bus_id with dev_name(), dev_set_name()
Cc: inaky.perez-gonzalez@intel.com
Cc: linux-wimax@intel.com
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:10:26 -08:00
Inaky Perez-Gonzalez
8987691a4a wimax/i2400m: allow control of the base-station idle mode timeout
For power saving reasons, WiMAX links can be put in idle mode while
connected after a certain time of the link not being used for tx or
rx. In this mode, the device pages the base-station regularly and when
data is ready to be transmitted, the link is revived.

This patch allows the user to control the time the device has to be
idle before it decides to go to idle mode from a sysfs
interace.

It also updates the initialization code to acknowledge the module
variable 'idle_mode_disabled' when the firmware is a newer version
(upcoming 1.4 vs 2.6.29's v1.3).

The method for setting the idle mode timeout in the older firmwares is
much more limited and can be only done at initialization time. Thus,
the sysfs file will return -ENOSYS on older ones.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:10:25 -08:00
Ingo Molnar
5512b3ece0 Merge branches 'sched/clock', 'sched/urgent' and 'linus' into sched/core 2009-03-02 12:02:36 +01:00
Ilpo Järvinen
cabeccbd17 tcp: kill eff_sacks "cache", the sole user can calculate itself
Also fixes insignificant bug that would cause sending of stale
SACK block (would occur in some corner cases).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:00:16 -08:00
Ingo Molnar
f180053694 x86, mm: dont use non-temporal stores in pagecache accesses
Impact: standardize IO on cached ops

On modern CPUs it is almost always a bad idea to use non-temporal stores,
as the regression in this commit has shown it:

  30d697f: x86: fix performance regression in write() syscall

The kernel simply has no good information about whether using non-temporal
stores is a good idea or not - and trying to add heuristics only increases
complexity and inserts fragility.

The regression on cached write()s took very long to be found - over two
years. So dont take any chances and let the hardware decide how it makes
use of its caches.

The only exception is drivers/gpu/drm/i915/i915_gem.c: there were we are
absolutely sure that another entity (the GPU) will pick up the dirty
data immediately and that the CPU will not touch that data before the
GPU will.

Also, keep the _nocache() primitives to make it easier for people to
experiment with these details. There may be more clear-cut cases where
non-cached copies can be used, outside of filemap.c.

Cc: Salman Qazi <sqazi@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02 11:06:49 +01:00
Pallipadi, Venkatesh
5ce04e3de8 fix warning in io_mapping_map_wc()
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02 09:43:32 +01:00
David S. Miller
aa4abc9bcc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-tx.c
	net/8021q/vlan_core.c
	net/core/dev.c
2009-03-01 21:35:16 -08:00
Ingo Molnar
55f2b78995 Merge branch 'x86/urgent' into x86/pat 2009-03-01 12:47:58 +01:00
Ingo Molnar
f5c1aa1537 Revert "gpu/drm, x86, PAT: PAT support for io_mapping_*"
This reverts commit 17581ad812.

Sitsofe Wheeler reported that /dev/dri/card0 is MIA on his EeePC 900
and bisected it to this commit.

Graphics card is an i915 in an EeePC 900:

 00:02.0 VGA compatible controller [0300]:
   Intel Corporation Mobile 915GM/GMS/910GML
     Express Graphics Controller [8086:2592] (rev 04)

( Most likely the ioremap() of the driver failed and hence the card
  did not initialize. )

Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-01 12:47:49 +01:00
Chris Leech
709ab3261e net headers: export dcbnl.h
The DCB netlink interface is required for building the userspace tools
available at e1000.sourceforge.net

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-01 00:19:36 -08:00
Chris Leech
5c25222180 net headers: cleanup dcbnl.h
1) add an include for <linux/types.h>
2) change dcbmsg.dcb_family from unsigned char to __u8 to be more
   consistent with use of kernel types

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-01 00:19:35 -08:00
David S. Miller
8010dc306b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-02-28 22:32:16 -08:00
David S. Miller
18963caaf5 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2009-02-28 15:36:58 -08:00
Ingo Molnar
acdb2c2879 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-28 10:15:58 +01:00
Steven Rostedt
629928041c tracing: create the C style tracing for the sched subsystem
This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style
faster tracing for the sched subsystem trace points.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28 04:04:13 -05:00
Linus Torvalds
535d8e8f19 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: enable DMAR by default
  xen: disable interrupts early, as start_kernel expects
  gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_t
  gpu/drm, x86, PAT: Handle io_mapping_create_wc() errors in a clean way
  x86, Voyager: fix compile by lifting the degeneracy of phys_cpu_present_map
  x86, doc: fix references to Documentation/x86/i386/boot.txt
2009-02-27 16:43:05 -08:00
David Howells
5170836679 Fix recursive lock in free_uid()/free_user_ns()
free_uid() and free_user_ns() are corecursive when CONFIG_USER_SCHED=n,
but free_user_ns() is called from free_uid() by way of uid_hash_remove(),
which requires uidhash_lock to be held.  free_user_ns() then calls
free_uid() to complete the destruction.

Fix this by deferring the destruction of the user_namespace.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-27 16:26:21 -08:00
Jouni Malinen
98c8a60a04 nl80211: Provide access to STA TX/RX packet counters
The TX/RX packet counters are needed to fill in RADIUS Accounting
attributes Acct-Output-Packets and Acct-Input-Packets. We already
collect the needed information, but only the TX/RX bytes were
previously exposed through nl80211. Allow applications to fetch the
packet counters, too, to provide more complete support for accounting.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-27 14:52:39 -05:00
Dhaval Giani
54e9912428 sched: don't allow setuid to succeed if the user does not have rt bandwidth
Impact: fix hung task with certain (non-default) rt-limit settings

Corey Hickey reported that on using setuid to change the uid of a
rt process, the process would be unkillable and not be running.
This is because there was no rt runtime for that user group. Add
in a check to see if a user can attach an rt task to its task group.
On failure, return EINVAL, which is also returned in
CONFIG_CGROUP_SCHED.

Reported-by: Corey Hickey <bugfood-ml@fatooh.org>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-27 11:11:53 +01:00
Ingo Molnar
f701d35407 Merge branches 'tracing/ftrace' and 'linus' into tracing/core 2009-02-27 09:04:43 +01:00
Magnus Damm
bdaa6e8062 sh: multiple vectors per irq - base
Instead of keeping the single vector -> single linux irq mapping
we extend the intc code to support merging of vectors to a single
linux irq. This helps processors such as sh7750, sh7780 and sh7785
which have more vectors than masking ability. With this patch in
place we can modify the intc tables to use one irq per maskable
irq source. Please note the following:

 - If multiple vectors share the same enum then only the
   first vector will be available as a linux irq.

 - Drivers may need to be rewritten to get pending irq
   source from the hardware block instead of irq number.

This patch together with the sh7785 specific intc tables solves
DMA controller irq issues related to buggy interrupt masking.

Reported-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-02-27 16:53:50 +09:00
Andy Grover
db49b9d26c RDS: Add userspace header
Applications include this header in order to use RDS sockets.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26 23:42:11 -08:00
Andy Grover
8a7c4c7726 RDS: Add AF and PF #defines for RDS sockets
RDS is a reliable datagram protocol used for IPC on Oracle
database clusters. This adds address and protocol family numbers
for it.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26 23:41:38 -08:00
Ingo Molnar
1b49061d40 Merge branch 'sched/clock' into tracing/ftrace
Conflicts:
	kernel/sched_clock.c
2009-02-27 08:35:19 +01:00
Adrian McMenamin
b233b28eac sh: maple: Support block reads and writes.
This patch updates the maple bus to support asynchronous block reads
and writes as well as generally improving the quality of the code and
supporting concurrency (all needed to support the Dreamcast visual
memory unit - a driver will also be posted for that).

Changes in the bus driver necessitate some changes in the two maple bus
input drivers that are currently in mainline.

As well as supporting block reads and writes this code clean up removes
some poor handling of locks, uses an atomic status variable to serialise
access to devices and more robusly handles the general performance
problems of the bus.

Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-02-27 16:07:32 +09:00
Ingo Molnar
5d0859cef2 Merge branch 'sched/clock' into tracing/ftrace
Conflicts:
	kernel/sched_clock.c
2009-02-26 21:21:59 +01:00
Ingo Molnar
b342501cd3 sched: allow architectures to specify sched_clock_stable
Allow CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures to still specify
that their sched_clock() implementation is reliable.

This will be used by x86 to switch on a faster sched_clock_cpu()
implementation on certain CPU types.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 21:20:22 +01:00
Ingo Molnar
14131f2f98 tracing: implement trace_clock_*() APIs
Impact: implement new tracing timestamp APIs

Add three trace clock variants, with differing scalability/precision
tradeoffs:

 -   local: CPU-local trace clock
 -  medium: scalable global clock with some jitter
 -  global: globally monotonic, serialized clock

Make the ring-buffer use the local trace clock internally.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 18:44:06 +01:00
Ingo Molnar
4434e51564 Merge branches 'sched/cleanups', 'sched/urgent' and 'linus' into sched/core 2009-02-26 13:22:13 +01:00
Jens Axboe
1e42807918 block: reduce stack footprint of blk_recount_segments()
blk_recalc_rq_segments() requires a request structure passed in, which
we don't have from blk_recount_segments(). So the latter allocates one on
the stack, using > 400 bytes of stack for that. This can cause us to spill
over one page of stack from ext4 at least:

 0)     4560     400   blk_recount_segments+0x43/0x62
 1)     4160      32   bio_phys_segments+0x1c/0x24
 2)     4128      32   blk_rq_bio_prep+0x2a/0xf9
 3)     4096      32   init_request_from_bio+0xf9/0xfe
 4)     4064     112   __make_request+0x33c/0x3f6
 5)     3952     144   generic_make_request+0x2d1/0x321
 6)     3808      64   submit_bio+0xb9/0xc3
 7)     3744      48   submit_bh+0xea/0x10e
 8)     3696     368   ext4_mb_init_cache+0x257/0xa6a [ext4]
 9)     3328     288   ext4_mb_regular_allocator+0x421/0xcd9 [ext4]
10)     3040     160   ext4_mb_new_blocks+0x211/0x4b4 [ext4]
11)     2880     336   ext4_ext_get_blocks+0xb61/0xd45 [ext4]
12)     2544      96   ext4_get_blocks_wrap+0xf2/0x200 [ext4]
13)     2448      80   ext4_da_get_block_write+0x6e/0x16b [ext4]
14)     2368     352   mpage_da_map_blocks+0x7e/0x4b3 [ext4]
15)     2016     352   ext4_da_writepages+0x2ce/0x43c [ext4]
16)     1664      32   do_writepages+0x2d/0x3c
17)     1632     144   __writeback_single_inode+0x162/0x2cd
18)     1488      96   generic_sync_sb_inodes+0x1e3/0x32b
19)     1392      16   sync_sb_inodes+0xe/0x10
20)     1376      48   writeback_inodes+0x69/0xb3
21)     1328     208   balance_dirty_pages_ratelimited_nr+0x187/0x2f9
22)     1120     224   generic_file_buffered_write+0x1d4/0x2c4
23)      896     176   __generic_file_aio_write_nolock+0x35f/0x393
24)      720      80   generic_file_aio_write+0x6c/0xc8
25)      640      80   ext4_file_write+0xa9/0x137 [ext4]
26)      560     320   do_sync_write+0xf0/0x137
27)      240      48   vfs_write+0xb3/0x13c
28)      192      64   sys_write+0x4c/0x74
29)      128     128   system_call_fastpath+0x16/0x1b

Split the segment counting out into a __blk_recalc_rq_segments() helper
to avoid allocating an onstack request just for checking the physical
segment count.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Ingo Molnar
ecc25fbd6b Merge branches 'x86/apic', 'x86/defconfig', 'x86/memtest', 'x86/mm' and 'linus' into x86/core 2009-02-26 06:31:32 +01:00
Ingo Molnar
801c0be814 Merge branches 'x86/urgent' and 'x86/pat' into x86/core
Conflicts:
	arch/x86/include/asm/pat.h
2009-02-26 06:31:23 +01:00
Ingo Molnar
13b2eda64d Merge branch 'x86/urgent' into x86/core
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
2009-02-26 06:30:42 +01:00
Paul E. McKenney
a682604838 rcu: Teach RCU that idle task is not quiscent state at boot
This patch fixes a bug located by Vegard Nossum with the aid of
kmemcheck, updated based on review comments from Nick Piggin,
Ingo Molnar, and Andrew Morton.  And cleans up the variable-name
and function-name language.  ;-)

The boot CPU runs in the context of its idle thread during boot-up.
During this time, idle_cpu(0) will always return nonzero, which will
fool Classic and Hierarchical RCU into deciding that a large chunk of
the boot-up sequence is a big long quiescent state.  This in turn causes
RCU to prematurely end grace periods during this time.

This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
function to ignore the idle task as a quiescent state until the
system has started up the scheduler in rest_init(), introducing a
new non-API function rcu_idle_now_means_idle() to inform RCU of this
transition.  RCU maintains an internal rcu_idle_cpu_truthful variable
to track this state, which is then used by rcu_check_callback() to
determine if it should believe idle_cpu().

Because this patch has the effect of disallowing RCU grace periods
during long stretches of the boot-up sequence, this patch also introduces
Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
no-op if num_online_cpus() returns 1.  This allows boot-time code that
calls synchronize_rcu() to proceed normally.  Note, however, that RCU
callbacks registered by call_rcu() will likely queue up until later in
the boot sequence.  Although rcuclassic and rcutree can also use this
same optimization after boot completes, rcupreempt must restrict its
use of this optimization to the portion of the boot sequence before the
scheduler starts up, given that an rcupreempt RCU read-side critical
section may be preeempted.

In addition, this patch takes Nick Piggin's suggestion to make the
system_state global variable be __read_mostly.

Changes since v4:

o	Changes the name of the introduced function and variable to
	be less emotional.  ;-)

Changes since v3:

o	WARN_ON(nr_context_switches() > 0) to verify that RCU
	switches out of boot-time mode before the first context
	switch, as suggested by Nick Piggin.

Changes since v2:

o	Created rcu_blocking_is_gp() internal-to-RCU API that
	determines whether a call to synchronize_rcu() is itself
	a grace period.

o	The definition of rcu_blocking_is_gp() for rcuclassic and
	rcutree checks to see if but a single CPU is online.

o	The definition of rcu_blocking_is_gp() for rcupreempt
	checks to see both if but a single CPU is online and if
	the system is still in early boot.

	This allows rcupreempt to again work correctly if running
	on a single CPU after booting is complete.

o	Added check to rcupreempt's synchronize_sched() for there
	being but one online CPU.

Tested all three variants both SMP and !SMP, booted fine, passed a short
rcutorture test on both x86 and Power.

Located-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 04:08:14 +01:00
Ingo Molnar
f4abfb8d0d Merge branch 'tip/tracing/ftrace' of ssh://master.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-26 03:48:44 +01:00
Ingo Molnar
e36b1e136a Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'linus' into tracing/core 2009-02-26 03:47:27 +01:00
Steven Rostedt
3cdfdf91fc tracing: wrap arguments with PARAMS
Peter Zijlstra warned that TPPROTO and TPARGS might become something
other than a simple copy of itself. To prevent this from having
side effects in the TRACE_FORMAT macro in tracepoint.h, we add a
PARAMS() macro to be defined as just a wrapper.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25 21:44:26 -05:00
Steven Rostedt
eef62a6826 tracing: rename DEFINE_TRACE_FMT to just TRACE_FORMAT
There's been a bit confusion to whether DEFINE/DECLARE_TRACE_FMT should
be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it
TRACE_FORMAT.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25 21:44:22 -05:00
Tejun Heo
e317603694 percpu: fix too low alignment restriction on UP
UP __alloc_percpu() triggered WARN_ON_ONCE() if the requested
alignment is larger than that of unsigned long long, which is too
small for all the cacheline aligned allocations.  Bump it up to
SMP_CACHE_BYTES which kmalloc allocations generally guarantee.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 10:54:17 +09:00
Bartlomiej Zolnierkiewicz
8fed436841 ide: fix refcounting in device drivers
During host driver module removal del_gendisk() results in a final
put on drive->gendev and freeing the drive by drive_release_dev().

Convert device drivers from using struct kref to use struct device
so device driver's object holds reference on ->gendev and prevents
drive from prematurely going away.

Also fix ->remove methods to not erroneously drop reference on a
host driver by using only put_device() instead of ide*_put().

Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
Tested-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:24 +01:00
Ingo Molnar
2b9d1496e7 time: ntp: make 64-bit constants more robust
Impact: cleanup, no functionality changed

 - make PPM_SCALE an explicit s64 constant, to
   remove (s64) casts from usage sites.

kernel/time/ntp.o:

   text	   data	    bss	    dec	    hex	filename
   2536	    114	    136	   2786	    ae2	ntp.o.before
   2536	    114	    136	   2786	    ae2	ntp.o.after

md5:
   40a7728d1188aa18e83e21a81fa7b150  ntp.o.before.asm
   40a7728d1188aa18e83e21a81fa7b150  ntp.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 18:38:15 +01:00
Linus Torvalds
60042600c5 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: fix endless "Unknown DMAR structure type" loop
  VT-d: handle Invalidation Queue Error to avoid system hang
  intel-iommu: fix build error with INTR_REMAP=y and DMAR=n
2009-02-25 09:31:21 -08:00
Ingo Molnar
d2b0261506 alloc_percpu: fix UP build
Impact: build fix

the !SMP branch had a 'gfp' leftover:

 include/linux/percpu.h: In function '__alloc_percpu':
 include/linux/percpu.h:160: error: 'gfp' undeclared (first use in this function)
 include/linux/percpu.h:160: error: (Each undeclared identifier is reported only once
 include/linux/percpu.h:160: error: for each function it appears in.)

Use GFP_KERNEL like the SMP version does.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 14:38:12 +01:00
Peter Zijlstra
6e2756376c generic-ipi: remove CSD_FLAG_WAIT
Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
the code so that we can use CSD_FLAG_LOCK for both purposes.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 14:13:44 +01:00
Venkatesh Pallipadi
17581ad812 gpu/drm, x86, PAT: PAT support for io_mapping_*
Make io_mapping_create_wc and io_mapping_free go through PAT to make sure
that there are no memory type aliases.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 13:09:52 +01:00
Venkatesh Pallipadi
4ab0d47d0a gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_t
io_mapping_create_wc should take a resource_size_t parameter in place of
unsigned long. With unsigned long, there will be no way to map greater than 4GB
address in i386/32 bit.

On x86, greater than 4GB addresses cannot be mapped on i386 without PAE. Return
error for such a case.

Patch also adds a structure for io_mapping, that saves the base, size and
type on HAVE_ATOMIC_IOMAP archs, that can be used to verify the offset on
io_mapping_map calls.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 13:09:51 +01:00
Ingo Molnar
3255aa2eb6 x86, mm: pass in 'total' to __copy_from_user_*nocache()
Impact: cleanup, enable future change

Add a 'total bytes copied' parameter to __copy_from_user_*nocache(),
and update all the callsites.

The parameter is not used yet - architecture code can use it to
more intelligently decide whether the copy should be cached or
non-temporal.

Cc: Salman Qazi <sqazi@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 10:20:03 +01:00
David S. Miller
f11c179eea Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/orinoco/orinoco.c
2009-02-25 00:02:05 -08:00
Pablo Neira Ayuso
1ce85fe402 netlink: change nlmsg_notify() return value logic
This patch changes the return value of nlmsg_notify() as follows:

If NETLINK_BROADCAST_ERROR is set by any of the listeners and
an error in the delivery happened, return the broadcast error;
else if there are no listeners apart from the socket that
requested a change with the echo flag, return the result of the
unicast notification. Thus, with this patch, the unicast
notification is handled in the same way of a broadcast listener
that has set the NETLINK_BROADCAST_ERROR socket flag.

This patch is useful in case that the caller of nlmsg_notify()
wants to know the result of the delivery of a netlink notification
(including the broadcast delivery) and take any action in case
that the delivery failed. For example, ctnetlink can drop packets
if the event delivery failed to provide reliable logging and
state-synchronization at the cost of dropping packets.

This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-24 23:18:28 -08:00
Steven Rostedt
7c37730cd3 tracing: add DEFINE_TRACE_FMT to tracepoint.h
This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
This allows for the developers to place format strings and
args in with their tracepoint declaration. A tracer may now
override the DEFINE_TRACE_FMT macro and use it to record
a default format.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-24 21:53:32 -05:00
David S. Miller
8b6f92b1bd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-02-24 13:49:05 -08:00
Ingo Molnar
0edcf8d692 Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
	arch/x86/include/asm/pgtable.h
2009-02-24 21:52:45 +01:00
Ingo Molnar
a852cbfaaf Merge branches 'x86/acpi', 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/mm', 'x86/signal' and 'x86/urgent'; commit 'v2.6.29-rc6' into x86/core 2009-02-24 21:50:43 +01:00
Jean Delvare
cd97f39b7c i2c-dev: Clarify the unit of ioctl I2C_TIMEOUT
The unit in which user-space can set the bus timeout value is jiffies
for historical reasons (back when HZ was always 100.) This is however
not good because user-space doesn't know how long a jiffy lasts. The
timeout value should instead be set in a fixed time unit. Given the
original value of HZ, this unit should be 10 ms, for compatibility.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
2009-02-24 19:19:49 +01:00
Ingo Molnar
a7f4463e03 Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc6' into tracing/core 2009-02-24 18:22:39 +01:00
Jan Engelhardt
d060ffc184 netfilter: install missing headers
iptables imports headers from (the unifdefed headers of a)
kernel tree, but some headers happened to not be installed.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-24 15:23:58 +01:00
David S. Miller
e70049b9e7 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2009-02-24 03:50:29 -08:00
Tejun Heo
8d408b4be3 percpu: give more latitude to arch specific first chunk initialization
Impact: more latitude for first percpu chunk allocation

The first percpu chunk serves the kernel static percpu area and may or
may not contain extra room for further dynamic allocation.
Initialization of the first chunk needs to be done before normal
memory allocation service is up, so it has its own init path -
pcpu_setup_static().

It seems archs need more latitude while initializing the first chunk
for example to take advantage of large page mapping.  This patch makes
the following changes to allow this.

* Define PERCPU_DYNAMIC_RESERVE to give arch hint about how much space
  to reserve in the first chunk for further dynamic allocation.

* Rename pcpu_setup_static() to pcpu_setup_first_chunk().

* Make pcpu_setup_first_chunk() much more flexible by fetching page
  pointer by callback and adding optional @unit_size, @free_size and
  @base_addr arguments which allow archs to selectively part of chunk
  initialization to their likings.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-24 11:57:21 +09:00
Tejun Heo
c0c0a29379 vmalloc: add @align to vm_area_register_early()
Impact: allow larger alignment for early vmalloc area allocation

Some early vmalloc users might want larger alignment, for example, for
custom large page mapping.  Add @align to vm_area_register_early().
While at it, drop docbook comment on non-existent @size.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
2009-02-24 11:57:21 +09:00
Tejun Heo
2d0aae4169 bootmem: reorder interface functions and add a missing one
Impact: cleanup and addition of missing interface wrapper

The interface functions in bootmem.h was ordered in not so orderly
manner.  Reorder them such that

* functions allocating the same area group together -
  ie. alloc_bootmem group and alloc_bootmem_low group.

* functions w/o node parameter come before the ones w/ node parameter.

* nopanic variants are immediately below their panicky counterparts.

While at it, add alloc_bootmem_pages_node_nopanic() which was missing.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
2009-02-24 11:57:21 +09:00
Tejun Heo
c132937556 bootmem: clean up arch-specific bootmem wrapping
Impact: cleaner and consistent bootmem wrapping

By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
arch-specific wrappers for bootmem allocation.  However, this is done
a bit strangely in that only the high level convenience macros can be
changed while lower level, but still exported, interface functions
can't be wrapped.  This not only is messy but also leads to strange
situation where alloc_bootmem() does what the arch wants it to do but
the equivalent __alloc_bootmem() call doesn't although they should be
able to be used interchangeably.

This patch updates bootmem such that archs can override / wrap the
backend function - alloc_bootmem_core() instead of the highlevel
interface functions to allow simpler and consistent wrapping.  Also,
HAVE_ARCH_BOOTMEM_NODE is renamed to HAVE_ARCH_BOOTMEM.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@saeurebad.de>
2009-02-24 11:57:20 +09:00
Linus Torvalds
d38e84ee39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netns: fix double free at netns creation
  veth : add the set_mac_address capability
  sunlance: Beyond ARRAY_SIZE of ib->btx_ring
  sungem: another error printed one too early
  ISDN: fix sc/shmem printk format warning
  SMSC: timeout reaches -1
  smsc9420: handle magic field of ethtool_eeprom
  sundance: missing parentheses?
  smsc9420: fix another postfixed timeout
  wimax/i2400m: driver loads firmware v1.4 instead of v1.3
  vlan: Update skb->mac_header in __vlan_put_tag().
  cxgb3: Add support for PCI ID 0x35.
  tcp: remove obsoleted comment about different passes
  TG3: &&/|| confusion
  ATM: misplaced parentheses?
  net/mv643xx: don't disable the mib timer too early and lock properly
  net/mv643xx: use GFP_ATOMIC while atomic
  atl1c: Atheros L1C Gigabit Ethernet driver
  net: Kill skb_truesize_check(), it only catches false-positives.
  net: forcedeth: Fix wake-on-lan regression
2009-02-23 14:36:05 -08:00
David Rientjes
3b89d7d881 slub: move min_partial to struct kmem_cache
Although it allows for better cacheline use, it is unnecessary to save a
copy of the cache's min_partial value in each kmem_cache_node.

Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-23 12:05:41 +02:00
Benjamin Herrenschmidt
35f88e6b06 Merge commit 'ftrace/function-graph' into next 2009-02-23 10:47:23 +11:00
Ingo Molnar
fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
Rafael J. Wysocki
770824bdc4 PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.

This is based on an earlier patch from Linus.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-22 10:33:44 -08:00
Ingo Molnar
c478f87869 Merge branch 'tip/x86/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Conflicts:
	include/linux/ftrace.h
	kernel/trace/ftrace.c
2009-02-22 18:12:01 +01:00
Alexander Clouter
9c3c133b1e hwrng: timeriomem - New driver
Some hardware platforms, the TS-7800[1] is one for example, can
supply the kernel with an entropy source, albeit a slow one for
TS-7800 users, by just reading a particular IO address.  This
source must not be read above a certain rate otherwise the quality
suffers.

The driver is then hooked into by calling
platform_device_(register|add|del) passing a structure similar to:
------
static struct timeriomem_rng_data ts78xx_ts_rng_data = {
        .address        = (u32 *__iomem) TS_RNG,
        .period         = 1000000, /* one second */
};

static struct platform_device ts78xx_ts_rng_device = {
        .name           = "timeriomem_rng",
        .id             = -1,
        .dev            = {
                .platform_data  = &ts78xx_ts_rng_data,
        },
        .num_resources  = 0,
};
------

[1] http://www.embeddedarm.com/products/board-detail.php?product=TS-7800

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-22 12:03:56 +08:00
Linus Torvalds
adfafefd10 Merge branch 'hibernate'
* hibernate:
  PM: Fix suspend_console and resume_console to use only one semaphore
  PM: Wait for console in resume
  PM: Fix pm_notifiers during user mode hibernation
  swsusp: clean up shrink_all_zones()
  swsusp: dont fiddle with swappiness
  PM: fix build for CONFIG_PM unset
  PM/hibernate: fix "swap breaks after hibernation failures"
  PM/resume: wait for device probing to finish
  Consolidate driver_probe_done() loops into one place
2009-02-21 14:17:26 -08:00
Arjan van de Ven
216773a787 Consolidate driver_probe_done() loops into one place
there's a few places that currently loop over driver_probe_done(), and
I'm about to add another one. This patch abstracts it into a helper
to reduce duplication.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21 14:17:17 -08:00
Mauro Carvalho Chehab
b6adea334c 8250: fix boot hang with serial console when using with Serial Over Lan port
Intel 8257x Ethernet boards have a feature called Serial Over Lan.

This feature works by emulating a serial port, and it is detected by
kernel as a normal 8250 port.  However, this emulation is not perfect, as
also noticed on changeset 7500b1f602.

Before this patch, the kernel were trying to check if the serial TX is
capable of work using IRQ's.

This were done with a code similar this:

        serial_outp(up, UART_IER, UART_IER_THRI);
        lsr = serial_in(up, UART_LSR);
        iir = serial_in(up, UART_IIR);
        serial_outp(up, UART_IER, 0);

        if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
		up->bugs |= UART_BUG_TXEN;

This works fine for other 8250 ports, but, on 8250-emulated SoL port, the
chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register.

Due to that, UART_BUG_TXEN is sometimes enabled.  However, as TX IRQ keeps
working, and the TX polling is now enabled, the driver miss-interprets the
IRQ received later, hanging up the machine until a key is pressed at the
serial console.

This is the 6 version of this patch.  Previous versions were trying to
introduce a large enough delay between serial_outp and serial_in(up,
UART_IIR), but not taking forever.  However, the needed delay couldn't be
safely determined.

At the experimental tests, a delay of 1us solves most of the cases, but
still hangs sometimes.  Increasing the delay to 5us was better, but still
doesn't solve.  A very high delay of 50 ms seemed to work every time.

However, poking around with delays and pray for it to be enough doesn't
seem to be a good approach, even for a quirk.

So, instead of playing with random large arbitrary delays, let's just
disable UART_BUG_TXEN for all SoL ports.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:50 -08:00
Michael Buesch
01b24fee28 spi_bitbang: add more lowlevel function documentation
This adds more documentation of the lowlevel API to avoid future bugs.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:49 -08:00
Johannes Weiner
3ef0e5ba46 slab: introduce kzfree()
kzfree() is a wrapper for kfree() that additionally zeroes the underlying
memory before releasing it to the slab allocator.

Currently there is code which memset()s the memory region of an object
before releasing it back to the slab allocator to make sure
security-sensitive data are really zeroed out after use.

These callsites can then just use kzfree() which saves some code, makes
users greppable and allows for a stupid destructor that isn't necessarily
aware of the actual object size.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20 17:57:48 -08:00
Matthew Garrett
b1569e99c7 ACPI: move thermal trip handling to generic thermal layer
The ACPI code currently carries its own thermal trip handling, meaning that
any other thermal implementation will need to reimplement it. Move the code
to the generic thermal layer.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-20 18:41:56 -05:00
Ingo Molnar
b18018126f x86, mm, kprobes: fault.c, simplify notify_page_fault()
Impact: cleanup

Remove an #ifdef from notify_page_fault(). The function still
compiles to nothing in the !CONFIG_KPROBES case.

Introduce kprobes_built_in() and kprobe_fault_handler() helpers
to allow this - they returns 0 if !CONFIG_KPROBES.

No code changed:

   text	   data	    bss	    dec	    hex	filename
   4618	     32	     24	   4674	   1242	fault.o.before
   4618	     32	     24	   4674	   1242	fault.o.after

Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:42 +01:00
Ingo Molnar
b814d41f09 x86, mm: fault.c, simplify kmmio_fault()
Impact: cleanup

Remove an #ifdef from kmmio_fault() - we can do this by
providing default implementations for is_kmmio_active()
and kmmio_handler(). The compiler optimizes it all away
in the !CONFIG_MMIOTRACE case.

Also, while at it, clean up mmiotrace.h a bit:

 - standard header guards
 - standard vertical spaces for structure definitions

No code changed (both with mmiotrace on and off in the config):

   text	   data	    bss	    dec	    hex	filename
   2947	     12	     12	   2971	    b9b	fault.o.before
   2947	     12	     12	   2971	    b9b	fault.o.after

Cc: Pekka Paalanen <pq@iki.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:42 +01:00
Steven Rostedt
000ab69117 ftrace: allow archs to preform pre and post process for code modification
This patch creates the weak functions: ftrace_arch_code_modify_prepare
and ftrace_arch_code_modify_post_process that are called before and
after the stop machine is called to modify the kernel text.

If the arch needs to do pre or post processing, it only needs to define
these functions.

[ Update: Ingo Molnar suggested using the name ftrace_arch_code_modify_*
          over using ftrace_arch_modify_* ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 13:16:18 -05:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Matthew Garrett
6503e5df08 thermal: use integers rather than strings for thermal values
The thermal API currently uses strings to pass values to userspace. This
makes it difficult to use from within the kernel. Change the interface
to use integers and fix up the consumers.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-20 10:52:37 -05:00
Ingo Molnar
057685cf57 Merge branch 'for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 into tracing/kmemtrace
Conflicts:
	mm/slub.c
2009-02-20 12:15:30 +01:00
Christoph Lameter
fe1200b63d SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants
As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-20 12:28:36 +02:00
Pekka Enberg
51735a7ca6 SLUB: Do not pass 8k objects through to the page allocator
Increase the maximum object size in SLUB so that 8k objects are not
passed through to the page allocator anymore. The network stack uses 8k
objects for performance critical operations.

The patch is motivated by a SLAB vs. SLUB regression in the netperf
benchmark. The problem is that the kfree(skb->head) call in
skb_release_data() that is subject to page allocator pass-through as the
size passed to __alloc_skb() is larger than 4 KB in this test.

As explained by Yanmin Zhang:

  I use 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server
  pair loopback testing on x86-64 machines. Comparing with SLUB, SLAB's
  result is about 2.3 times of SLUB's. After applying the reverting patch,
  the result difference between SLUB and SLAB becomes 1% which we might
  consider as fluctuation.

[ penberg@cs.helsinki.fi: fix oops in kmalloc() ]
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-20 12:25:47 +02:00
Christoph Lameter
ffadd4d0fe SLUB: Introduce and use SLUB_MAX_SIZE and SLUB_PAGE_SHIFT constants
As a preparational patch to bump up page allocator pass-through threshold,
introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
mm/slub.c to use them.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-20 12:22:44 +02:00
Adam Nielsen
268cb38e18 netfilter: x_tables: add LED trigger target
Kernel module providing implementation of LED netfilter target.  Each
instance of the target appears as a led-trigger device, which can be
associated with one or more LEDs in /sys/class/leds/

Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:55:14 +01:00
Stephen Hemminger
784544739a netfilter: iptables: lock free counters
The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:

1) it prevents changes to table state (xt_replace) while table is in use.
   This is now handled by doing rcu on the xt_table. When table is
   replaced, the new table(s) are put in and the old one table(s) are freed
   after RCU period.

2) it provides synchronization when accesing the counter values.
   This is now handled by swapping in new table_info entries for each cpu
   then summing the old values, and putting the result back onto one
   cpu.  On a busy system it may cause sampling to occur at different
   times on each cpu, but no packet/byte counts are lost in the process.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)

Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:35:32 +01:00
Pablo Neira Ayuso
be0c22a46c netlink: add NETLINK_BROADCAST_ERROR socket option
This patch adds NETLINK_BROADCAST_ERROR which is a netlink
socket option that the listener can set to make netlink_broadcast()
return errors in the delivery to the caller. This option is useful
if the caller of netlink_broadcast() do something with the result
of the message delivery, like in ctnetlink where it drops a network
packet if the event delivery failed, this is used to enable reliable
logging and state-synchronization. If this socket option is not set,
netlink_broadcast() only reports ESRCH errors and silently ignore
ENOBUFS errors, which is what most netlink_broadcast() callers
should do.

This socket option is based on a suggestion from Patrick McHardy.
Patrick McHardy can exchange this patch for a beer from me ;).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20 01:01:08 -08:00
Santwona Behera
59089d8d16 ethtool: Add RX pkt classification interface
Signed-off-by: Santwona Behera <santwona.behera@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20 00:58:13 -08:00
Tejun Heo
fbf59bc9d7 percpu: implement new dynamic percpu allocator
Impact: new scalable dynamic percpu allocator which allows dynamic
        percpu areas to be accessed the same way as static ones

Implement scalable dynamic percpu allocator which can be used for both
static and dynamic percpu areas.  This will allow static and dynamic
areas to share faster direct access methods.  This feature is optional
and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
arch.  Please read comment on top of mm/percpu.c for details.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
2009-02-20 16:29:08 +09:00
Tejun Heo
8fc4898500 vmalloc: add un/map_kernel_range_noflush()
Impact: two more public map/unmap functions

Implement map_kernel_range_noflush() and unmap_kernel_range_noflush().
These functions respectively map and unmap address range in kernel VM
area but doesn't do any vcache or tlb flushing.  These will be used by
new percpu allocator.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
2009-02-20 16:29:08 +09:00
Tejun Heo
f0aa661790 vmalloc: implement vm_area_register_early()
Impact: allow multiple early vm areas

There are places where kernel VM area needs to be allocated before
vmalloc is initialized.  This is done by allocating static vm_struct,
initializing several fields and linking it to vmlist and later vmalloc
initialization picking up these from vmlist.  This is currently done
manually and if there's more than one such areas, there's no defined
way to arbitrate who gets which address.

This patch implements vm_area_register_early(), which takes vm_area
struct with flags and size initialized, assigns address to it and puts
it on the vmlist.  This way, multiple early vm areas can determine
which addresses they should use.  The only current user - alpha mm
init - is converted to use it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Tejun Heo
f2a8205c4e percpu: kill percpu_alloc() and friends
Impact: kill unused functions

percpu_alloc() and its friends never saw much action.  It was supposed
to replace the cpu-mask unaware __alloc_percpu() but it never happened
and in fact __percpu_alloc_mask() itself never really grew proper
up/down handling interface either (no exported interface for
populate/depopulate).

percpu allocation is about to go through major reimplementation and
there's no reason to carry this unused interface around.  Replace it
with __alloc_percpu() and free_percpu().

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Rusty Russell
313e458f81 alloc_percpu: add align argument to __alloc_percpu.
This prepares for a real __alloc_percpu, by adding an alignment argument.
Only one place uses __alloc_percpu directly, and that's for a string.

tj: af_inet also uses __alloc_percpu(), update it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
2009-02-20 16:29:08 +09:00
Rusty Russell
b36128c830 alloc_percpu: change percpu_ptr to per_cpu_ptr
Impact: cleanup

There are two allocated per-cpu accessor macros with almost identical
spelling.  The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.

tj: kill percpu_ptr() and update UP too

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Ingo Molnar
4cd0332db7 Merge branch 'mainline/function-graph' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/function-graph-tracer 2009-02-19 12:13:33 +01:00
Ingo Molnar
40999096e8 Merge branches 'tracing/blktrace', 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-19 10:20:17 +01:00
Ingo Molnar
72c26c9a26 Merge branch 'linus' into tracing/blktrace
Conflicts:
	block/blktrace.c

Semantic merge:
	kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 09:00:35 +01:00
Jarek Poplawski
e4dd61882e vlan: Update skb->mac_header in __vlan_put_tag().
After moving mac addresses in __vlan_put_tag() skb->mac_header needs
to be updated.

Reported-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-18 23:31:11 -08:00
Linus Torvalds
ba95fd47d1 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
  block: fix booting from partitioned md array
  block: revert part of 18ce3751cc
  cciss: PCI power management reset for kexec
  paride/pg.c: xs(): &&/|| confusion
  fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free
  block: fix bad definition of BIO_RW_SYNC
  bsg: Fix sense buffer bug in SG_IO
2009-02-18 18:33:04 -08:00
Bernhard Walle
97bef7dd05 Bernhard has moved
Since I don't work for SUSE any more and the bwalle@suse.de address is
invalid, correct it in the copyright headers and documentation.

Signed-off-by: Bernhard Walle <bernhard.walle@gmx.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:56 -08:00
Adam Lackorzynski
ffa7525c13 jsm: additional device support
I have a Digi Neo 8 PCI card (114f:00b1) Serial controller: Digi
International Digi Neo 8 (rev 05)

that works with the jsm driver after using the following patch.

Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: Scott H Kilau <Scott_Kilau@digi.com>
Cc: Wendy Xiong <wendyx@us.ibm.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
cc2559bccc mm: fix memmap init for handling memory hole
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.

To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
f2dbcfa738 mm: clean up for early_pfn_to_nid()
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:

	BUG_ON(page_zone(start_page) != page_zone(end_page));

Once I knew this is what was happening, I added some annotations:

	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
		printk(KERN_ERR "move_freepages: Bogus zones: "
		       "start_page[%p] end_page[%p] zone[%p]\n",
		       start_page, end_page, zone);
		printk(KERN_ERR "move_freepages: "
		       "start_zone[%p] end_zone[%p]\n",
		       page_zone(start_page), page_zone(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
		       page_to_pfn(start_page), page_to_pfn(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_nid[%d] end_nid[%d]\n",
		       page_to_nid(start_page), page_to_nid(end_page));
 ...

And here's what I got:

	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
	move_freepages: start_nid[1] end_nid[0]

My memory layout on this box is:

[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff5d
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081fe50
[    0.000000]     1: 0x0081fed1 -> 0x0081fed8
[    0.000000]     1: 0x0081feda -> 0x0081fedb
[    0.000000]     1: 0x0081fedd -> 0x0081fee5
[    0.000000]     1: 0x0081fee7 -> 0x0081ff51
[    0.000000]     1: 0x0081ff59 -> 0x0081ff5d

So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.

This patch:

Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
 I think it makes fix-for-memmap-init not easy.

This patch moves all declaration to include/linux/mm.h

After this,
  if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use static definition in include/linux/mm.h
  else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use generic definition in mm/page_alloc.c
  else
     -> per-arch back end function will be called.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Dan Williams
287d859222 atmel-mci: fix initialization of dma slave data
The conversion of atmel-mci to dma_request_channel missed the
initialization of the channel dma_slave information.  The filter_fn passed
to dma_request_channel is responsible for initializing the channel's
private data.  This implementation has the additional benefit of enabling
a generic client-channel data passing mechanism.

Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Nick Piggin
1cf6e7d83b mm: task dirty accounting fix
YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for
cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty).

Additionally, there is some inconsistency about when task_dirty_inc is
called.  It is used for dirty balancing, however it even gets called for
__set_page_dirty_no_writeback.

So rather than increment it in a set_page_dirty wrapper, move it down to
exactly where the dirty page accounting stats are incremented.

Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:54 -08:00
Davide Libenzi
610d18f412 timerfd: add flags check
As requested by Michael, add a missing check for valid flags in
timerfd_settime(), and make it return EINVAL in case some extra bits are
set.

Michael said:
If this is to be any use to userland apps that want to check flag
support (perhaps it is too late already), then the sooner we get it
into the kernel the better: 2.6.29 would be good; earlier stables as
well would be even better.

[akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>		[2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Eric Biederman
8f19d47293 seq_file: properly cope with pread
Currently seq_read assumes that the offset passed to it is always the
offset it passed to user space.  In the case pread this assumption is
broken and we do the wrong thing when presented with pread.

To solve this I introduce an offset cache inside of struct seq_file so we
know where our logical file position is.  Then in seq_read if we try to
read from another offset we reset our data structures and attempt to go to
the offset user space wanted.

[akpm@linux-foundation.org: restore FMODE_PWRITE]
[pjt@google.com: seq_open needs its fmode opened up to take advantage of this]
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Turner <pjt@google.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Paul Turner
55ec82176e vfs: separate FMODE_PREAD/FMODE_PWRITE into separate flags
Separate FMODE_PREAD and FMODE_PWRITE into separate flags to reflect the
reality that the read and write paths may have independent restrictions.

A git grep verifies that these flags are always cleared together so this
new behavior will only apply to interfaces that change to clear flags
individually.

This is required for "seq_file: properly cope with pread", a post-2.6.25
regression fix.

[akpm@linux-foundation.org: add comment]
Signed-off-by: Paul Turner <pjt@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc:  Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Benjamin Herrenschmidt
c296861291 vmalloc: add __get_vm_area_caller()
We have get_vm_area_caller() and __get_vm_area() but not
__get_vm_area_caller()

On powerpc, I use __get_vm_area() to separate the ranges of addresses
given to vmalloc vs.  ioremap (various good reasons for that) so in order
to be able to implement the new caller tracking in /proc/vmallocinfo, I
need a "_caller" variant of it.

(akpm: needed for ongoing powerpc development, so merge it early)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Steven Rostedt
712406a6bf tracing/function-graph-tracer: make arch generic push pop functions
There is nothing really arch specific of the push and pop functions
used by the function graph tracer. This patch moves them to generic
code.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-18 13:43:04 -05:00
Ingo Molnar
74019224ac timers: add mod_timer_pending()
Impact: new timer API

Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:

introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.

(regular mod_timer() re-activates non-pending timers.)

This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.

Also while at it:

- optimize the regular mod_timer() path some more, the
  timer-stat and a debug check was needlessly duplicated
  in __mod_timer().

- make the exports come straight after the function, as
  most other exports in timer.c already did.

- eliminate __mod_timer() as an external API, change the
  users to mod_timer().

The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.

Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-18 19:26:33 +01:00
Stephen Hemminger
4a2f965ca5 netfilter: x_tables: change elements in x_tables
Change to proper type on private pointer rather than anonymous void.
Keep active elements on same cache line.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-18 16:29:44 +01:00
Eric Leblond
5ca431f9ae netfilter: nfnetlink_log: fix per-rule qthreshold override
In NFLOG the per-rule qthreshold should overrides per-instance only
it is set. With current code, the per-rule qthreshold is 1 if not set
and it overrides the per-instance qthreshold.

This patch modifies the default xt_NFLOG threshold from 1 to
0. Thus a value of 0 means there is no per-rule setting and the instance
parameter has to apply.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-18 15:29:23 +01:00
Patrick McHardy
4667ba1511 Merge branch 'master' of /repos/git/net-2.6 2009-02-18 15:16:18 +01:00
Jens Axboe
93dbb39350 block: fix bad definition of BIO_RW_SYNC
We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO
and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before
213d9417fe.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-18 10:32:00 +01:00
Herbert Xu
3f683d6175 crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention
This is based on a report and patch by Geert Uytterhoeven.

The functions crypto_alloc_tfm and create_create_tfm return a
pointer that needs to be adjusted by the caller when successful
and otherwise an error value.  This means that the caller has
to check for the error and only perform the adjustment if the
pointer returned is valid.

Since all callers want to make the adjustment and we know how
to adjust it ourselves, it's much easier to just return adjusted
pointer directly.

The only caveat is that we have to return a void * instead of
struct crypto_tfm *.  However, this isn't that bad because both
of these functions are for internal use only (by types code like
shash.c, not even algorithms code).

This patch also moves crypto_alloc_tfm into crypto/internal.h
(crypto_create_tfm is already there) to reflect this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-18 16:56:59 +08:00
David S. Miller
92a0acce18 net: Kill skb_truesize_check(), it only catches false-positives.
A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-17 21:24:05 -08:00
Benjamin Herrenschmidt
82a0a1cc8f Merge commit 'origin/master' into next
Manual merge of:
	arch/powerpc/include/asm/pgtable-ppc32.h
2009-02-18 13:19:25 +11:00
Hannes Eder
886a63e865 drivers/net/hamradio: fix sparse warnings: fix signedness
Fix this sparse warnings:
  drivers/net/hamradio/hdlcdrv.c:274:34: warning: incorrect type in argument 2 (different signedness)
  drivers/net/hamradio/hdlcdrv.c:279:47: warning: incorrect type in argument 2 (different signedness)
  drivers/net/hamradio/hdlcdrv.c:288:39: warning: incorrect type in argument 2 (different signedness)
  drivers/net/hamradio/hdlcdrv.c:300:47: warning: incorrect type in argument 2 (different signedness)

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-17 17:37:40 -08:00
Zlatko Calusic
5955c7a2cf Add support for VT6415 PCIE PATA IDE Host Controller
Signed-off-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-17 16:56:31 -08:00
Linus Torvalds
35010334aa Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vm86: fix preemption bug
  x86, olpc: fix model detection without OFW
  x86, hpet: fix for LS21 + HPET = boot hang
  x86: CPA avoid repeated lazy mmu flush
  x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
  x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
  x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
  x86/cpa: make sure cpa is safe to call in lazy mmu mode
  x86, ptrace, mm: fix double-free on race
2009-02-17 14:27:39 -08:00
Linus Torvalds
3512a79dbc Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling
  ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages
  ext4: Initialize preallocation list_head's properly
  ext4: Fix lockdep warning
  ext4: Fix to read empty directory blocks correctly in 64k
  jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()
  Revert "ext4: wait on all pending commits in ext4_sync_fs()"
  jbd2: Fix return value of jbd2_journal_start_commit()
2009-02-17 14:05:05 -08:00
Steven Rostedt
b6887d7916 ftrace: rename _hook to _probe
Impact: clean up

Ingo Molnar did not like the _hook naming convention used by the
select function tracer. Luis Claudio R. Goncalves suggested using
the "_probe" extension. This patch implements the change of
calling the functions and variables "_hook" and replacing them
with "_probe".

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-17 12:32:04 -05:00
Ingo Molnar
494df596f9 Merge branches 'x86/acpi', 'x86/apic', 'x86/cpudetect', 'x86/headers', 'x86/paravirt', 'x86/urgent' and 'x86/xen'; commit 'v2.6.29-rc5' into x86/core 2009-02-17 12:07:00 +01:00
Ingo Molnar
97d0bb8dcd ftrace: fix !CONFIG_FTRACE [un_]register_ftrace_command() prototypes
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 11:47:39 +01:00
Ingo Molnar
f492d3f838 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-17 11:31:01 +01:00
Steven Rostedt
809dcf29ce ftrace: add pretty print to selected fuction traces
This patch adds a call back for the tracers that have hooks to
selected functions. This allows the tracer to show better output
in the set_ftrace_filter file.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 23:06:01 -05:00
Steven Rostedt
988ae9d6b2 ring-buffer: add tracing_is_on to test if ring buffer is enabled
This patch adds the tracing_is_on() interface to tell if the ring
buffer is turned on or not.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 22:50:01 -05:00
Steven Rostedt
59df055f19 ftrace: trace different functions with a different tracer
Impact: new feature

Currently, the function tracer only gives you an ability to hook
a tracer to all functions being traced. The dynamic function trace
allows you to pick and choose which of those functions will be
traced, but all functions being traced will call all tracers that
registered with the function tracer.

This patch adds a new feature that allows a tracer to hook to specific
functions, even when all functions are being traced. It allows for
different functions to call different tracer hooks.

The way this is accomplished is by a special function that will hook
to the function tracer and will set up a hash table knowing which
tracer hook to call with which function. This is the most general
and easiest method to accomplish this. Later, an arch may choose
to supply their own method in changing the mcount call of a function
to call a different tracer. But that will be an exercise for the
future.

To register a function:

 struct ftrace_hook_ops {
	void			(*func)(unsigned long ip,
					unsigned long parent_ip,
					void **data);
	int			(*callback)(unsigned long ip, void **data);
	void			(*free)(void **data);
 };

 int register_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
				  void *data);

glob is a simple glob to search for the functions to hook.
ops is a pointer to the operations (listed below)
data is the default data to be passed to the hook functions when traced

ops:
 func is the hook function to call when the functions are traced
 callback is a callback function that is called when setting up the hash.
   That is, if the tracer needs to do something special for each
   function, that is being traced, and wants to give each function
   its own data. The address of the entry data is passed to this
   callback, so that the callback may wish to update the entry to
   whatever it would like.
 free is a callback for when the entry is freed. In case the tracer
   allocated any data, it is give the chance to free it.

To unregister we have three functions:

  void
  unregister_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
				void *data)

This will unregister all hooks that match glob, point to ops, and
have its data matching data. (note, if glob is NULL, blank or '*',
all functions will be tested).

  void
  unregister_ftrace_function_hook_func(char *glob,
				 struct ftrace_hook_ops *ops)

This will unregister all functions matching glob that has an entry
pointing to ops.

  void unregister_ftrace_function_hook_all(char *glob)

This simply unregisters all funcs.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 22:44:09 -05:00
Steven Rostedt
f6180773d9 ftrace: add command interface for function selection
Allow for other tracers to add their own commands for function
selection. This interface gives a trace the ability to name a
command for function selection. Right now it is pretty limited
in what it offers, but this is a building step for more features.

The :mod: command is converted to this interface and also serves
as a template for other implementations.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 17:06:02 -05:00
Vlad Yasevich
4458f04c02 sctp: Clean up sctp checksumming code
The sctp crc32c checksum is always generated in little endian.
So, we clean up the code to treat it as little endian and remove
all the __force casts.

Suggested by Herbert Xu.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-16 00:03:10 -08:00
Patrick Ohly
ac45f602ee net: infrastructure for hardware time stamping
The additional per-packet information (16 bytes for time stamps, 1
byte for flags) is stored for all packets in the skb_shared_info
struct. This implementation detail is hidden from users of that
information via skb_* accessor functions. A separate struct resp.
union is used for the additional information so that it can be
stored/copied easily outside of skb_shared_info.

Compared to previous implementations (reusing the tstamp field
depending on the context, optional additional structures) this
is the simplest solution. It does not extend sk_buff itself.

TX time stamping is implemented in software if the device driver
doesn't support hardware time stamping.

The new semantic for hardware/software time stamping around
ndo_start_xmit() is based on two assumptions about existing
network device drivers which don't support hardware time
stamping and know nothing about it:
 - they leave the new skb_shared_tx unmodified
 - the keep the connection to the originating socket in skb->sk
   alive, i.e., don't call skb_orphan()

Given that skb_shared_tx is new, the first assumption is safe.
The second is only true for some drivers. As a result, software
TX time stamping currently works with the bnx2 driver, but not
with the unmodified igb driver (the two drivers this patch series
was tested with).

Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15 22:43:34 -08:00
Patrick Ohly
cb9eff0978 net: new user space API for time stamping of incoming and outgoing packets
User space can request hardware and/or software time stamping.
Reporting of the result(s) via a new control message is enabled
separately for each field in the message because some of the
fields may require additional computation and thus cause overhead.
User space can tell the different kinds of time stamps apart
and choose what suits its needs.

When a TX timestamp operation is requested, the TX skb will be cloned
and the clone will be time stamped (in hardware or software) and added
to the socket error queue of the skb, if the skb has a socket
associated with it.

The actual TX timestamp will reach userspace as a RX timestamp on the
cloned packet. If timestamping is requested and no timestamping is
done in the device driver (potentially this may use hardware
timestamping), it will be done in software after the device's
start_hard_xmit routine.

Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15 22:43:33 -08:00
Patrick Ohly
a75244c3d5 timecompare: generic infrastructure to map between two time bases
Mapping from a struct timecounter to a time returned by functions like
ktime_get_real() is implemented. This is sufficient to use this code
in a network device driver which wants to support hardware time
stamping and transformation of hardware time stamps to system time.

The interface could have been made more versatile by not depending on
a time counter, but this wasn't done to avoid writing glue code
elsewhere.

The method implemented here is the one used and analyzed under the name
"assisted PTP" in the LCI PTP paper:
http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf

Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15 22:43:32 -08:00
Patrick Ohly
a038a353c3 clocksource: allow usage independent of timekeeping.c
So far struct clocksource acted as the interface between time/timekeeping.c
and hardware. This patch generalizes the concept so that a similar
interface can also be used in other contexts. For that it introduces
new structures and related functions *without* touching the existing
struct clocksource.

The reasons for adding these new structures to clocksource.[ch] are
* the APIs are clearly related
* struct clocksource could be cleaned up to use the new structs
* avoids proliferation of files with similar names (timesource.h?
  timecounter.h?)

As outlined in the discussion with John Stultz, this patch adds
* struct cyclecounter: stateless API to hardware which counts clock cycles
* struct timecounter: stateful utility code built on a cyclecounter which
  provides a nanosecond counter
* only the function to read the nanosecond counter; deltas are used internally
  and not exposed to users of timecounter

The code does no locking of the shared state. It must be called at least
as often as the cycle counter wraps around to detect these wrap arounds.
Both is the responsibility of the timecounter user.

Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15 22:43:31 -08:00
Yinghai Lu
970ec1a821 [IA64] fix __apci_unmap_table
Impact: fix build error

to fix:

  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here
  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 00:43:24 +01:00
Ingo Molnar
5274f8354d Merge branch 'sched/urgent'; commit 'v2.6.29-rc5' into sched/core 2009-02-15 21:15:16 +01:00
Ingo Molnar
9abd603048 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core
Conflicts:
	kernel/trace/trace_mmiotrace.c
2009-02-15 20:08:55 +01:00
Ingo Molnar
b69bc39674 Merge commit 'v2.6.29-rc5' into x86/apic 2009-02-15 09:00:18 +01:00
David S. Miller
5e30589521 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn.c
	drivers/net/wireless/iwlwifi/iwl3945-base.c
2009-02-14 23:12:00 -08:00
David S. Miller
ac178ef0ae Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-02-14 23:06:44 -08:00
Harvey Harrison
f3a7c66b5c net: replace __constant_{endian} uses in net headers
Base versions handle constant folding now.  For headers exposed to
userspace, we must only expose the __ prefixed versions.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-14 22:58:35 -08:00
Harvey Harrison
35c26c2cf6 rndis: remove private wrapper of __constant_cpu_to_le32
Use cpu_to_le32 directly as it handles constant folding now, replace direct
uses of __constant_cpu_to_{endian} as well.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-14 22:56:56 -08:00
Sheng Yang
ad8ba2cd44 KVM: Add kvm_arch_sync_events to sync with asynchronize events
kvm_arch_sync_events is introduced to quiet down all other events may happen
contemporary with VM destroy process, like IRQ handler and work struct for
assigned device.

For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so
the state of KVM here is legal and can provide a environment to quiet down other
events.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:36 +02:00
Avi Kivity
7a0eb1960e KVM: Avoid using CONFIG_ in userspace visible headers
Kconfig symbols are not available in userspace, and are not stripped by
headers-install.  Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:35 +02:00
Peter Zijlstra
9851673bc3 lockdep: move state bit definitions around
For convenience later.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:27:59 +01:00
Peter Zijlstra
a652d7081b lockdep: sanitize reclaim bit names
s/HELD_OVER/ENABLED/g

so that its similar to the hard and soft-irq names.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:27:52 +01:00
Peter Zijlstra
4fc95e867f lockdep: sanitize bit names
s/\(LOCKF\?_ENABLED_[^ ]*\)S\(_READ\)\?\>/\1\2/g

So that the USED_IN and ENABLED have the same names.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:27:51 +01:00
Nick Piggin
cf40bd16fd lockdep: annotate reclaim context (__GFP_NOFS)
Here is another version, with the incremental patch rolled up, and
added reclaim context annotation to kswapd, and allocation tracing
to slab allocators (which may only ever reach the page allocator
in rare cases, so it is good to put annotations here too).

Haven't tested this version as such, but it should be getting closer
to merge worthy ;)

--
After noticing some code in mm/filemap.c accidentally perform a __GFP_FS
allocation when it should not have been, I thought it might be a good idea to
try to catch this kind of thing with lockdep.

I coded up a little idea that seems to work. Unfortunately the system has to
actually be in __GFP_FS page reclaim, then take the lock, before it will mark
it. But at least that might still be some orders of magnitude more common
(and more debuggable) than an actual deadlock condition, so we have some
improvement I hope (the concept is no less complete than discovery of a lock's
interrupt contexts).

I guess we could even do the same thing with __GFP_IO (normal reclaim), and
even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
code paths, so let's start there and see how it goes.

It *seems* to work. I did a quick test.

=================================
[ INFO: inconsistent lock state ]
2.6.28-rc6-00007-ged31348-dirty #26
---------------------------------
inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
{in-reclaim-W} state was registered at:
  [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60
  [<ffffffff80268f71>] lock_acquire+0x91/0xc0
  [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310
  [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd]
  [<ffffffff8020903b>] _stext+0x3b/0x170
  [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
  [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
  [<ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 3929
hardirqs last  enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310
hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310
softirqs last  enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0
softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0

other info that might help us debug this:
1 lock held by modprobe/8526:
 #0:  (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]

stack backtrace:
Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged31348-dirty #26
Call Trace:
 [<ffffffff80265483>] print_usage_bug+0x193/0x1d0
 [<ffffffff80266530>] mark_lock+0xaf0/0xca0
 [<ffffffff80266735>] mark_held_locks+0x55/0xc0
 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
 [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60
 [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580
 [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310
 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
 [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd]
 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
 [<ffffffff8020903b>] _stext+0x3b/0x170
 [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10
 [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180
 [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190
 [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:27:49 +01:00
Johannes Berg
6f2b9b9a9d timer: implement lockdep deadlock detection
This modifies the timer code in a way to allow lockdep to detect
deadlocks resulting from a lock being taken in the timer function
as well as around the del_timer_sync() call.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2009-02-14 23:25:52 +01:00
Ingo Molnar
22796b1572 Merge branch 'core/header-fixes' into x86/headers
Conflicts:
	arch/x86/include/asm/setup.h
2009-02-13 21:05:03 +01:00
Johannes Berg
2a51931192 cfg80211/nl80211: scanning (and mac80211 update to use it)
This patch adds basic scan capability to cfg80211/nl80211 and
changes mac80211 to use it. The BSS list that cfg80211 maintains
is made driver-accessible with a private area in each BSS struct,
but mac80211 doesn't yet use it. That's another large project.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-13 13:45:49 -05:00
Linus Torvalds
b51ebdc40c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: Only register AC97 bus if it's not done already
  ALSA: hda - Add snd_hda_multi_out_dig_cleanup()
  ALSA: hda - Add missing terminator in slave dig-out array
  ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model
  ALSA: hda - Register (new) devices at reconfig
  ALSA: mtpav - Fix initial value for input hwport
  ALSA: hda - add id for Intel IbexPeak integrated HDMI codec
  ALSA: hda - compute checksum in HDMI audio infoframe
  ALSA: hda - enable HDMI audio pin out at module loading time
  ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present
  ASoC: Update SDP3430 machine driver for snd_soc_card
  ALSA: hda - Add quirk for Asus z37e (1043:8284)
  sound: Remove OSSlib stuff from linux/soundcard.h
  ASoC: WM8990: Fix kcontrol's private value use in put callback
  ASoC: TLV320AIC3X: Fix kcontrol's private value use in put callback
2009-02-13 08:19:11 -08:00
Ingo Molnar
8f8573ae9f Merge branches 'irq/genirq', 'irq/sparseirq' and 'irq/urgent' into irq/core 2009-02-13 11:57:18 +01:00
Ingo Molnar
5fb896a4e9 Merge branch 'tip/tracing/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-13 11:02:40 +01:00
Ingo Molnar
1c511f740f Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', 'tracing/urgent' and 'linus' into tracing/core 2009-02-13 10:25:18 +01:00
Ingo Molnar
7032e86967 Merge branches 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-02-13 09:47:32 +01:00
Ingo Molnar
a56cdcb662 Merge branches 'x86/acpi', 'x86/asm', 'x86/cpudetect', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/header-fixes', 'x86/headers' and 'x86/minor-fixes' into x86/core 2009-02-13 09:46:36 +01:00
Ingo Molnar
ab639f3593 Merge branch 'core/percpu' into x86/core 2009-02-13 09:45:09 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
Linus Torvalds
37bed90094 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
  wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface
  net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
  netxen: fix compile waring "label ‘set_32_bit_mask’ defined but not used" on IA64 platform
  bnx2: Update version to 1.9.2 and copyright.
  bnx2: Fix jumbo frames error handling.
  bnx2: Update 5709 firmware.
  bnx2: Update 5706/5708 firmware.
  3c505: do not set pcb->data.raw beyond its size
  Documentation/connector/cn_test.c: don't use gfp_any()
  net: don't use in_atomic() in gfp_any()
  IRDA: cnt is off by 1
  netxen: remove pcie workaround
  sun3: print when lance_open() fails
  qlge: bugfix: Add missing rx buf clean index on early exit.
  qlge: bugfix: Fix RX scaling values.
  qlge: bugfix: Fix TSO breakage.
  qlge: bugfix: Add missing dev_kfree_skb_any() call.
  qlge: bugfix: Add missing put_page() call.
  qlge: bugfix: Fix fatal error recovery hang.
  qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().
  ...
2009-02-12 17:47:15 -08:00
Steven Rostedt
2a7b8df04c sched: do not account for NMIs
Impact: avoid corruption in system time accounting

Martin Schwidefsky told me that there was an issue with NMIs and
system accounting. The problem is that the accounting code is
not reentrant, and if an NMI goes off after an interrupt it can
corrupt the accounting.

For now, the best we can do is to treat NMIs like SMIs and they
are not accounted for.

This patch changes nmi_enter to not call __irq_enter and to do
the preempt-count and tracing calls directly.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-12 14:16:46 -05:00
Steven Rostedt
5a5fb7dbe8 preempt-count: force hardirq-count to max of 10
To add a bit in the preempt_count to be set when in NMI context, we
found that some archs did not have enough bits to spare. This is
due to the hardirq_count being a mask that can hold NR_IRQS.

Some archs allow for over 16000 IRQs, and that would require a mask
of 14 bits. The sofitrq mask is 8 bits and the preempt disable mask
is also 8 bits.  The PREEMP_ACTIVE bit is bit 30, and bit 31 would
make the preempt_count (which is type int) a negative number.
A negative preempt_count is a sign of failure.

Add them up 14+8+8+1+1 you get 32 bits. No room for the NMI bit.

But the hardirq_count is to track the number of nested IRQs, not
the number of total IRQs.  This originally took the paranoid approach
of setting the max nesting to NR_IRQS. But when we have archs with
over 1000 IRQs, it is not practical to think they will ever all
nest on a single CPU. Not to mention that this would most definitely
cause a stack overflow.

This patch sets a max of 10 bits to be used for IRQ nesting.
I did a 'git grep HARDIRQ' to examine all users of HARDIRQ_BITS and
HARDIRQ_MASK, and found that making it a max of 10 would not hurt
anyone. I did find that the m68k expected it to be 8 bits, so
I allow for the archs to set the number to be less than 10.

I removed the setting of HARDIRQ_BITS from the archs that set it
to more than 10. This includes ALPHA, ia64 and avr32.

This will always allow room for the NMI bit, and if we need to allow
for NMI nesting, we have 4 bits to play with.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-12 11:19:05 -05:00
Ingo Molnar
871cafcc96 Merge branch 'linus' into core/softlockup 2009-02-12 13:08:57 +01:00
Paul Mundt
41480ae7a3 Merge branch 'sh/stable-updates' 2009-02-12 17:27:56 +09:00
Kentaro Takeda
f9ce1f1cda Add in_execve flag into task_struct.
This patch allows LSM modules to determine whether current process is in an
execve operation or not so that they can behave differently while an execve
operation is in progress.

This patch is needed by TOMOYO. Please see another patch titled "LSM adapter
functions." for backgrounds.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-12 15:15:03 +11:00
Johannes Weiner
1d93e52eb4 dmaengine: update kerneldoc
Some of the kerneldoc comments in the dmaengine header describe
already removed structure members.  Remove them.

Also add a short description for dma_device->device_is_tx_complete.

Signed-off-by: Johannes Weiner <jw@emlix.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-02-11 17:12:27 -07:00
Mimi Zohar
523979adfa integrity: audit update
Based on discussions on linux-audit, as per Steve Grubb's request
http://lkml.org/lkml/2009/2/6/269, the following changes were made:
- forced audit result to be either 0 or 1.
- made template names const
- Added new stand-alone message type: AUDIT_INTEGRITY_RULE

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-12 09:40:14 +11:00
Heiko Carstens
6c5979631b syscall define: fix uml compile bug
With the new system call defines we get this on uml:

arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'

Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask.  However sys_sigprocmask is missing
because of this.

To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through severel levels.
This was pointed out by Al Viro.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11 14:25:36 -08:00
Li Zefan
cfebe563bd cgroups: fix lockdep subclasses overflow
I enabled all cgroup subsystems when compiling kernel, and then:
 # mount -t cgroup -o net_cls xxx /mnt
 # mkdir /mnt/0

This showed up immediately:
 BUG: MAX_LOCKDEP_SUBCLASSES too low!
 turning off the locking correctness validator.

It's caused by the cgroup hierarchy lock:
	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
		struct cgroup_subsys *ss = subsys[i];
		if (ss->root == root)
			mutex_lock_nested(&ss->hierarchy_mutex, i);
	}

Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but
MAX_LOCKDEP_SUBCLASSES is 8.

This patch uses different lockdep keys for different subsystems.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11 14:25:36 -08:00
Linus Torvalds
94dba89533 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: fix TIMER_ABSTIME for process wide cpu timers
  timers: split process wide cpu clocks/timers, fix
  x86: clean up hpet timer reinit
  timers: split process wide cpu clocks/timers, remove spurious warning
  timers: split process wide cpu clocks/timers
  signal: re-add dead task accumulation stats.
  x86: fix hpet timer reinit for x86_64
  sched: fix nohz load balancer on cpu offline
2009-02-11 08:24:32 -08:00
Markus Metzger
9f339e7028 x86, ptrace, mm: fix double-free on race
Ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 15:44:20 +01:00
Peter Zijlstra
4da94d49b2 timers: fix TIMER_ABSTIME for process wide cpu timers
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:04:21 +01:00
Peter Zijlstra
3fccfd67df timers: split process wide cpu clocks/timers, fix
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:04:19 +01:00
Arjan van de Ven
ad0b0fd554 sched, latencytop: incorporate review feedback from Andrew Morton
Andrew had some suggestions for the latencytop file; this patch takes care
of most of these:

* Add documentation
* Turn account_scheduler_latency into an inline function
* Don't report negative values to userspace
* Make the file operations struct const
* Fix a few checkpatch.pl warnings

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 10:18:04 +01:00
Ingo Molnar
f437e8b53e Merge commit 'v2.6.29-rc4' into sched/core 2009-02-11 10:17:42 +01:00
Ingo Molnar
4040068dce Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-11 10:03:53 +01:00
Mimi Zohar
ed850a52af integrity: shmem zero fix
Based on comments from Mike Frysinger and Randy Dunlap:
(http://lkml.org/lkml/2009/2/9/262)
- moved ima.h include before CONFIG_SHMEM test to fix compiler error
  on Blackfin:
mm/shmem.c: In function 'shmem_zero_setup':
mm/shmem.c:2670: error: implicit declaration of function 'ima_shm_check'

- added 'struct linux_binprm' in ima.h to fix compiler warning on Blackfin:
In file included from mm/shmem.c:32:
include/linux/ima.h:25: warning: 'struct linux_binprm' declared inside
parameter list
include/linux/ima.h:25: warning: its scope is only this definition or
declaration, which is probably not what you want

- moved fs.h include within _LINUX_IMA_H definition

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-11 15:27:15 +11:00
Benjamin Herrenschmidt
edbc29d76d Merge commit 'kumar/next' into next 2009-02-11 13:37:44 +11:00
Chuck Ebbert
e672f7db76 pkt_sched: type should be __u32 in header
Using u32 in this header breaks the build of iptables.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10 17:18:17 -08:00
Stefan Richter
1db8508cf4 hugetlbfs: fix build failure with !CONFIG_HUGETLBFS
Fix regression due to 5a6fe12595,
"Do not account for the address space used by hugetlbfs using VM_ACCOUNT"
which added an argument to the function hugetlb_file_setup() but not to
the macro hugetlb_file_setup().

Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10 14:56:59 -08:00
Linus Torvalds
29ef01179d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
  bridge: Fix LRO crash with tun
  IPv6: fix to set device name when new IPv6 over IPv6 tunnel device is created.
  gianfar: Fix boot hangs while bringing up gianfar ethernet
  netfilter: xt_sctp: sctp chunk mapping doesn't work
  netfilter: ctnetlink: fix echo if not subscribed to any multicast group
  netfilter: ctnetlink: allow changing NAT sequence adjustment in creation
  netfilter: nf_conntrack_ipv6: don't track ICMPv6 negotiation message
  netfilter: fix tuple inversion for Node information request
  netxen: fix msi-x interrupt handling
  de2104x: force correct order when writing to rx ring
  tun: Fix unicast filter overflow
  drivers/isdn: introduce missing kfree
  drivers/atm: introduce missing kfree
  sunhme: Don't match PCI devices in SBUS probe.
  9p: fix endian issues [attempt 3]
  net_dma: call dmaengine_get only if NET_DMA enabled
  3c509: Fix resume from hibernation for PnP mode.
  sungem: Soft lockup in sungem on Netra AC200 when switching interface up
  RxRPC: Fix a potential NULL dereference
  r8169: Don't update statistics counters when interface is down
  ...
2009-02-10 11:48:11 -08:00
Mel Gorman
5a6fe12595 Do not account for the address space used by hugetlbfs using VM_ACCOUNT
When overcommit is disabled, the core VM accounts for pages used by anonymous
shared, private mappings and special mappings. It keeps track of VMAs that
should be accounted for with VM_ACCOUNT and VMAs that never had a reserve
with VM_NORESERVE.

Overcommit for hugetlbfs is much riskier than overcommit for base pages
due to contiguity requirements. It avoids overcommiting on both shared and
private mappings using reservation counters that are checked and updated
during mmap(). This ensures (within limits) that hugepages exist in the
future when faults occurs or it is too easy to applications to be SIGKILLed.

As hugetlbfs makes its own reservations of a different unit to the base page
size, VM_ACCOUNT should never be set. Even if the units were correct, we would
double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may
be set because an application can request no reserves be made for hugetlbfs
at the risk of getting killed later.

With commit fc8744adc8, VM_NORESERVE and
VM_ACCOUNT are getting unconditionally set for hugetlbfs-backed mappings. This
breaks the accounting for both the core VM and hugetlbfs, can trigger an
OOM storm when hugepage pools are too small lockups and corrupted counters
otherwise are used. This patch brings hugetlbfs more in line with how the
core VM treats VM_NORESERVE but prevents VM_ACCOUNT being set.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10 10:48:42 -08:00
Wenji Huang
c3706f005c tracing: fix typos in comments
Impact: clean up.

Fix typos in the comments.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10 12:32:35 -05:00
Jan Kara
7f5aa21508 jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()
If we race with commit code setting i_transaction to NULL, we could
possibly dereference it.  Proper locking requires the journal pointer
(to access journal->j_list_lock), which we don't have.  So we have to
change the prototype of the function so that filesystem passes us the
journal pointer.  Also add a more detailed comment about why the
function jbd2_journal_begin_ordered_truncate() does what it does and
how it should be used.

Thanks to Dan Carpenter <error27@gmail.com> for pointing to the
suspitious code.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Joel Becker <joel.becker@oracle.com>
CC: linux-ext4@vger.kernel.org
CC: ocfs2-devel@oss.oracle.com
CC: mfasheh@suse.de
CC: Dan Carpenter <error27@gmail.com>
2009-02-10 11:15:34 -05:00
Ingo Molnar
f9915bfef3 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-10 13:25:42 +01:00
David S. Miller
d54e6d8727 net: Kill skbuff macros from the stone ages.
This kills of HAVE_ALLOC_SKB and HAVE_ALIGNABLE_SKB.

Nothing in-tree uses them and nothing in-tree has used them
since 2.0.x times.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-09 23:45:29 -08:00
Tejun Heo
6cd61c0baa elf: add ELF_CORE_COPY_KERNEL_REGS()
ELF core dump is used for both user land core dump and kernel crash
dump.  Depending on architecture, register might need to be accessed
differently for userland and kernel.  Allow architectures to define
ELF_CORE_COPY_KERNEL_REGS() and use different operation for kernel
register dump.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:26 +01:00
Arnd Bergmann
43a990765a sound: Remove OSSlib stuff from linux/soundcard.h
Removed OSSlib stuff from linux/soundcard.h to fix the warnings for
'make headers_check'.

This patch breaks building against OSSlib with the kernel headers
instead of its own headers. It should still work with any
version of the library from the 2003 onwards which provide
their own headers for the latest interface.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-10 00:02:54 +01:00
Michael Buesch
c970314615 ssb: Add PMU support
This adds support for the SSB PMU.
A PMU is found on Low-Power devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-09 15:03:49 -05:00
Mike Rapoport
0c2bec9694 libertas: if_spi: add ability to call board specific setup/teardown methods
In certain cases it is required to perform board specific actions
before activating libertas G-SPI interface. These actions may include
power up of the chip, GPIOs setup, proper pin-strapping and SPI
controller config.
This patch adds ability to call board specific setup/teardown methods

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-09 15:03:48 -05:00
Luis R. Rodriguez
f130347c2d cfg80211: add get reg command
This lets userspace request to get the currently set
regulatory domain.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-09 15:03:45 -05:00
Linus Torvalds
d7c41b6165 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: scatterwalk - Avoid flush_dcache_page on slab pages
  crypto: shash - Fix tfm destruction
  crypto: api - Fix zeroing on free
  crypto: shash - Fix module refcount
  crypto: api - Fix algorithm test race that broke aead initialisation
2009-02-09 08:52:02 -08:00
Takashi Iwai
2a074f4a54 Merge branch 'topic/quirk-cleanup' into topic/hda 2009-02-09 17:19:21 +01:00
Kyle McMartin
a5ef7ca0e2 x86: spinlocks: define dummy __raw_spin_is_contended
Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since there is
nobody known to be waiting on it (arguably /fairly/ unfair locks).

Dummy it out to return 0 on other architectures.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-09 08:15:39 -08:00
Ingo Molnar
249d51b53a Merge commit 'v2.6.29-rc4' into core/percpu
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-09 14:58:11 +01:00
Yinghai Lu
7d97277b75 acpi/x86: introduce __apci_map_table, v4
to prevent wrongly overwriting fixmap that still want to use.

ACPI used to rely on low mappings being all linearly mapped and
grew a habit: it never really unmapped certain kinds of tables
after use.

This can cause problems - for example the hypothetical case
when some spurious access still references it.

v2: remove prev_map and prev_size in __apci_map_table
v3: let acpi_os_unmap_memory() call early_iounmap too, so remove extral calling to
early_acpi_os_unmap_memory
v4: fix typo in one acpi_get_table_with_size calling

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:35:07 +01:00
Yu Zhao
704126ad81 VT-d: handle Invalidation Queue Error to avoid system hang
When hardware detects any error with a descriptor from the invalidation
queue, it stops fetching new descriptors from the queue until software
clears the Invalidation Queue Error bit in the Fault Status register.
Following fix handles the IQE so the kernel won't be trapped in an
infinite loop.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-02-09 11:03:17 +00:00
Mandeep Singh Baines
17406b82d6 softlockup: remove timestamp checking from hung_task
Impact: saves sizeof(long) bytes per task_struct

By guaranteeing that sysctl_hung_task_timeout_secs have elapsed between
tasklist scans we can avoid using timestamps.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 11:03:49 +01:00
Frederic Weisbecker
1292211058 tracing/power: move the power trace headers to a dedicated file
Impact: cleanup

Move the power tracer headers to trace/power.h to keep ftrace.h and power bits
more easy to maintain as separated topics.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:51:38 +01:00
Ingo Molnar
44b0635481 Merge branch 'tip/tracing/core/devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Conflicts:
	kernel/trace/trace_hw_branches.c
2009-02-09 10:35:12 +01:00
Ingo Molnar
4ad476e11f Merge commit 'v2.6.29-rc4' into tracing/core 2009-02-09 10:32:48 +01:00
Brian Gerst
d3770449d3 percpu: make PER_CPU_BASE_SECTION overridable by arches
Impact: bug fix

IA-64 needs to put percpu data in the seperate section even on UP.
Fixes regression caused by "percpu: refactor percpu.h"

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:30:29 +01:00
Herbert Xu
aa4b9f533e gro: Optimise Ethernet header comparison
This patch optimises the Ethernet header comparison to use 2-byte
and 4-byte xors instead of memcmp.  In order to facilitate this,
the actual comparison is now carried out by the callers of the
shared dev_gro_receive function.

This has a significant impact when receiving 1500B packets through
10GbE.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-08 20:22:18 -08:00
Herbert Xu
4ae5544f9a gro: Remember number of held packets instead of counting every time
This patch prepares for the move of the same_flow checks out of
dev_gro_receive.  As such we need to remember the number of held
packets since doing a loop just to count them every time is silly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-08 20:22:17 -08:00
David S. Miller
d6301d3dd1 net: Increase default NET_SKB_PAD to 32.
Several devices need to insert some "pre headers" in front of the
main packet data when they transmit a packet.

Currently we allocate only 16 bytes of pad room and this ends up not
being enough for some types of hardware (NIU, usb-net, s390 qeth,
etc.)

So increase this to 32.

Note that drivers still need to check in their transmit routine
whether enough headroom exists, and if not use skb_realloc_headroom().
Tunneling, IPSEC, and other encapsulation methods can cause the
padding area to be used up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-08 19:24:13 -08:00
Ingo Molnar
140573d33b Merge branches 'sched/rt' and 'sched/urgent' into sched/core 2009-02-08 20:12:46 +01:00
Cornelia Huck
766ccb9ed4 async: Rename _special -> _domain for clarity.
Rename the async_*_special() functions to async_*_domain(), which
describes the purpose of these functions much better.
[Broke up long lines to silence checkpatch]

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-08 09:56:11 -08:00
Jaswinder Singh Rajput
0fb807c3e5 unconditionally include asm/types.h from linux/types.h
Reported-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-08 11:00:25 +05:30
Wenji Huang
57794a9d48 trace: trivial fixes in comment typos.
Impact: clean up

Fixed several typos in the comments.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:03:36 -05:00
Steven Rostedt
a81bd80a0b ring-buffer: use generic version of in_nmi
Impact: clean up

Now that a generic in_nmi is available, this patch removes the
special code in the ring_buffer and implements the in_nmi generic
version instead.

With this change, I was also able to rename the "arch_ftrace_nmi_enter"
back to "ftrace_nmi_enter" and remove the code from the ring buffer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:03:33 -05:00
Steven Rostedt
375b38b421 nmi: add generic nmi tracking state
This code adds an in_nmi() macro that uses the current tasks preempt count
to track when it is in NMI context. Other parts of the kernel can
use this to determine if the context is in NMI context or not.

This code was inspired by the -rt patch in_nmi version that was
written by Peter Zijlstra, who borrowed that code from
Mathieu Desnoyers.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:01:23 -05:00
Steven Rostedt
d8b891a2db ring-buffer: allow tracing_off to be used in core kernel code
tracing_off() is the fastest way to stop recording to the ring buffers.
This may be used in places like panic and die, just before the
ftrace_dump is called.

This patch adds the appropriate CPP conditionals to make it a stub
function when the ring buffer is not configured it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:01:18 -05:00
Steven Rostedt
78d904b46a ring-buffer: add NMI protection for spinlocks
Impact: prevent deadlock in NMI

The ring buffers are not yet totally lockless with writing to
the buffer. When a writer crosses a page, it grabs a per cpu spinlock
to protect against a reader. The spinlocks taken by a writer are not
to protect against other writers, since a writer can only write to
its own per cpu buffer. The spinlocks protect against readers that
can touch any cpu buffer. The writers are made to be reentrant
with the spinlocks disabling interrupts.

The problem arises when an NMI writes to the buffer, and that write
crosses a page boundary. If it grabs a spinlock, it can be racing
with another writer (since disabling interrupts does not protect
against NMIs) or with a reader on the same CPU. Luckily, most of the
users are not reentrant and protects against this issue. But if a
user of the ring buffer becomes reentrant (which is what the ring
buffers do allow), if the NMI also writes to the ring buffer then
we risk the chance of a deadlock.

This patch moves the ftrace_nmi_enter called by nmi_enter() to the
ring buffer code. It replaces the current ftrace_nmi_enter that is
used by arch specific code to arch_ftrace_nmi_enter and updates
the Kconfig to handle it.

When an NMI is called, it will set a per cpu variable in the ring buffer
code and will clear it when the NMI exits. If a write to the ring buffer
crosses page boundaries inside an NMI, a trylock is used on the spin
lock instead. If the spinlock fails to be acquired, then the entry
is discarded.

This bug appeared in the ftrace work in the RT tree, where event tracing
is reentrant. This workaround solved the deadlocks that appeared there.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:00:17 -05:00
Linus Torvalds
e83102cab0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI PM: make the PM core more careful with drivers using the new PM framework
  PCI PM: Read power state from device after trying to change it on resume
  PCI PM: Do not disable and enable bridges during suspend-resume
  PCI: PCIe portdrv: Simplify suspend and resume
  PCI PM: Fix saving of device state in pci_legacy_suspend
  PCI PM: Check if the state has been saved before trying to restore it
  PCI PM: Fix handling of devices without drivers
  PCI: return error on failure to read PCI ROMs
  PCI: properly clean up ASPM link state on device remove
2009-02-07 10:46:30 -08:00
Ingo Molnar
673f820591 Merge branch 'linus' into core/locking
Conflicts:
	fs/btrfs/locking.c
2009-02-07 18:31:54 +01:00
Rusty Russell
7f9a50a5b8 module: remove over-zealous check in __module_get()
Impact: fix spurious BUG_ON() triggered under load

module_refcount() isn't reliable outside stop_machine(), as demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
 can give false results, for example).

Almost noone should be using __module_get, but that's another issue.

Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-07 08:33:01 -08:00
David S. Miller
409f0a9014 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn.c
	drivers/net/wireless/iwlwifi/iwl3945-base.c
2009-02-07 02:52:44 -08:00
David S. Miller
b4bd07c20b net_dma: call dmaengine_get only if NET_DMA enabled
Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>

--------------------
The commit 649274d993 ("net_dma:
acquire/release dma channels on ifup/ifdown") added unconditional call
of dmaengine_get() to net_dma.  The API should be called only if
NET_DMA was enabled.
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
2009-02-06 22:06:43 -08:00
Jaswinder Singh Rajput
527bdfee18 make linux/types.h as assembly safe
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-06 20:47:58 +05:30
Ingo Molnar
7d8e23df69 timers: split process wide cpu clocks/timers, remove spurious warning
Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.

Oleg Nesterov observed:

  Can't fastpath_timer_check()->thread_group_cputimer() have the
  false warning too? Suppose we had the timer, then posix_cpu_timer_del()
  removes this timer, but task_cputime_zero(&sig->cputime_expires) still
  not true.

Remove the spurious debug warning.

Reported-by: Mike Galbraith <efault@gmx.de>
Explained-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-06 14:57:51 +01:00
Graf Yang
fe2918b098 net: fix some trailing whitespaces
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05 21:26:19 -08:00
Herbert Xu
33dccbb050 tun: Limit amount of queued packets per device
Unlike a normal socket path, the tuntap device send path does
not have any accounting.  This means that the user-space sender
may be able to pin down arbitrary amounts of kernel memory by
continuing to send data to an end-point that is congested.

Even when this isn't an issue because of limited queueing at
most end points, this can also be a problem because its only
response to congestion is packet loss.  That is, when those
local queues at the end-point fills up, the tuntap device will
start wasting system time because it will continue to send
data there which simply gets dropped straight away.

Of course one could argue that everybody should do congestion
control end-to-end, unfortunately there are people in this world
still hooked on UDP, and they don't appear to be going away
anywhere fast.  In fact, we've always helped them by performing
accounting in our UDP code, the sole purpose of which is to
provide congestion feedback other than through packet loss.

This patch attempts to apply the same bandaid to the tuntap device.
It creates a pseudo-socket object which is used to account our
packets just as a normal socket does for UDP.  Of course things
are a little complex because we're actually reinjecting traffic
back into the stack rather than out of the stack.

The stack complexities however should have been resolved by preceding
patches.  So this one can simply start using skb_set_owner_w.

For now the accounting is essentially disabled by default for
backwards compatibility.  In particular, we set the cap to INT_MAX.
This is so that existing applications don't get confused by the
sudden arrival EAGAIN errors.

In future we may wish (or be forced to) do this by default.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05 21:25:32 -08:00
James Morris
cb5629b10d Merge branch 'master' into next
Conflicts:
	fs/namei.c

Manually merged per:

diff --cc fs/namei.c
index 734f2b5,bbc15c2..0000000
--- a/fs/namei.c
+++ b/fs/namei.c
@@@ -860,9 -848,8 +849,10 @@@ static int __link_path_walk(const char
  		nd->flags |= LOOKUP_CONTINUE;
  		err = exec_permission_lite(inode);
  		if (err == -EAGAIN)
- 			err = vfs_permission(nd, MAY_EXEC);
+ 			err = inode_permission(nd->path.dentry->d_inode,
+ 					       MAY_EXEC);
 +		if (!err)
 +			err = ima_path_check(&nd->path, MAY_EXEC);
   		if (err)
  			break;

@@@ -1525,14 -1506,9 +1509,14 @@@ int may_open(struct path *path, int acc
  		flag &= ~O_TRUNC;
  	}

- 	error = vfs_permission(nd, acc_mode);
+ 	error = inode_permission(inode, acc_mode);
  	if (error)
  		return error;
 +
- 	error = ima_path_check(&nd->path,
++	error = ima_path_check(path,
 +			       acc_mode & (MAY_READ | MAY_WRITE | MAY_EXEC));
 +	if (error)
 +		return error;
  	/*
  	 * An append-only file must be opened in append mode for writing.
  	 */

Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 11:01:45 +11:00
Arnaldo Carvalho de Melo
0a9877514c ring_buffer: remove unused flags parameter
Impact: API change, cleanup

>From ring_buffer_{lock_reserve,unlock_commit}.

$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
  trace_vprintk              |  -14
  trace_graph_return         |  -14
  trace_graph_entry          |  -10
  trace_function             |   -8
  __ftrace_trace_stack       |   -8
  ftrace_trace_userstack     |   -8
  tracing_sched_switch_trace |   -8
  ftrace_trace_special       |  -12
  tracing_sched_wakeup_trace |   -8
 9 functions changed, 90 bytes removed, diff: -90

linux-2.6-tip/block/blktrace.c:
  __blk_add_trace |   -1
 1 function changed, 1 bytes removed, diff: -1

/tmp/vmlinux.after:
 10 functions changed, 91 bytes removed, diff: -91

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-06 01:01:40 +01:00
Mimi Zohar
1df9f0a731 Integrity: IMA file free imbalance
The number of calls to ima_path_check()/ima_file_free()
should be balanced.  An extra call to fput(), indicates
the file could have been accessed without first being
measured.

Although f_count is incremented/decremented in places other
than fget/fput, like fget_light/fput_light and get_file, the
current task must already hold a file refcnt.  The call to
__fput() is delayed until the refcnt becomes 0, resulting
in ima_file_free() flagging any changes.

- add hook to increment opencount for IPC shared memory(SYSV),
  shmat files, and /dev/zero
- moved NULL iint test in opencount_get()

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 09:05:33 +11:00
Mimi Zohar
3323eec921 integrity: IMA as an integrity service provider
IMA provides hardware (TPM) based measurement and attestation for
file measurements. As the Trusted Computing (TPM) model requires,
IMA measures all files before they are accessed in any way (on the
integrity_bprm_check, integrity_path_check and integrity_file_mmap
hooks), and commits the measurements to the TPM. Once added to the
TPM, measurements can not be removed.

In addition, IMA maintains a list of these file measurements, which
can be used to validate the aggregate value stored in the TPM.  The
TPM can sign these measurements, and thus the system can prove, to
itself and to a third party, the system's integrity in a way that
cannot be circumvented by malicious or compromised software.

- alloc ima_template_entry before calling ima_store_template()
- log ima_add_boot_aggregate() failure
- removed unused IMA_TEMPLATE_NAME_LEN
- replaced hard coded string length with #define name

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 09:05:30 +11:00
Mimi Zohar
6146f0d5e4 integrity: IMA hooks
This patch replaces the generic integrity hooks, for which IMA registered
itself, with IMA integrity hooks in the appropriate places directly
in the fs directory.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 09:05:30 +11:00
Ingo Molnar
9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Ingo Molnar
a146649bc1 smp, generic: introduce arch_disable_smp_support(), build fix
This function should be provided on UP too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:57 +01:00