Commit Graph

13017 Commits

Author SHA1 Message Date
Bartlomiej Zolnierkiewicz
d6e2955a6b ide: move IDE{FLOPPY,TAPE}_WAIT_CMD defines to <linux/ide.h>
While at it:

* IDE{FLOPPY,TAPE}_WAIT_CMD -> WAIT_{FLOPPY,TAPE}_CMD

* Use enum for WAIT_* defines.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:39 +02:00
Bartlomiej Zolnierkiewicz
de699ad595 ide: add ide_do_test_unit_ready() helper
* Add ide_do_test_unit_ready() helper and convert ide-{floppy,tape}.c
  to use it.

* Remove no longer used idetape_create_test_unit_ready_cmd().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:39 +02:00
Bartlomiej Zolnierkiewicz
0c8a6c7aea ide: add ide_do_start_stop() helper
* Add ide_do_start_stop() helper and convert ide-{floppy,tape}.c
  to use it.

* Remove no longer used idefloppy_create_start_stop_cmd()
  and idetape_create_load_unload_cmd().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:39 +02:00
Bartlomiej Zolnierkiewicz
0578042db3 ide: add ide_set_media_lock() helper
* Set IDE_AFLAG_NO_DOORLOCK in idetape_get_mode_sense_result(), check it
  in ide_tape_set_media_lock() and cleanup idetape_create_prevent_cmd().

* Set IDE_AFLAG_NO_DOORLOCK in ide_floppy_create_read_capacity_cmd() and
  check it instead of IDE_AFLAG_CLIK_DRIVE in ide_floppy_set_media_lock().

* Add ide_set_media_lock() helper and convert ide-{floppy,tape}.c to use it.

* Remove no longer used ide*_create_prevent_cmd()/ide*_set_media_lock().

* Update comment in <linux/ide.h> accordingly.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:38 +02:00
Bartlomiej Zolnierkiewicz
49cac39e71 ide-floppy: ->{srfp,wp} -> IDE_AFLAG_{SRFP,WP}
Add IDE_AFLAG_{SRFP,WP} drive->atapi_flags and use them
instead of ->{srfp,wp} struct ide_floppy_obj fields.

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:38 +02:00
Bartlomiej Zolnierkiewicz
2ac07d9206 ide: add ide_queue_pc_tail() helper
* Add ide_queue_pc_tail() and convert ide-{floppy,tape}.c to use it
  instead of ide*_queue_pc_tail().

* Remove no longer used ide*_queue_pc_tail().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:38 +02:00
Bartlomiej Zolnierkiewicz
7645c1514c ide: add ide_queue_pc_head() helper
* Move REQ_IDETAPE_* enums to <linux/ide.h>.

* Add ide_queue_pc_head() and convert ide-{floppy,tape}.c to use it
  instead of ide*_queue_pc_head().

* Remove no longer used ide*_queue_pc_head().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:37 +02:00
Bartlomiej Zolnierkiewicz
7bf7420a31 ide: add ide_init_pc() helper
* Add IDE_PC_BUFFER_SIZE define.

* Add ide_init_pc() and convert ide-{floppy,tape}.c to use it
  instead of ide*_init_pc().

* Remove no longer used IDE*_PC_BUFFER_SIZE and ide*_init_pc().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:37 +02:00
Bartlomiej Zolnierkiewicz
acaa0f5f67 ide: add ide_io_buffers() helper
* Make ->io_buffers method return number of bytes transferred.

* Use ide_end_request() instead of idefloppy_end_request()
  in ide_floppy_io_buffers() and then move the call out to
  ide_pc_intr().

* Add ide_io_buffers() helper and convert ide-{floppy,scsi}.c
  to use it instead of ide*_io_buffers().

There should be no functional changes caused by this patch.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:36 +02:00
Bartlomiej Zolnierkiewicz
51509eec34 ide: add ide_check_atapi_device() helper
* Add ide_check_atapi_device() to ide-atapi.c and convert
  ide-{floppy,tape}.c to use it instead of ide*_identify_device().

While at it:

* Add DRV_NAME defines to ide-{floppy,tape}.c.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:34 +02:00
Bartlomiej Zolnierkiewicz
05236ea6df ide: move ioctls handling to ide-ioctls.c
* Move ioctls handling to ide-ioctls.c
  (except HDIO_DRIVE_TASKFILE for now).

* Make ide_{cmd,task}() static.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:33 +02:00
Bartlomiej Zolnierkiewicz
aa7687738a ide: add ide_setting_ioctl() helper
* Add struct ide_ioctl_devset representing ioctl device setting.

* Add ide_setting_ioctl() helper for matching given ioctl
  and its parameters against table of ioctl device settings.

* Convert ide_setting_ioctl() and idedisk_ioctl() to use
  ide_setting_ioctl().

* Un-export ide_setting_mtx.

While at it:

* {get,set}_lba_addressing() -> {get,set}_addressing()

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:33 +02:00
Bartlomiej Zolnierkiewicz
9232c14bff ide: remove ->bus_state field from ide_hwif_t
It is always set to BUSSTATE_ON.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:33 +02:00
Bartlomiej Zolnierkiewicz
feb22b7f8e ide: add proper PCI PM support (v2)
* Keep pointer to ->init_chipset method also in
  struct ide_host and set it in ide_host_alloc_all().

* Add ide_pci_suspend() and ide_pci_resume() helpers
  (default ->suspend and ->resume implementations).

* ->init_chipset can no longer be marked __devinit.

* Add proper PCI PM support to IDE PCI host drivers
  (rz1000.c and tc86c001.c are skipped for now since
  they need to be converted from using ->init_hwif
  to use ->init_chipset instead).

v2:
* Cleanup CONFIG_PM #ifdef-s per akpm's suggestion.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:32 +02:00
Bartlomiej Zolnierkiewicz
a02227c977 ide: lba_capacity_is_ok() -> ata_id_is_lba_capacity_ok()
Rename lba_capacity_is_ok() to ata_id_is_lba_capacity_ok()
and move it to <linux/ata.h> (remove needless parens while at it).

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:31 +02:00
Bartlomiej Zolnierkiewicz
93734a2344 ide: ide_id_to_hd_driveid() -> ata_id_to_hd_driveid()
Rename ide_id_to_hd_driveid() to ata_id_to_hd_driveid()
and move it to <linux/ata.h>.

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:31 +02:00
Bartlomiej Zolnierkiewicz
ff2779b568 ide: ide_id_has_flush_cache_ext() -> ata_id_flush_ext_enabled()
* Add ata_id_flush_ext_enabled() inline helper to <linux/ata.h>.

* ide_id_has_flush_cache_ext() -> ata_id_flush_ext_enabled()

  The latter one also checks if the command is marked as
  supported in word 83 and validity of words 83 & 86.

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:31 +02:00
Bartlomiej Zolnierkiewicz
4b58f17d7c ide: ide_id_has_flush_cache() -> ata_id_flush_enabled()
* Add ata_id_flush_enabled() inline helper to <linux/ata.h>.

* ide_id_has_flush_cache() -> ata_id_flush_enabled()

  The latter one also checks if the command is marked as
  supported in word 83 and validity of words 83 & 86.

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:30 +02:00
Bartlomiej Zolnierkiewicz
1a4e4d4d2c ide: check only for CACHE FLUSH command support in ide_id_has_flush_cache()
All devices supporting CACHE FLUSH EXT command should also support
CACHE FLUSH command so it is sufficient to check only for CACHE FLUSH
in ide_id_has_flush_cache().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:30 +02:00
Bartlomiej Zolnierkiewicz
942dcd85bf ide: idedisk_supports_lba48() -> ata_id_lba48_enabled()
* Add ata_id_lba48_enabled() inline helper to <linux/ata.h>.

* idedisk_supports_lba48() -> ata_id_lba48_enabled()

  The latter one also checks validity of words 83 & 86.

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:30 +02:00
Bartlomiej Zolnierkiewicz
367d7e78dd ide: ide_dev_is_sata() -> ata_id_is_sata()
* Use optimized ATA version check from Sergei in ata_id_is_sata().

* ide_dev_is_sata() -> ata_id_is_sata()

Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:30 +02:00
Bartlomiej Zolnierkiewicz
5d5870f0a2 ide: ide_dev_has_iordy() -> ata_id_has_iordy()
* Remove (id[ATA_ID_FIELD_VALID] & 2) check from ide_dev_has_iordy()
  (it is for validity of words 64-70, IORDY is in word 49).

* ide_dev_has_iordy() -> ata_id_has_iordy()

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:29 +02:00
Bartlomiej Zolnierkiewicz
02d599a365 ide: remove ->supports_dsc_overlap field from ide_driver_t
* Use drive->media and drive->scsi to check if ->dsc_overlap
  can be set by HDIO_SET_NICE ioctl in generic_ide_ioctl().

* Remove unused ->supports_dsc_overlap field from ide_driver_t.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:29 +02:00
Bartlomiej Zolnierkiewicz
ebc6be5206 ide: remove read-only ->atapi_overlap field from ide_drive_t
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:29 +02:00
Bartlomiej Zolnierkiewicz
151a670186 ide: remove SECTOR_WORDS define
Just use SECTOR_SIZE instead.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:28 +02:00
Bartlomiej Zolnierkiewicz
8185d5aa93 ide: /proc/ide/hd*/settings rework
* Add struct ide_devset, S_* flags, *DEVSET() & ide*_devset_*() macros.

* Add 'const struct ide_devset **settings' to ide_driver_t.

* Use 'const struct ide_devset **settings' in ide_drive_t instead of
  'struct ide_settings_s *settings'.  Then convert core code and device
  drivers to use struct ide_devset and co.:

  - device settings are no longer allocated dynamically for each device
    but instead there is an unique struct ide_devset instance per setting

  - device driver keeps the pointer to the table of pointers to its
    settings in ide_driver_t.settings

  - generic settings are kept in ide_generic_setting[]

  - ide_proc_[un]register_driver(), ide_find_setting_by_name(),
    ide_{read,write}_setting() and proc_ide_{read,write}_settings()
    are updated accordingly

  - ide*_add_settings() are removed

* Remove no longer used __ide_add_setting(), ide_add_setting(),
  __ide_remove_setting() and auto_remove_settings().

* Remove no longer used TYPE_*, SETTING_*, ide_procset_t
  and ide_settings_t.

* ->keep_settings, ->using_dma, ->unmask, ->noflush, ->dsc_overlap,
  ->nice1, ->addressing, ->wcache and ->nowerr ide_drive_t fields
  can now be bitfield flags.

While at it:

* Rename ide_find_setting_by_name() to ide_find_setting().

* Rename write_wcache() to set_wcache().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:28 +02:00
Bartlomiej Zolnierkiewicz
263138a0ad ide: preparations for /proc/ide/hd*/settings rework
After rework settings will be no longer created dynamically
for each device so we need to make some fixups first.

* Use set_[ksettings,unmaskirq]() as a set function for
  ["keepsettings","unmaskirq"] setting.

* Allow writes to ["io_32bit","unmaskirq"] settings also when
  drive->no_[io_32bit,unmask] is set (this is checked later inside
  set_[io_32bit,unmaskirq]() anywyay and keeps consistency with
  the corresponding HDIO_SET_[32BIT,UNMASKINTR] ioctls).

* Use max possible multi sectors value (16) as an allowed max for
  "multcount" setting.  set_multcount() set function checks against
  device's max possbile value anyway and it makes the proc setting
  consistent with the corresponding HDIO_SET_MULTCOUNT ioctl.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:27 +02:00
Bartlomiej Zolnierkiewicz
3ceca727fe ide: include <linux/hdreg.h> only when needed
* Include <linux/ata.h> directly in <linux/ide.h>
  instead of through <linux/hdreg.h>.

* Include <linux/hdreg.h> only when needed.

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:27 +02:00
Bartlomiej Zolnierkiewicz
7e59ea21aa ide: check drive->present in ide_get_paired_drive()
* Change ide_get_paired_drive() to return NULL if peer device
  is not present and update all users accordingly.

While at it:

* ide_get_paired_drive() -> ide_get_pair_dev()

* Use ide_get_pair_dev() in cs5530.c, sc1200.c and via82cxxx.c.

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:26 +02:00
Bartlomiej Zolnierkiewicz
a2cdee5a9a ide: remove IDE_CHIPSET_* macros
They just obfuscate the code.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:24 +02:00
Bartlomiej Zolnierkiewicz
b163f46d5e ide: enhance ide_busy_sleep()
* Make ide_busy_sleep() take timeout value as a parameter
  and also allow use of AltStatus Register if requested with
  altstatus parameter.  Update existing users accordingly.

* Convert ide_driveid_update() and actual_try_to_identify()
  to use ide_busy_sleep().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:23 +02:00
Bartlomiej Zolnierkiewicz
3c619ffd48 ide: remove no longer needed ide_drive_t fields
Remove ->remap_0_to_1 and ->sect0 (they are always zero now).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:22 +02:00
Bartlomiej Zolnierkiewicz
3a7d24841a ide: use ATA_* defines instead of *_STAT and *_ERR ones
* ERR_STAT   -> ATA_ERR
* INDEX_STAT -> ATA_IDX
* ECC_STAT   -> ATA_CORR
* DRQ_STAT   -> ATA_DRQ
* SEEK_STAT  -> ATA_DSC
* WRERR_STAT -> ATA_DF
* READY_STAT -> ATA_DRDY
* BUSY_STAT  -> ATA_BUSY

* MARK_ERR   -> ATA_AMNF
* TRK0_ERR   -> ATA_TRK0NF
* ABRT_ERR   -> ATA_ABORTED
* MCR_ERR    -> ATA_MCR
* ID_ERR     -> ATA_IDNF
* MC_ERR     -> ATA_MC
* ECC_ERR    -> ATA_UNC
* ICRC_ERR   -> ATA_ICRC

* BBD_ERR    -> ATA_BBK

Also:

* ILI_ERR    -> ATAPI_ILI
* EOM_ERR    -> ATAPI_EOM
* LFS_ERR    -> ATAPI_LFS

* CD         -> ATAPI_COD
* IO         -> ATAPI_IO

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:21 +02:00
Bartlomiej Zolnierkiewicz
48fb2688aa ide: remove drive->driveid
* Factor out HDIO_[OBSOLETE,GET]_IDENTITY ioctls handling
  to ide_get_identity_ioctl().

* Use temporary buffer in ide_get_identity_ioctl() instead
  of accessing drive->id directly.

* Add ide_id_to_hd_driveid() inline to convert raw id into
  struct hd_driveid format (needed on big-endian).

* Use ide_id_to_hd_driveid() in ide_get_identity_ioctl(),
  cleanup ide_fix_driveid() and switch ide to use use raw id.

* Remove no longer needed drive->driveid.

  This leaves us with 3 users of struct hd_driveid in tree:
  - arch/um/drivers/ubd_kern.c
  - drivers/block/xsysace.c
  - drivers/usb/storage/isd200.c

While at it:

* Use ata_id_u{32,64}() and ata_id_has_{dma,lba,iordy}() macros.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:19 +02:00
Bartlomiej Zolnierkiewicz
4dde4492d8 ide: make drive->id an union (take 2)
Make drive->id an unnamed union so id can be accessed either by using
'u16 *id' or 'struct hd_driveid *driveid'.  Then convert all existing
drive->id users accordingly (using 'u16 *id' when possible).

This is an intermediate step to make ide 'struct hd_driveid'-free.

While at it:

- Add missing KERN_CONTs in it821x.c.

- Use ATA_ID_WORDS and ATA_ID_*_LEN defines.

- Remove unnecessary checks for drive->id.

- s/drive_table/table/ in ide_in_drive_list().

- Cleanup ide_config_drive_speed() a bit.

- s/drive1/dev1/ & s/drive0/dev0/ in ide_undecoded_slave().

v2:
Fix typo in drivers/ide/ppc/pmac.c. (From Stephen Rothwell)

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:19 +02:00
Linus Torvalds
b922df7383 Merge branch 'rcu-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'rcu-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
  rcu: RCU-based detection of stalled CPUs for Classic RCU, fix
  rcu: RCU-based detection of stalled CPUs for Classic RCU
  rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched()
  rcu: fix sparse shadowed variable warning
  doc/RCU: fix pseudocode in rcuref.txt
  rcuclassic: fix compiler warning
  rcu: use irq-safe locks
  rcuclassic: fix compilation NG
  rcu: fix locking cleanup fallout
  rcu: remove redundant ACCESS_ONCE definition from rcupreempt.c
  rcu: fix classic RCU locking cleanup lockdep problem
  rcu: trace fix possible mem-leak
  rcu: just rename call_rcu_bh instead of making it a macro
  rcu: remove list_for_each_rcu()
  rcu: fixes to include/linux/rcupreempt.h
  rcu: classic RCU locking and memory-barrier cleanups
  rcu: prevent console flood when one CPU sees another AWOL via RCU
  rcu, debug: detect stalled grace periods, cleanups
  rcu, debug: detect stalled grace periods
  rcu classic: new algorithm for callbacks-processing(v2)
  ...
2008-10-10 13:10:51 -07:00
Linus Torvalds
c54dcd8ec9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
  selinux: use default proc sid on symlinks
  file capabilities: uninline cap_safe_nice
  Update selinux info in MAINTAINERS and Kconfig help text
  SELinux: add gitignore file for mdp script
  SELinux: add boundary support and thread context assignment
  securityfs: do not depend on CONFIG_SECURITY
  selinux: add support for installing a dummy policy (v2)
  security: add/fix security kernel-doc
  selinux: Unify for- and while-loop style
  selinux: conditional expression type validation was off-by-one
  smack: limit privilege by label
  SELinux: Fix a potentially uninitialised variable in SELinux hooks
  SELinux: trivial, remove unneeded local variable
  SELinux: Trivial minor fixes that change C null character style
  make selinux_write_opts() static
2008-10-10 12:44:43 -07:00
Linus Torvalds
b11ce8a26d Merge branch 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
  sched debug: add name to sched_domain sysctl entries
  sched: sync wakeups vs avg_overlap
  sched: remove redundant code in cpu_cgroup_create()
  sched_rt.c: resch needed in rt_rq_enqueue() for the root rt_rq
  cpusets: scan_for_empty_cpusets(), cpuset doesn't seem to be so const
  sched: minor optimizations in wake_affine and select_task_rq_fair
  sched: maintain only task entities in cfs_rq->tasks list
  sched: fixup buddy selection
  sched: more sanity checks on the bandwidth settings
  sched: add some comments to the bandwidth code
  sched: fixlet for group load balance
  sched: rework wakeup preemption
  CFS scheduler: documentation about scheduling policies
  sched: clarify ifdef tangle
  sched: fix list traversal to use _rcu variant
  sched: turn off WAKEUP_OVERLAP
  sched: wakeup preempt when small overlap
  kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
  kernel/cpu.c: Move the CPU_DYING notifiers
  sched: fix __load_balance_iterator() for cfq with only one task
  ...
2008-10-10 12:42:31 -07:00
Tom Talpey
5675add36e RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls.
Add defensive timeouts to wait_for_completion() calls in RDMA
address resolution, and make them interruptible. Fix the timeout
units to milliseconds (formerly jiffies) and move to private header.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:13:31 -04:00
Tom Talpey
fe9053b30b RPC/RDMA: add data types and new FRMR memory registration enum.
Internal RPC/RDMA structure updates in preparation for FRMR support.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:24 -04:00
Yevgeny Petrilin
a3cdcbfa8f mlx4_core: Add QP range reservation support
To allow allocating an aligned range of consecutive QP numbers, add an
interface to reserve an aligned range of QP numbers and have the QP
allocation function always take a QP number.

This will be used for RSS support in the mlx4_en Ethernet driver and
also potentially by IPoIB RSS support.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-10-10 12:01:37 -07:00
Linus Torvalds
f6bccf6954 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: skcipher - Use RNG interface instead of get_random_bytes
  crypto: rng - RNG interface and implementation
  crypto: api - Add fips_enable flag
  crypto: skcipher - Move IV generators into their own modules
  crypto: cryptomgr - Test ciphers using ECB
  crypto: api - Use test infrastructure
  crypto: cryptomgr - Add test infrastructure
  crypto: tcrypt - Add alg_test interface
  crypto: tcrypt - Abort and only log if there is an error
  crypto: crc32c - Use Intel CRC32 instruction
  crypto: tcrypt - Avoid using contiguous pages
  crypto: api - Display larval objects properly
  crypto: api - Export crypto_alg_lookup instead of __crypto_alg_lookup
  crypto: Kconfig - Replace leading spaces with tabs
2008-10-10 11:20:42 -07:00
Linus Torvalds
13dd7f876d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: choose better identifiers
  dlm: remove bkl
  dlm: fix address compare
  dlm: fix locking of lockspace list in dlm_scand
  dlm: detect available userspace daemon
  dlm: allow multiple lockspace creates
2008-10-10 11:13:55 -07:00
Linus Torvalds
b0af205afb Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm: detect lost queue
  dm: publish dm_vcalloc
  dm: publish dm_table_unplug_all
  dm: publish dm_get_mapinfo
  dm: export struct dm_dev
  dm crypt: avoid unnecessary wait when splitting bio
  dm crypt: tidy ctx pending
  dm crypt: fix async inc_pending
  dm crypt: move dec_pending on error into write_io_submit
  dm crypt: remove inc_pending from write_io_submit
  dm crypt: tidy write loop pending
  dm crypt: tidy crypt alloc
  dm crypt: tidy inc pending
  dm exception store: use chunk_t for_areas
  dm exception store: introduce area_location function
  dm raid1: kcopyd should stop on error if errors handled
  dm mpath: remove is_active from struct dm_path
  dm mpath: use more error codes

Fixed up trivial conflict in drivers/md/dm-mpath.c manually.
2008-10-10 11:11:47 -07:00
Linus Torvalds
445e1ceda3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
  GFS2: Support for I/O barriers
  GFS2: Add UUID to GFS2 sb
  GFS2: high time to take some time over atime
  GFS2: The war on bloat
  GFS2: GFS2 will panic if you misspell any mount options
  GFS2: Direct IO write at end of file error
  GFS2: Use an IS_ERR test rather than a NULL test
  GFS2: Fix race relating to glock min-hold time
  GFS2: Fix & clean up GFS2 rename
  GFS2: rm on multiple nodes causes panic
  GFS2: Fix metafs mounts
  GFS2: Fix debugfs glock file iterator
2008-10-10 11:02:22 -07:00
Linus Torvalds
ef5bef357c Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
  [SCSI] zfcp: fix double dbf id usage
  [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
  [SCSI] zfcp: fix erp list usage without using locks
  [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
  [SCSI] zfcp: fix deadlock caused by shared work queue tasks
  [SCSI] zfcp: put threshold data in hba trace
  [SCSI] zfcp: Simplify zfcp data structures
  [SCSI] zfcp: Simplify get_adapter_by_busid
  [SCSI] zfcp: remove all typedefs and replace them with standards
  [SCSI] zfcp: attach and release SAN nameserver port on demand
  [SCSI] zfcp: remove unused references, declarations and flags
  [SCSI] zfcp: Update message with input from review
  [SCSI] zfcp: add queue_full sysfs attribute
  [SCSI] scsi_dh: suppress comparison warning
  [SCSI] scsi_dh: add Dell product information into rdac device handler
  [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
  [SCSI] qla2xxx: fix printk format warnings
  [SCSI] qla2xxx: Update version number to 8.02.01-k8.
  [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
  [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
  ...
2008-10-10 10:53:26 -07:00
Linus Torvalds
e26feff647 Merge branch 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block: (132 commits)
  doc/cdrom: Trvial documentation error, file not present
  block_dev: fix kernel-doc in new functions
  block: add some comments around the bio read-write flags
  block: mark bio_split_pool static
  block: Find bio sector offset given idx and offset
  block: gendisk integrity wrapper
  block: Switch blk_integrity_compare from bdev to gendisk
  block: Fix double put in blk_integrity_unregister
  block: Introduce integrity data ownership flag
  block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1
  bio.h: Remove unused conditional code
  block: remove end_{queued|dequeued}_request()
  block: change elevator to use __blk_end_request()
  gdrom: change to use __blk_end_request()
  memstick: change to use __blk_end_request()
  virtio_blk: change to use __blk_end_request()
  blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure
  block: add lld busy state exporting interface
  block: Fix blk_start_queueing() to not kick a stopped queue
  include blktrace_api.h in headers_install
  ...
2008-10-10 10:52:45 -07:00
Ingo Molnar
725c25819e Merge branches 'core/iommu', 'x86/amd-iommu' and 'x86/iommu' into x86-v28-for-linus-phase3-B
Conflicts:
	arch/x86/kernel/pci-gart_64.c
	include/asm-x86/dma-mapping.h
2008-10-10 19:47:12 +02:00
Linus Torvalds
82219fceeb Merge branch 'upstream-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  ata_piix: IDE Mode SATA patch for Intel Ibex Peak DeviceIDs
  libata-eh: clear UNIT ATTENTION after reset
  ata_piix: add Hercules EC-900 mini-notebook to ich_laptop short cable list
  libata: reorder ata_device to remove 8 bytes of padding on 64 bits
  [libata] pata_bf54x: Add proper PM operation
  pata_sil680: convert CONFIG_PPC_MERGE to CONFIG_PPC
  libata: Implement disk shock protection support
  [libata] Introduce ata_id_has_unload()
  PATA: RPC now selects HAVE_PATA_PLATFORM for pata platform driver
  ata_piix: drop merged SCR access and use slave_link instead
  libata: implement slave_link
  libata: misc updates to prepare for slave link
  libata: reimplement link iterator
  libata: make SCR access ops per-link
2008-10-10 07:46:45 -07:00
Mikulas Patocka
5416090426 dm: publish dm_vcalloc
Publish dm_vcalloc in include/linux/device-mapper.h because this function is
used by targets.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10 13:37:12 +01:00
Mikulas Patocka
ea0ec64094 dm: publish dm_table_unplug_all
Publish dm_table_unplug_all in include/linux/device-mapper.h because this
function is used by targets.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10 13:37:11 +01:00
Mikulas Patocka
89343da077 dm: publish dm_get_mapinfo
Publish dm_get_mapinfo in include/linux/device-mapper.h because this function
is used by targets.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10 13:37:10 +01:00
Mikulas Patocka
82b1519b34 dm: export struct dm_dev
Split struct dm_dev in two and publish the part that other targets need in
include/linux/device-mapper.h.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-10 13:37:09 +01:00
David Vrabel
99ee3a6d45 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb into for-upstream 2008-10-10 11:47:31 +01:00
Ingo Molnar
8eb95f28f6 Merge commit 'v2.6.27' into timers/hpet 2008-10-10 09:25:29 +02:00
James Morris
9ac684fc38 Merge branch 'next' into for-linus 2008-10-10 11:09:47 +11:00
Russell King
c97f68145e Merge branch 'for-rmk' of git://source.mvista.com/git/linux-davinci-2.6.git
Merge branch 'davinci' into devel
2008-10-09 21:33:05 +01:00
Herbert Xu
e1a8000228 gre: Add Transparent Ethernet Bridging
This patch adds support for Ethernet over GRE encapsulation.
This is exposed to user-space with a new link type of "gretap"
instead of "gre".  It will create an ARPHRD_ETHER device in
lieu of the usual ARPHRD_IPGRE.

Note that to preserver backwards compatibility all Transparent
Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel
if its key matches and there is no ARPHRD_ETHER device whose
key matches more closely.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 12:00:17 -07:00
Herbert Xu
c19e654ddb gre: Add netlink interface
This patch adds a netlink interface that will eventually displace
the existing ioctl interface.  It utilises the elegant rtnl_link_ops
mechanism.

This also means that user-space no longer needs to rely on the
tunnel interface being of type GRE to identify GRE tunnels.  The
identification can now occur using rtnl_link_ops.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 11:59:55 -07:00
venkatesh.pallipadi@intel.com
8083e4ad97 [CPUFREQ][5/6] cpufreq: Changes to get_cpu_idle_time_us(), used by ondemand governor
export get_cpu_idle_time_us() for it to be used in ondemand governor.
Last update time can be current time when the CPU is currently non-idle,
accounting for the busy time since last idle.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:44 -04:00
venkatesh.pallipadi@intel.com
bf0b90e357 [CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg()
Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software
cpufreq coordination where policy->cpu may not be same as the CPU on which we
want to getavg frequency.

A follow-on patch will use this parameter to getavg freq from all cpus
in policy->cpus.

Change since last patch. Fix the offline/online and suspend/resume
oops reported by Youquan Song <youquan.song@intel.com>

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:43 -04:00
Ingo Molnar
a5d8c3483a sched debug: add name to sched_domain sysctl entries
add /proc/sys/kernel/sched_domain/cpu0/domain0/name, to make
it easier to see which specific scheduler domain remained at
that entry.

Since we process the scheduler domain tree and
simplify it, it's not always immediately clear during debugging
which domain came from where.

depends on CONFIG_SCHED_DEBUG=y.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-09 17:13:06 +02:00
Jens Axboe
af56394240 block: add some comments around the bio read-write flags
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 09:01:10 +02:00
Denis ChengRq
6feef531f5 block: mark bio_split_pool static
Since all bio_split calls refer the same single bio_split_pool, the bio_split
function can use bio_split_pool directly instead of the mempool_t parameter;

then the mempool_t parameter can be removed from bio_split param list, and
bio_split_pool is only referred in fs/bio.c file, can be marked static.

Signed-off-by: Denis ChengRq <crquan@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:57:05 +02:00
Martin K. Petersen
ad3316bf4e block: Find bio sector offset given idx and offset
Helper function to find the sector offset in a bio given bvec index
and page offset.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:22 +02:00
Martin K. Petersen
b02739b01c block: gendisk integrity wrapper
This is a wrapper for accessing a gendisk's integrity bits.  It allows
the integrity support in MD to be compiled with BLK_DEV_INTEGRITY off.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:22 +02:00
Martin K. Petersen
ad7fce9314 block: Switch blk_integrity_compare from bdev to gendisk
The DM and MD integrity support now depends on being able to use
gendisks instead of block_devices when comparing integrity profiles.
Change function parameters accordingly.

Also update comparison logic so that two NULL profiles are a valid
configuration.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:21 +02:00
Martin K. Petersen
74aa8c2cc0 block: Introduce integrity data ownership flag
A filesystem might supply its own integrity metadata.  Introduce a
flag that indicates whether the filesystem or the block layer owns the
integrity buffer.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:21 +02:00
Jens Axboe
b04accc425 block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1
We need bdev_get_integrity() to support the pending md/dm patches.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:21 +02:00
Alberto Bertogli
8deaf72107 bio.h: Remove unused conditional code
The whole bio_integrity() definition is inside an #ifdef
CONFIG_BLK_DEV_INTEGRITY, there's no need for the conditional code.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:21 +02:00
Kiyoshi Ueda
d00e29fd99 block: remove end_{queued|dequeued}_request()
This patch removes end_queued_request() and end_dequeued_request(),
which are no longer used.

As a results, users of __end_request() became only end_request().
So the actual code in __end_request() is moved to end_request()
and __end_request() is removed.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:21 +02:00
Jens Axboe
0497b345e7 blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure
Define as 32, which is is what BDEVNAME_SIZE is/was as well. This keeps
the user interface the same and gets rid of the difference between
kernel and user api here.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:20 +02:00
Kiyoshi Ueda
ef9e3facdf block: add lld busy state exporting interface
This patch adds an new interface, blk_lld_busy(), to check lld's
busy state from the block layer.
blk_lld_busy() calls down into low-level drivers for the checking
if the drivers set q->lld_busy_fn() using blk_queue_lld_busy().

This resolves a performance problem on request stacking devices below.

Some drivers like scsi mid layer stop dispatching request when
they detect busy state on its low-level device like host/target/device.
It allows other requests to stay in the I/O scheduler's queue
for a chance of merging.

Request stacking drivers like request-based dm should follow
the same logic.
However, there is no generic interface for the stacked device
to check if the underlying device(s) are busy.
If the request stacking driver dispatches and submits requests to
the busy underlying device, the requests will stay in
the underlying device's queue without a chance of merging.
This causes performance problem on burst I/O load.

With this patch, busy state of the underlying device is exported
via q->lld_busy_fn().  So the request stacking driver can check it
and stop dispatching requests if busy.

The underlying device driver must return the busy state appropriately:
    1: when the device driver can't process requests immediately.
    0: when the device driver can process requests immediately,
       including abnormal situations where the device driver needs
       to kill all requests.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:20 +02:00
Sven Schuetz
c0ddffa84a include blktrace_api.h in headers_install
This header file is of interest for user space programming, i.e.
for tools that process blktrace data.

We would like to use it for a tool on-top of blktrace which processes
data provided by blktrace. For this purpose, it would be helpful
if the blktrace API would make it to /usr/include/linux.

The git tree for the blktrace tools comes with its own copy of this header
file. I didn't manage to replace that copy with the file generated
by the patch below yet. A few more cleanups would be needed.
For example, the blktrace ioctl numbers, which are currently defined in
usr/include/fs.h, might need to be moved. Should be feasible, though.

Signed-off-by: Sven Schuetz <sven@linux.vnet.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
Jens Axboe
8bff7c6b0f libata: set queue SSD flag for SSD devices
SSD devices should give an RPM setting of 1 in word 217 of the ID
page. If we see such a device, tell the block layer about it.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
Jens Axboe
a68bbddba4 block: add queue flag for SSD/non-rotational devices
We don't want to idle in AS/CFQ if the device doesn't have a seek
penalty. So add a QUEUE_FLAG_NONROT to indicate a non-rotational
device, low level drivers should set this flag upon discovery of
an SSD or similar device type.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
Keith Wansbrough
9e49184c82 floppy: support arbitrary first-sector numbers
The current floppy_struct allows floppies to number sectors starting
from 0 or 1.  This patch allows arbitrary first-sector numbers - for
example, 0xC1 for Amstrad CPC disks.

This extends the existing 1-bit field (FD_ZEROBASED, bit 2 of stretch)
to 8 bits (FD_SECTMASK, bits 2 to 9).

Currently 0x00 denotes a first sector number of 1, and 0x01 denotes a
first sector number of 0.  We extend this by interpreting FD_SECTMASK
as the first sector number with the LSB flipped.

Signed-off-by: Keith Wansbrough <keith@lochan.org>
Cc: Alain Knaff <alain@linux.lu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
Kiyoshi Ueda
4ee5eaf451 block: add a queue flag for request stacking support
This patch adds a queue flag to indicate the block device can be
used for request stacking.

Request stacking drivers need to stack their devices on top of
only devices of which q->request_fn is functional.
Since bio stacking drivers (e.g. md, loop) basically initialize
their queue using blk_alloc_queue() and don't set q->request_fn,
the check of (q->request_fn == NULL) looks enough for that purpose.

However, dm will become both types of stacking driver (bio-based and
request-based).  And dm will always set q->request_fn even if the dm
device is bio-based of which q->request_fn is not functional actually.
So we need something else to distinguish the type of the device.
Adding a queue flag is a solution for that.

The reason why dm always sets q->request_fn is to keep
the compatibility of dm user-space tools.
Currently, all dm user-space tools are using bio-based dm without
specifying the type of the dm device they use.
To use request-based dm without changing such tools, the kernel
must decide the type of the dm device automatically.
The automatic type decision can't be done at the device creation time
and needs to be deferred until such tools load a mapping table,
since the actual type is decided by dm target type included in
the mapping table.

So a dm device has to be initialized using blk_init_queue()
so that we can load either type of table.
Then, all queue stuffs are set (e.g. q->request_fn) and we have
no element to distinguish that it is bio-based or request-based,
even after a table is loaded and the type of the device is decided.

By the way, some stuffs of the queue (e.g. request_list, elevator)
are needless when the dm device is used as bio-based.
But the memory size is not so large (about 20[KB] per queue on ia64),
so I hope the memory loss can be acceptable for bio-based dm users.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:18 +02:00
Kiyoshi Ueda
82124d6035 block: add request submission interface
This patch adds blk_insert_cloned_request(), a generic request
submission interface for request stacking drivers.
Request-based dm will use it to submit their clones to underlying
devices.

blk_rq_check_limits() is also added because it is possible that
the lower queue has stronger limitations than the upper queue
if multiple drivers are stacking at request-level.
Not only for blk_insert_cloned_request()'s internal use, the function
will be used by request-based dm when the queue limitation is
modified (e.g. by replacing dm's table).

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:18 +02:00
Kiyoshi Ueda
32fab448e5 block: add request update interface
This patch adds blk_update_request(), which updates struct request
with completing its data part, but doesn't complete the struct
request itself.
Though it looks like end_that_request_first() of older kernels,
blk_update_request() should be used only by request stacking drivers.

Request-based dm will use it in bio->bi_end_io callback to update
the original request when a data part of a cloned request completes.
Followings are additional background information of why request-based
dm needs this interface.

  - Request stacking drivers can't use blk_end_request() directly from
    the lower driver's completion context (bio->bi_end_io or rq->end_io),
    because some device drivers (e.g. ide) may try to complete
    their request with queue lock held, and it may cause deadlock.
    See below for detailed description of possible deadlock:
    <http://marc.info/?l=linux-kernel&m=120311479108569&w=2>

  - To solve that, request-based dm offloads the completion of
    cloned struct request to softirq context (i.e. using
    blk_complete_request() from rq->end_io).

  - Though it is possible to use the same solution from bio->bi_end_io,
    it will delay the notification of bio completion to the original
    submitter.  Also, it will cause inefficient partial completion,
    because the lower driver can't perform the cloned request anymore
    and request-based dm needs to requeue and redispatch it to
    the lower driver again later.  That's not good.

  - So request-based dm needs blk_update_request() to perform the bio
    completion in the lower driver's completion context, which is more
    efficient.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:18 +02:00
Jens Axboe
9c02f2b02e block: cleanup some of the integrity stuff in blkdev.h
Don't put functions that are only used in fs/bio-integrity.c in
blkdev.h, it's much cleaner to just keep it in there. Also kill
completely unused bdev_get_tag_size()

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:17 +02:00
Jens Axboe
581d4e28d9 block: add fault injection mechanism for faking request timeouts
Only works for the generic request timer handling. Allows one to
sporadically ignore request completions, thus exercising the timeout
handling.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:17 +02:00
Jens Axboe
0a0d96b03a block: add bio_kmalloc()
Not all callers need (or want!) the mempool backing guarentee, it
essentially means that you can only use bio_alloc() for short allocations
and not for preallocating some bio's at setup or init time.

So add bio_kmalloc() which does the same thing as bio_alloc(), except
it just uses kmalloc() as the backing instead of the bio mempools.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:17 +02:00
Hugh Dickins
3e6053d76d block: adjust blkdev_issue_discard for swap
Two mods to blkdev_issue_discard(), thinking ahead to its use on swap:

1. Add gfp_mask argument, so swap allocation can use it where GFP_KERNEL
   might deadlock but GFP_NOIO is safe.

2. Enlarge nr_sects argument from unsigned to sector_t: unsigned long is
   enough to cover a whole swap area, but sector_t suits any partition.

Change sb_issue_discard()'s nr_blocks to sector_t too; but no need seen
for a gfp_mask there, just pass GFP_KERNEL down to blkdev_issue_discard().

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:17 +02:00
Mike Anderson
11914a53d2 block: Add interface to abort queued requests
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
Jens Axboe
242f9dcb8b block: unify request timeout handling
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
Andrew Patterson
c3279d1454 Adjust block device size after an online resize of a disk.
The revalidate_disk routine now checks if a disk has been resized by
comparing the gendisk capacity to the bdev inode size.  If they are
different (usually because the disk has been resized underneath the kernel)
the bdev inode size is adjusted to match the capacity.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
Andrew Patterson
0c002c2f74 Wrapper for lower-level revalidate_disk routines.
This is a wrapper for the lower-level revalidate_disk call-backs such
as sd_revalidate_disk(). It allows us to perform pre and post
operations when calling them.

We will use this wrapper in a later patch to adjust block device sizes
after an online resize (a _post_ operation).

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:12 +02:00
FUJITA Tomonori
818827669d block: make blk_rq_map_user take a NULL user-space buffer
This patch changes blk_rq_map_user to accept a NULL user-space buffer
with a READ command if rq_map_data is not NULL. Thus a caller can pass
page frames to lk_rq_map_user to just set up a request and bios with
page frames propely. bio_uncopy_user (called via blk_rq_unmap_user)
doesn't copy data to user space with such request.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
FUJITA Tomonori
879040742c block: add blk_rq_aligned helper function
This adds blk_rq_aligned helper function to see if alignment and
padding requirement is satisfied for DMA transfer. This also converts
blk_rq_map_kern and __blk_rq_map_user to use the helper function.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:11 +02:00
FUJITA Tomonori
152e283fdf block: introduce struct rq_map_data to use reserved pages
This patch introduces struct rq_map_data to enable bio_copy_use_iov()
use reserved pages.

Currently, bio_copy_user_iov allocates bounce pages but
drivers/scsi/sg.c wants to allocate pages by itself and use
them. struct rq_map_data can be used to pass allocated pages to
bio_copy_user_iov.

The current users of bio_copy_user_iov simply passes NULL (they don't
want to use pre-allocated pages).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
FUJITA Tomonori
a3bce90edd block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
Currently, blk_rq_map_user and blk_rq_map_user_iov always do
GFP_KERNEL allocation.

This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
so sg can use it (sg always does GFP_ATOMIC allocation).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:10 +02:00
Jens Axboe
ab780f1ece block: inherit CPU completion on bio->rq and rq->rq merges
Somewhat incomplete, as we do allow merges of requests and bios
that have different completion CPUs given. This is done on the
assumption that a larger IO is still more beneficial than CPU
locality.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
c7c22e4d5c block: add support for IO CPU affinity
This patch adds support for controlling the IO completion CPU of
either all requests on a queue, or on a per-request basis. We export
a sysfs variable (rq_affinity) which, if set, migrates completions
of requests to the CPU that originally submitted it. A bio helper
(bio_set_completion_cpu()) is also added, so that queuers can ask
for completion on that specific CPU.

In testing, this has been show to cut the system time by as much
as 20-40% on synthetic workloads where CPU affinity is desired.

This requires a little help from the architecture, so it'll only
work as designed for archs that are using the new generic smp
helper infrastructure.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Jens Axboe
18887ad910 block: make kblockd_schedule_work() take the queue as parameter
Preparatory patch for checking queuing affinity.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:09 +02:00
Tejun Heo
3e1a7ff8a0 block: allow disk to have extended device number
Now that disk and partition handlings are mostly unified, it's easy to
allow disk to have extended device number.  This patch makes
add_disk() use extended device number if disk->minors is zero.  Both
sd and ide-disk are updated to use this.

* sd_format_disk_name() is implemented which can generically determine
  the drive name.  This removes disk number restriction stemming from
  limited device names.

* If sd index goes over SD_MAX_DISKS (which can be increased now BTW),
  sd simply doesn't initialize minors letting block layer choose
  extended device number.

* If CONFIG_DEBUG_EXT_DEVT is set, both sd and ide-disk always set
  minors to 0 and use extended device numbers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
689d6fac40 block: replace @ext_minors with GENHD_FL_EXT_DEVT
With previous changes, it's meaningless to limit the number of
partitions.  Replace @ext_minors with GENHD_FL_EXT_DEVT such that
setting the flag allows the disk to have maximum number of allowed
partitions (only limited by the number of entries in parsed_partitions
as determined by MAX_PART constant).

This kills not-too-pretty alloc_disk_ext[_node]() functions and makes
@minors parameter to alloc_disk[_node]() unnecessary.  The parameter
is left alone to avoid disturbing the users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
540eed5637 block: make partition array dynamic
disk->__part used to be statically allocated to the maximum possible
number of partitions.  This patch makes partition array allocation
dynamic.  The added overhead is minimal as only real change is one
memory dereference changed to RCU one.  This saves both a bit of
memory and cpu cycles iterating through unoccupied slots and makes
increasing partition limit easier.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
074a7aca7a block: move stats from disk to part0
Move stats related fields - stamp, in_flight, dkstats - from disk to
part0 and unify stat handling such that...

* part_stat_*() now updates part0 together if the specified partition
  is not part0.  ie. part_stat_*() are now essentially all_stat_*().

* {disk|all}_stat_*() are gone.

* part_round_stats() is updated similary.  It handles part0 stats
  automatically and disk_round_stats() is killed.

* part_{inc|dec}_in_fligh() is implemented which automatically updates
  part0 stats for parts other than part0.

* disk_map_sector_rcu() is updated to return part0 if no part matches.
  Combined with the above changes, this makes NULL special case
  handling in callers unnecessary.

* Separate stats show code paths for disk are collapsed into part
  stats show code paths.

* Rename disk_stat_lock/unlock() to part_stat_lock/unlock()

While at it, reposition stat handling macros a bit and add missing
parentheses around macro parameters.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
eddb2e26b5 block: kill GENHD_FL_FAIL and use part0->make_it_fail
GENHD_FL_FAIL for disk is what make_it_fail is for parts.  Kill it and
use part0->make_it_fail.  Sysfs node handling is unified too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
0762b8bde9 block: always set bdev->bd_part
Till now, bdev->bd_part is set only if the bdev was for parts other
than part0.  This patch makes bdev->bd_part always set so that code
paths don't have to differenciate common handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
4c46501d16 block: move holder_dir from disk to part0
Move disk->holder_dir to part0->holder_dir.  Kill now mostly
superflous bdev_get_holder().

While at it, kill superflous kobject_get/put() around holder_dir,
slave_dir and cmd_filter creation and collapse
disk_sysfs_add_subdirs() into register_disk().  These serve no purpose
but obfuscating the code.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo
b7db9956e5 block: move policy from disk to part0
Move disk->policy to part0->policy.  Implement and use get_disk_ro().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
e561052149 block: unify sysfs size node handling
Now that capacity and __dev are moved to part0, part0 and others can
share the same method.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
548b10eb29 block: move __dev from disk to part0
Move disk->__dev to part0->__dev.  This simplifies bdget_disk() and
lookup_devt() and allows common sysfs attributes to be unified.
part_to_disk() is updated to handle part0 -> disk.

Updated to include a fix from Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
he writes:

"part0 is a "special" partition and doesn't need to have capacity set - this
fixes regression caused by "block: move __dev from disk to part0" commit."

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
80795aefb7 block: move capacity from disk to part0
Move disk->capacity to part0->nr_sects and convert all users who
directly accessed the field to use {get|set}_capacity().  This is done
early to allow the __dev field to be moved.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
b5d0b9df0b block: introduce partition 0
genhd and partition code handled disk and partitions separately.  All
information about the whole disk was in struct genhd and partitions in
struct hd_struct.  However, the whole disk (part0) and other
partitions have a lot in common and the data structures end up having
good number of common fields and thus separate code paths doing the
same thing.  Also, the partition array was indexed by partno - 1 which
gets pretty confusing at times.

This patch introduces partition 0 and makes the partition array
indexed by partno.  Following patches will unify the handling of disk
and parts piece-by-piece.

This patch also implements disk_partitionable() which tests whether a
disk is partitionable.  With coming dynamic partition array change,
the most common usage of disk_max_parts() will be testing whether a
disk is partitionable and the number of max partitions will become
much less important.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
ed9e198234 block: implement and use {disk|part}_to_dev()
Implement {disk|part}_to_dev() and use them to access generic device
instead of directly dereferencing {disk|part}->dev.  To make sure no
user is left behind, rename generic devices fields to __dev.

This is in preparation of unifying partition 0 handling with other
partitions.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo
1f0142905d block: adjust formatting for large minors and add ext_range sysfs attr
With extended minors and the soon-to-follow debug feature, large minor
numbers for block devices will be common.  This patch does the
followings to make printouts pretty.

* Adapt print formats such that large minors don't break the
  formatting.

* For extended MAJ:MIN, %02x%02x for MAJ:MIN used in
  printk_all_partitions() doesn't cut it anymore.  Update it such that
  %03x:%05x is used if either MAJ or MIN doesn't fit in %02x.

* Implement ext_range sysfs attribute which shows total minors the
  device can use including both conventional minor space and the
  extended one.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo
bcce3de1be block: implement extended dev numbers
Implement extended device numbers.  A block driver can tell block
layer that it wants to use extended device numbers.  After the usual
minor space is used up, block layer automatically allocates devt's
from EXT_BLOCK_MAJOR.

Currently only one major number is allocated for this but as the
allocation is strictly on-demand, ~1mil minor space under it should
suffice unless the system actually has more than ~1mil partitions and
if that ever happens adding more majors to the extended devt area is
easy.

Due to internal implementation issues, the first partition can't be
allocated on the extended area.  In other words, genhd->minors should
at least be 1.  This limitation will be lifted by later changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo
c995905916 block: fix diskstats access
There are two variants of stat functions - ones prefixed with double
underbars which don't care about preemption and ones without which
disable preemption before manipulating per-cpu counters.  It's unclear
whether the underbarred ones assume that preemtion is disabled on
entry as some callers don't do that.

This patch unifies diskstats access by implementing disk_stat_lock()
and disk_stat_unlock() which take care of both RCU (for partition
access) and preemption (for per-cpu counter access).  diskstats access
should always be enclosed between the two functions.  As such, there's
no need for the versions which disables preemption.  They're removed
and double underbars ones are renamed to drop the underbars.  As an
extra argument is added, there's no danger of using the old version
unconverted.

disk_stat_lock() uses get_cpu() and returns the cpu index and all
diskstat functions which access per-cpu counters now has @cpu
argument to help RT.

This change adds RCU or preemption operations at some places but also
collapses several preemption ops into one at others.  Overall, the
performance difference should be negligible as all involved ops are
very lightweight per-cpu ones.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo
e71bf0d0ee block: fix disk->part[] dereferencing race
disk->part[] is protected by its matching bdev's lock.  However,
non-critical accesses like collecting stats and printing out sysfs and
proc information used to be performed without any locking.  As
partitions can come and go dynamically, partitions can go away
underneath those non-critical accesses.  As some of those accesses are
writes, this theoretically can lead to silent corruption.

This patch fixes the race by using RCU for the partition array and dev
reference counter to hold partitions.

* Rename disk->part[] to disk->__part[] to make sure no one outside
  genhd layer proper accesses it directly.

* Use RCU for disk->__part[] dereferencing.

* Implement disk_{get|put}_part() which can be used to get and put
  partitions from gendisk respectively.

* Iterators are implemented to help iterate through all partitions
  safely.

* Functions which require RCU readlock are marked with _rcu suffix.

* Use disk_put_part() in __blkdev_put() instead of directly putting
  the contained kobject.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo
f331c0296f block: don't depend on consecutive minor space
* Implement disk_devt() and part_devt() and use them to directly
  access devt instead of computing it from ->major and ->first_minor.

  Note that all references to ->major and ->first_minor outside of
  block layer is used to determine devt of the disk (the part0) and as
  ->major and ->first_minor will continue to represent devt for the
  disk, converting these users aren't strictly necessary.  However,
  convert them for consistency.

* Implement disk_max_parts() to avoid directly deferencing
  genhd->minors.

* Update bdget_disk() such that it doesn't assume consecutive minor
  space.

* Move devt computation from register_disk() to add_disk() and make it
  the only one (all other usages use the initially determined value).

These changes clean up the code and will help disk->part dereference
fix and extended block device numbers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:05 +02:00
Tejun Heo
cf771cb5a7 block: make variable and argument names more consistent
In hd_struct, @partno is used to denote partition number and a number
of other places use @part to denote hd_struct.  Functions use @part
and @index instead.  This causes confusion and makes it difficult to
use consistent variable names for hd_struct.  Always use @partno if a
variable represents partition number.

Also, print out functions use @f or @part for seq_file argument.  Use
@seqf uniformly instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:05 +02:00
Tejun Heo
310a2c1012 block: misc updates
This patch makes the following misc updates in preparation for
disk->part dereference fix and extended block devt support.

* implment part_to_disk()

* fix comment about gendisk->part indexing

* rename get_part() to disk_map_sector()

* don't use n which is always zero while printing disk information in
  diskstats_show()

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:04 +02:00
Tejun Heo
5a3ceb8616 driver-core: use klist for class device list and implement iterator
Iterating over entries using callback usually isn't too fun especially
when the entry being iterated over can't be manipulated freely.  This
patch converts class->p->class_devices to klist and implements class
device iterator so that the users can freely build their own control
structure.  The users are also free to call back into class code
without worrying about locking.

class_for_each_device() and class_find_device() are converted to use
the new iterators, so their users don't have to worry about locking
anymore either.

Note: This depends on klist-dont-iterate-over-deleted-entries patch
because class_intf->add/remove_dev() depends on proper synchronization
with device removal.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:04 +02:00
Tejun Heo
a1ed5b0cff klist: don't iterate over deleted entries
A klist entry is kept on the list till all its current iterations are
finished; however, a new iteration after deletion also iterates over
deleted entries as long as their reference count stays above zero.
This causes problems for cases where there are users which iterate
over the list while synchronized against list manipulations and
natuarally expect already deleted entries to not show up during
iteration.

This patch implements dead flag which gets set on deletion so that
iteration can skip already deleted entries.  The dead flag piggy backs
on the lowest bit of knode->n_klist and only visible to klist
implementation proper.

While at it, drop klist_iter->i_head as it's redundant and doesn't
offer anything in semantics or performance wise as klist_iter->i_klist
is dereferenced on every iteration anyway.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:04 +02:00
Jens Axboe
5b99c2ffa9 block: make bi_phys_segments an unsigned int instead of short
raid5 can overflow with more than 255 stripes, and we can increase it
to an int for free on both 32 and 64-bit archs due to the padding.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:03 +02:00
Mikulas Patocka
5df97b91b5 drop vmerge accounting
Remove hw_segments field from struct bio and struct request. Without virtual
merge accounting they have no purpose.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:03 +02:00
Mikulas Patocka
b8b3e16cfe block: drop virtual merging accounting
Remove virtual merge accounting.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:03 +02:00
Fernando Luis Vázquez Cao
766ca4428d virtio_blk: use a wrapper function to access io context information of IO requests
struct request has an ioprio member but it is never updated because
currently bios do not hold io context information. The implication of
this is that virtio_blk ends up passing useless information to the
backend driver.

That said, some IO schedulers such as CFQ do store io context
information in struct request, but use private members for that, which
means that that information cannot be directly accessed in a IO
scheduler-independent way.

This patch adds a function to obtain the ioprio of a request. We should
avoid accessing ioprio directly and use this function instead, so that
its users do not have to care about future changes in block layer
structures or what the currently active IO controller is.

This patch does not introduce any functional changes but paves the way
for future clean-ups and enhancements.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
David Woodhouse
1a8e2bddd5 Kill REQ_TYPE_FLUSH
It was only used by ps3disk, and it should probably have been
REQ_TYPE_LINUX_BLOCK + REQ_LB_OP_FLUSH.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
David Woodhouse
e17fc0a1cc Allow elevators to sort/merge discard requests
But blkdev_issue_discard() still emits requests which are interpreted as
soft barriers, because naïve callers might otherwise issue subsequent
writes to those same sectors, which might cross on the queue (if they're
reallocated quickly enough).

Callers still _can_ issue non-barrier discard requests, but they have to
take care of queue ordering for themselves.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
David Woodhouse
d30a2605be Add BLKDISCARD ioctl to allow userspace to discard sectors
We may well want mkfs tools to use this to mark the whole device as
unwanted before they format it, for example.

The ioctl takes a pair of uint64_ts, which are start offset and length
in _bytes_. Although at the moment it might make sense for them both to
be in 512-byte sectors, I don't want to limit the ABI to that.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
David Woodhouse
27b29e86bf blktrace: support discard requests
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:01 +02:00
David Woodhouse
eae9acd13a Support 'discard sectors' operation in translation layer support core
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:01 +02:00
David Woodhouse
fb2dce862d Add 'discard' request handling
Some block devices benefit from a hint that they can forget the contents
of certain sectors. Add basic support for this to the block core, along
with a 'blkdev_issue_discard()' helper function which issues such
requests.

The caller doesn't get to provide an end_io functio, since
blkdev_issue_discard() will automatically split the request up into
multiple bios if appropriate. Neither does the function wait for
completion -- it's expected that callers won't care about when, or even
_if_, the request completes. It's only a hint to the device anyway. By
definition, the file system doesn't _care_ about these sectors any more.

[With feedback from OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> and
Jens Axboe <jens.axboe@oracle.com]

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:01 +02:00
David Woodhouse
d628eaef31 Fix up comments about matching flags between bio and rq
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:01 +02:00
Jens Axboe
a9c701e594 block: use bio_has_data() to check for data carrying bio
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:00 +02:00
Jens Axboe
7a67f63b32 block: add bio_has_data() to detect whether a bio carries data or not
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:00 +02:00
Lennert Buytenhek
396138f03f dsa: add support for Trailer tagging format
This adds support for the Trailer switch tagging format.  This is
another tagging that doesn't explicitly mark tagged packets with a
distinct ethertype, so that we need to add a similar hack in the
receive path as for the Original DSA tagging format.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Byron Bradley <byron.bbradley@gmail.com>
Tested-by: Tim Ellis <tim.ellis@mac.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:24:16 -07:00
Lennert Buytenhek
cf85d08fdf dsa: add support for original DSA tagging format
Most of the DSA switches currently in the field do not support the
Ethertype DSA tagging format that one of the previous patches added
support for, but only the original DSA tagging format.

The original DSA tagging format carries the same information as the
Ethertype DSA tagging format, but with the difference that it does not
have an ethertype field.  In other words, when receiving a packet that
is tagged with an original DSA tag, there is no way of telling in
eth_type_trans() that this packet is in fact a DSA-tagged packet.

This patch adds a hook into eth_type_trans() which is only compiled in
if support for a switch chip that doesn't support Ethertype DSA is
selected, and which checks whether there is a DSA switch driver
instance attached to this network device which uses the old tag format.
If so, it sets the protocol field to ETH_P_DSA without looking at the
packet, so that the packet ends up in the right place.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Nicolas Pitre <nico@marvell.com>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:19:56 -07:00
Lennert Buytenhek
91da11f870 net: Distributed Switch Architecture protocol support
Distributed Switch Architecture is a protocol for managing hardware
switch chips.  It consists of a set of MII management registers and
commands to configure the switch, and an ethernet header format to
signal which of the ports of the switch a packet was received from
or is intended to be sent to.

The switches that this driver supports are typically embedded in
access points and routers, and a typical setup with a DSA switch
looks something like this:

	+-----------+       +-----------+
	|           | RGMII |           |
	|           +-------+           +------ 1000baseT MDI ("WAN")
	|           |       |  6-port   +------ 1000baseT MDI ("LAN1")
	|    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
	|           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
	|           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
	|           |       |           |
	+-----------+       +-----------+

The switch driver presents each port on the switch as a separate
network interface to Linux, polls the switch to maintain software
link state of those ports, forwards MII management interface
accesses to those network interfaces (e.g. as done by ethtool) to
the switch, and exposes the switch's hardware statistics counters
via the appropriate Linux kernel interfaces.

This initial patch supports the MII management interface register
layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and
supports the "Ethertype DSA" packet tagging format.

(There is no officially registered ethertype for the Ethertype DSA
packet format, so we just grab a random one.  The ethertype to use
is programmed into the switch, and the switch driver uses the value
of ETH_P_EDSA for this, so this define can be changed at any time in
the future if the one we chose is allocated to another protocol or
if Ethertype DSA gets its own officially registered ethertype, and
everything will continue to work.)

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Nicolas Pitre <nico@marvell.com>
Tested-by: Byron Bradley <byron.bbradley@gmail.com>
Tested-by: Tim Ellis <tim.ellis@mac.com>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:15:19 -07:00
Lennert Buytenhek
2e88810329 phylib: add mdiobus_{read,write}
Add mdiobus_{read,write} routines to allow direct reading/writing
of registers on an mii bus without having to go through the PHY
abstraction, and make phy_{read,write} use these primitives.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 16:38:41 -07:00
Lennert Buytenhek
46abc02175 phylib: give mdio buses a device tree presence
Introduce the mdio_bus class, and give each 'struct mii_bus' its own
'struct device', so that mii_bus objects are represented in the device
tree and can be found by querying the device tree.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 16:33:40 -07:00
Lennert Buytenhek
298cf9beb9 phylib: move to dynamic allocation of struct mii_bus
This patch introduces mdiobus_alloc() and mdiobus_free(), and
makes all mdio bus drivers use these functions to allocate their
struct mii_bus'es dynamically.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Andy Fleming <afleming@freescale.com>
2008-10-08 16:29:57 -07:00
Lennert Buytenhek
18ee49ddb0 phylib: rename mii_bus::dev to mii_bus::parent
In preparation of giving mii_bus objects a device tree presence of
their own, rename struct mii_bus's ->dev argument to ->parent, since
having a 'struct device *dev' that points to our parent device
conflicts with introducing a 'struct device dev' representing our own
device.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Andy Fleming <afleming@freescale.com>
2008-10-08 16:27:49 -07:00
J. Bruce Fields
107e0008df Merge branch 'from-tomtucker' into for-2.6.28 2008-10-08 18:22:18 -04:00
Ingo Molnar
cdbb92b31d Merge branch 'linus' into core/rcu 2008-10-09 00:17:25 +02:00
David S. Miller
4dd565134e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/e1000e/ich8lan.c
	drivers/net/e1000e/netdev.c
2008-10-08 14:56:41 -07:00
Trond Myklebust
19d771f3ca NFS: Save padding bytes in struct nfs4_setclientid
Peter Staubach suggested reducing NFS4_SETCLIENTID_NAMELEN by one byte so
as to avoid 7 bytes of unnecessary padding.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-08 13:54:52 -04:00
David S. Miller
364ae953a4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2008-10-08 09:50:38 -07:00
Jan Engelhardt
916a917dfe netfilter: xtables: provide invoked family value to extensions
By passing in the family through which extensions were invoked, a bit
of data space can be reclaimed. The "family" member will be added to
the parameter structures and the check functions be adjusted.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:20 +02:00
Jan Engelhardt
a2df1648ba netfilter: xtables: move extension arguments into compound structure (6/6)
This patch does this for target extensions' destroy functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt
af5d6dc200 netfilter: xtables: move extension arguments into compound structure (5/6)
This patch does this for target extensions' checkentry functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt
7eb3558655 netfilter: xtables: move extension arguments into compound structure (4/6)
This patch does this for target extensions' target functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt
6be3d8598e netfilter: xtables: move extension arguments into compound structure (3/6)
This patch does this for match extensions' destroy functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt
9b4fce7a35 netfilter: xtables: move extension arguments into compound structure (2/6)
This patch does this for match extensions' checkentry functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:18 +02:00
Jan Engelhardt
f7108a20de netfilter: xtables: move extension arguments into compound structure (1/6)
The function signatures for Xtables extensions have grown over time.
It involves a lot of typing/replication, and also a bit of stack space
even if they are not used. Realize an NFWS2008 idea and pack them into
structs. The skb remains outside of the struct so gcc can continue to
apply its optimizations.

This patch does this for match extensions' match functions.

A few ambiguities have also been addressed. The "offset" parameter for
example has been renamed to "fragoff" (there are so many different
offsets already) and "protoff" to "thoff" (there is more than just one
protocol here, so clarify).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:18 +02:00
Jan Engelhardt
367c679007 netfilter: xtables: do centralized checkentry call (1/2)
It used to be that {ip,ip6,etc}_tables called extension->checkentry
themselves, but this can be moved into the xtables core.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:17 +02:00
Jan Engelhardt
66bff35b72 netfilter: remove unused Ebtables functions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:16 +02:00
Jan Engelhardt
043ef46c76 netfilter: move Ebtables to use Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:15 +02:00
Jan Engelhardt
2d06d4a5cc netfilter: change Ebtables function signatures to match Xtables's
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:15 +02:00
Jan Engelhardt
001a18d369 netfilter: add dummy members to Ebtables code to ease transition to Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:14 +02:00
Jan Engelhardt
0ac6ab1f79 netfilter: Change return types of targets/watchers for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:13 +02:00
Jan Engelhardt
8cc784eec6 netfilter: change return types of match functions for ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:13 +02:00
Jan Engelhardt
19eda879a1 netfilter: change return types of check functions for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:13 +02:00
Jan Engelhardt
18219d3f7d netfilter: ebtables: do centralized size checking
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:13 +02:00
KOVACS Krisztian
e84392707e netfilter: iptables TPROXY target
The TPROXY target implements redirection of non-local TCP/UDP traffic to local
sockets. Additionally, it's possible to manipulate the packet mark if and only
if a socket has been found. (We need this because we cannot use multiple
targets in the same iptables rule.)

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:12 +02:00
Alexey Dobriyan
3bb0d1c00f netfilter: netns nf_conntrack: GRE conntracking in netns
* make keymap list per-netns
* per-netns keymal lock (not strictly necessary)
* flush keymap at netns stop and module unload.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:10 +02:00
Alexey Dobriyan
48dc7865aa netfilter: netns: remove nf_*_net() wrappers
Now that dev_net() exists, the usefullness of them is even less. Also they're
a big problem in resolving circular header dependencies necessary for
NOTRACK-in-netns patch. See below.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:01 +02:00
Jan Engelhardt
7e9c6eeb13 netfilter: Introduce NFPROTO_* constants
The netfilter subsystem only supports a handful of protocols (much
less than PF_*) and even non-PF protocols like ARP and
pseudo-protocols like PF_BRIDGE. By creating NFPROTO_*, we can earn a
few memory savings on arrays that previously were always PF_MAX-sized
and keep the pseudo-protocols to ourselves.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:00 +02:00
Jan Engelhardt
e948b20a71 netfilter: rename ipt_recent to xt_recent
Like with other modules (such as ipt_state), ipt_recent.h is changed
to forward definitions to (IOW include) xt_recent.h, and xt_recent.c
is changed to use the new constant names.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:00 +02:00
Jan Engelhardt
76108cea06 netfilter: Use unsigned types for hooknum and pf vars
and (try to) consistently use u_int8_t for the L3 family.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:00 +02:00
Ingo Molnar
990d0f2ced Merge branches 'sched/devel', 'sched/cpu-hotplug', 'sched/cpusets' and 'sched/urgent' into sched/core 2008-10-08 11:31:02 +02:00
Rami Rosen
b8bae41ed6 ipv4: add mc_count to in_device.
This patch add mc_count to struct in_device and updates
increment/decrement/initilaize of this field in IPv4 and in IPv6.

- Also printing the vfs /proc entry (/proc/net/igmp) is adjusted to
use the new mc_count.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 15:34:37 -07:00
Chuck Lever
d1ce02e168 NFS: SETCLIENTID truncates client ID and netid
The sc_name field is currently 56 bytes long.  This is not large enough
to hold a pair of IPv6 addresses, the authentication type, the protocol
name, and a uniquifier number.  The maximum possible size of the name
string using IPv6 addresses is just under 110 bytes, so I increased the
size of the sc_name field to accomodate this maximum.

In addition, the strings in the nfs4_setclientid structure are
constructed with scnprintf(), which wants to terminate its output with
'\0'.  The sc_netid field was large enough only for a three byte netid
string and a '\0' so inet6 netids were being truncated.  Perhaps we
don't need the overhead of scnprintf() to do a simple string copy, but
I fixed this by increasing the size of the buffer by one byte.

Since all three of the string buffers in nfs4_setclientid are
constructed with scnprintf(), I increased the size of all three by one
byte to document the requirement, although I don't think either the
universal address field or the name field will be so small that these
strings get truncated in this way.

The size of the Linux client's client ID on the wire will be larger
than before.  RFC 3530 suggests the size limit for client IDs is 1024,
and we are still well below that.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-07 18:18:48 -04:00
Richard Kennedy
9fa8d66f1e NFS: remove 8 bytes of padding from struct nfs_fattr on 64 bit builds
remove 8 bytes of padding from struct nfs_fattr on 64 bit builds

This also removes padding from several nfs structures, including
16 bytes from  nfs4_opendata, nfs4_createdata,nfs3_createdata
& 8 bytes from nfs_read_data,nfs_write_data,nfs_removeres,nfs4_closedata

This also reduces the reported stack usage of many nfs functions (30+).

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
----

This patch is against the latest git 2.6.27-rc4.
I've built & run this on my AMD64 desktop, & successfully run _simple_
tests with a  64 bit client => 32 bit server & 32 bit client to 64 bit
server.

On fedora with gcc (GCC) 4.3.0 20080428 (Red Hat 4.3.0-8) checkpatch
reports 33 functions with reduced stack usage.
e.g.
__nfs_revalidate_inode [nfs] 216 => 200
_nfs4_proc_access [nfs] 304 => 288
_nfs4_proc_link [nfs] 536 => 504
_nfs4_proc_remove [nfs] 304 => 288
_nfs4_proc_rename [nfs] 584 => 552
nfs3_proc_access [nfs] 272 => 256
nfs3_proc_getacl [nfs] 384 => 368
nfs3_proc_link [nfs] 496 => 464
etc
I can supply the complete list if anyone is interested.

regards
Richard
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-07 18:17:51 -04:00
Trond Myklebust
691beb13cd NFS: Allow concurrent inode revalidation
Currently, if two processes are both trying to revalidate metadata for the
same inode, they will find themselves being serialised. There is no good
justification for this now that we have improved our ability to detect
stale attribute data, so we should remove that serialisation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-07 17:59:43 -04:00
Ilpo Järvinen
33f5f57eeb tcp: kill pointless urg_mode
It all started from me noticing that this urgent check in
tcp_clean_rtx_queue is unnecessarily inside the loop. Then
I took a longer look to it and found out that the users of
urg_mode can trivially do without, well almost, there was
one gotcha.

Bonus: those funny people who use urg with >= 2^31 write_seq -
snd_una could now rejoice too (that's the only purpose for the
between being there, otherwise a simple compare would have done
the thing). Not that I assume that the rest of the tcp code
happily lives with such mind-boggling numbers :-). Alas, it
turned out to be impossible to set wmem to such numbers anyway,
yes I really tried a big sendfile after setting some wmem but
nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf...
So I hacked a bit variable to long and found out that it seems
to work... :-)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 14:43:06 -07:00
Peter Zijlstra
654bed16cf net: packet split receive api
Add some packet-split receive hooks.

For one this allows to do NUMA node affine page allocs. Later on these
hooks will be extended to do emergency reserve allocations for
fragments.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07 14:22:33 -07:00
Trond Myklebust
4eec952e42 NFS: Add options for finer control of the lookup cache
Add the flag NFS_MOUNT_LOOKUP_CACHE_NONEG to turn off the caching of
negative dentries. In reality what we do is to force
nfs_lookup_revalidate() to always discard negative dentries.

Add the flag NFS_MOUNT_LOOKUP_CACHE_NONE for enforcing stricter
revalidation of dentries. It forces the revalidate code to always do a
lookup instead of just checking the cached mtime of the parent directory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-07 17:22:20 -04:00
Russell King
cc513ac0f2 Merge branch 'viper-for-rmk' of git://www.misterjones.org/linux-2.6-arm
Merge branch 'pxa-viper' into pxa-machines

Conflicts:

	arch/arm/mach-pxa/Makefile
	drivers/pcmcia/Kconfig
	drivers/pcmcia/Makefile
2008-10-07 19:08:32 +01:00
Russell King
1543966a07 Merge branch 'pxa-palm' into pxa-machines
Conflicts:

	drivers/mfd/Kconfig
	drivers/pcmcia/Makefile
2008-10-07 19:07:22 +01:00
Arjan van de Ven
2075eb8d95 rangetimer: fix x86 build failure for the !HRTIMERS case
the timer peek function was on the wrong side of an ifdef,
breaking for the !HRTIMERs case. Just provide an empty inline
for that case since it doesn't make sense in that scenario.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-10-07 10:57:54 -07:00
Geert Uytterhoeven
1b483a6a7b powerpc: Remove remains of /proc/ppc_htab
commit 14cf11af6c ("powerpc: Merge enough to
start building in arch/powerpc.") unwired /proc/ppc_htab, and commit
917f0af9e5 ("powerpc: Remove arch/ppc and
include/asm-ppc") removed the rest of the /proc/ppc_htab support, but there are
still a few references left. Kill them for good.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-10-07 14:26:19 +11:00
Trond Myklebust
1daef0a868 NFS: Clean up nfs_sb_active/nfs_sb_deactive
Instead of causing umount requests to block on server->active_wq while the
asynchronous sillyrename deletes are executing, we can use the sb->s_active
counter to obtain a reference to the super_block, and then release that
reference in nfs_async_unlink_release().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-06 20:08:26 -04:00
Tom Tucker
146b6df6a5 svcrdma: Modify the RPC recv path to use FRMR when available
RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using
an FRMR to map the data sink improves NFSRDMA security on transports that
place the RDMA_READ data sink LKEY on the wire because the valid lifetime
of the MR is only the duration of the RDMA_READ. The LKEY is invalidated
when the last RDMA_READ WR completes.

Mapping the data sink also allows for very large amounts to data to be
fetched with a single WR, so if the client is also using FRMR, the entire
RPC read-list can be fetched with a single WR.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-10-06 14:46:01 -05:00
Tom Tucker
e118321062 svcrdma: Add a service to register a Fast Reg MR with the device
Fast Reg MR introduces a new WR type. Add a service to register the
region with the adapter and update the completion handling to support
completions with a NULL WR context.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-10-06 14:45:49 -05:00
Tom Tucker
64be8608c1 svcrdma: Add FRMR get/put services
Add services for the allocating, freeing, and unmapping Fast Reg MR. These
services will be used by the transport connection setup, send and receive
routines.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-10-06 14:45:18 -05:00
Thomas Gleixner
1e19b16a30 AMD IOMMU: use iommu_device_max_index, fix
include/linux/iommu-helper.h has no header guards, which breaks
sparc64 build. Add them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-06 16:51:30 +02:00
Ingo Molnar
2c10c22af0 Merge branch 'linus' into sched/devel 2008-10-06 08:13:18 +02:00
Arnaud Ebalard
13c1d18931 xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)
Provides implementation of the enhancements of XFRM/PF_KEY MIGRATE mechanism
specified in draft-ebalard-mext-pfkey-enhanced-migrate-00. Defines associated
PF_KEY SADB_X_EXT_KMADDRESS extension and XFRM/netlink XFRMA_KMADDRESS
attribute.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05 13:33:42 -07:00
Rémi Denis-Courmont
02a47617cd Phonet: implement GPRS virtual interface over PEP socket
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05 11:16:16 -07:00
Rémi Denis-Courmont
9641458d3e Phonet: Pipe End Point for Phonet Pipes protocol
This protocol provides some connection handling and negotiated
congestion control. Nokia cellular modems use it for bulk transfers.
It provides packet boundaries (hence SOCK_SEQPACKET). Congestion
control is per packet rather per byte, so we do not re-use the
generic socket memory accounting.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05 11:15:13 -07:00
Borislav Petkov
f20f258603 ide-cd: temporary tray close fix
This one fixes http://bugzilla.kernel.org/show_bug.cgi?id=11602.

A more generic fix for drives which cannot autoclose tray will follow.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
[bart: add an extra parentheses for consistency with the rest of kernel code]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-05 18:23:27 +02:00
Chuck Lever
2937391385 NLM: Remove unused argument from svc_addsock() function
Clean up: The svc_addsock() function no longer uses its "proto"
argument, so remove it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-04 17:12:27 -04:00
Chuck Lever
26a4140923 NLM: Remove "proto" argument from lockd_up()
Clean up: Now that lockd_up() starts listeners for both transports, the
"proto" argument is no longer needed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-04 17:12:27 -04:00
Andrew Morton
897312bd24 include/linux/stacktrace.h: declare struct task_struct
include/linux/stacktrace.h:13: warning:
 'struct task_struct' declared inside parameter list

(This might be a hard error on sparc64, which uses this header and has
-Werror)

Reported-by: "Randy.Dunlap" <rdunlap@xenotime.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-03 18:22:18 -07:00
Josef Bacik
68c9d702bb generic block based fiemap implementation
Any block based fs (this patch includes ext3) just has to declare its own
fiemap() function and then call this generic function with its own
get_block_t. This works well for block based filesystems that will map
multiple contiguous blocks at one time, but will work for filesystems that
only map one block at a time, you will just end up with an "extent" for each
block. One gotcha is this will not play nicely where there is hole+data
after the EOF. This function will assume its hit the end of the data as soon
as it hits a hole after the EOF, so if there is any data past that it will
not pick that up. AFAIK no block based fs does this anyway, but its in the
comments of the function anyway just in case.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
2008-10-03 17:32:43 -04:00
Mark Fasheh
c4b929b85b vfs: vfs-level fiemap interface
Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
inode operation.

Userspace can get extent information on a file via fiemap ioctl. As input,
the fiemap ioctl takes a struct fiemap which includes an array of struct
fiemap_extent (fm_extents). Size of the extent array is passed as
fm_extent_count and number of extents returned will be written into
fm_mapped_extents. Offset and length fields on the fiemap structure
(fm_start, fm_length) describe a logical range which will be searched for
extents. All extents returned will at least partially contain this range.
The actual extent offsets and ranges returned will be unmodified from their
offset and range on-disk.

The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
is set. If errno is equal to EBADR, then fm_flags will contain those flags
which were passed in which the kernel did not understand. On all other
errors, the contents of fm_extents is undefined.

As fiemap evolved, there have been many authors of the vfs patch. As far as
I can tell, the list includes:
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>
Eric Sandeen <sandeen@redhat.com>
Mark Fasheh <mfasheh@suse.com>

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: linux-api@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
2008-10-08 19:44:18 -04:00
Chuck Lever
9a38a83880 lockd: Remove unused fields in the nlm_reboot structure
The nlm_reboot structure is used to store information provided by the
NSM_NOTIFY procedure.  This procedure is not specified by the NLM or NSM
protocols, other than to say that the procedure can be used to transmit
information private to a particular NLM/NSM implementation.

For Linux, the callback arguments include the name of the monitored host,
the new NSM state of the host, and a 16-byte private opaque.

As a clean up, remove the unused fields and the server-side XDR logic that
decodes them.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
b85e467634 lockd: Add helper to sanity check incoming NOTIFY requests
lockd accepts SM_NOTIFY calls only from a privileged process on the
local system.  If lockd uses an AF_INET6 listener, the sender's address
(ie the local rpc.statd) will be the IPv6 loopback address, not the
IPv4 loopback address.

Make sure the privilege test in nlmsvc_proc_sm_notify() and
nlm4svc_proc_sm_notify() works for both AF_INET and AF_INET6 family
addresses by refactoring the test into a helper and adding support for
IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
dcff09f124 lockd: change nlmclnt_grant() to take a "struct sockaddr *"
Adjust the signature and callers of nlmclnt_grant() to pass a "struct
sockaddr *" instead of a "struct sockaddr_in *" in order to support IPv6
addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
6bfbe8af46 lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
Fix up nlmsvc_lookup_host() to pass AF_INET6 source addresses to
nlm_lookup_host().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:35 -04:00
Chuck Lever
d7d204403b lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
Pass a struct sockaddr * and a length to nlmclnt_lookup_host() to
accomodate non-AF_INET family addresses.

As a side benefit, eliminate the hostname_len argument, as the hostname
is always NUL-terminated.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 17:02:34 -04:00
Tom Tucker
0d3ebb9ae9 svcrdma: Add Fast Reg MR Data Types
Add data types to track Fast Reg Memory Regions. The core data type is
svc_rdma_fastreg_mr that associates a device MR with a host kva and page
list. A field is added to the WR context to keep track of the FRMR
used to map the local memory for an RPC.

An FRMR list and spin lock are added to the transport instance to keep
track of all FRMR allocated for the transport. Also added are device
capability flags to indicate what the memory registration
capabilities are for the underlying device and whether or not fast
memory registration is supported.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-10-03 15:39:58 -05:00
J. Bruce Fields
b2b5028905 lockd: move grace period checks to common code
Do all the grace period checks in svclock.c.  This simplifies the code a
bit, and will ease some later changes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:19 -04:00
J. Bruce Fields
af558e33be nfsd: common grace period control
Rewrite grace period code to unify management of grace period across
lockd and nfsd.  The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it.  This creates a slight race condition, since
the enforcement is not coordinated.  It's also more complicated than
necessary.

Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.

We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:02 -04:00
James Bottomley
3c9f3681d0 [SCSI] lib: add generic helper to print sizes rounded to the correct SI range
This patch adds the ability to print sizes in either units of 10^3 (SI)
or 2^10 (Binary) units.  It rounds up to three significant figures and
can be used for either memory or storage capacities.

Oh, and I'm fully aware that 64 bits is only 16EiB ... the Zetta and
Yotta units are added for future proofing against the day we have 128
bit computers ...

[fujita.tomonori@lab.ntt.co.jp: fix missed unsigned long long cast]
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 11:46:14 -05:00
Paul E. McKenney
2133b5d7ff rcu: RCU-based detection of stalled CPUs for Classic RCU
This patch adds stalled-CPU detection to Classic RCU.  This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled.

This is a debugging feature to detect infinite loops in kernel code, not
something that non-kernel-hackers would be expected to care about.

This feature can detect looping CPUs in !PREEMPT builds and looping CPUs
with preemption disabled in PREEMPT builds.  This is essentially a port of
this functionality from the treercu patch, replacing the stall debug patch
that is already in tip/core/rcu (commit 67182ae1c4).

The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-03 10:36:08 +02:00
Ingo Molnar
b5259d9442 Merge commit 'v2.6.27-rc8' into core/rcu 2008-10-03 10:34:36 +02:00
Nick Piggin
4b19de6d1c mm: tiny-shmem nommu fix
The previous patch db203d53d4 ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).

However do_truncate is what caused the lock ordering error, due to it
taking i_mutex.  In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.

Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-02 15:53:13 -07:00
Marek Vašut
4e9687d9c8 [ARM] 5248/1: wm97xx generic battery driver
This patch adds generic battery driver for wm97xx chips.

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-10-02 22:48:34 +01:00
Ingo Molnar
d6d5aeb661 Merge commit 'v2.6.27-rc8' into genirq 2008-10-02 10:21:26 +02:00
KOVACS Krisztian
f5715aea45 ipv4: Implement IP_TRANSPARENT socket option
This patch introduces the IP_TRANSPARENT socket option: enabling that
will make the IPv4 routing omit the non-local source address check on
output. Setting IP_TRANSPARENT requires NET_ADMIN capability.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:30:02 -07:00
Herbert Xu
12a169e7d8 ipsec: Put dumpers on the dump list
Herbert Xu came up with the idea and the original patch to make
xfrm_state dump list contain also dumpers:

As it is we go to extraordinary lengths to ensure that states
don't go away while dumpers go to sleep.  It's much easier if
we just put the dumpers themselves on the list since they can't
go away while they're going.

I've also changed the order of addition on new states to prevent
a never-ending dump.

Timo Teräs improved the patch to apply cleanly to latest tree,
modified iteration code to be more readable by using a common
struct for entries in the list, implemented the same idea for
xfrm_policy dumping and moved the af_key specific "last" entry
caching to af_key.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:03:24 -07:00
David S. Miller
b262e60309 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath9k/core.c
	drivers/net/wireless/ath9k/main.c
	net/core/dev.c
2008-10-01 06:12:56 -07:00
Lennert Buytenhek
04a4bb55bc net: add skb_recycle_check() to enable netdriver skb recycling
This patch adds skb_recycle_check(), which can be used by a network
driver after transmitting an skb to check whether this skb can be
recycled as a receive buffer.

skb_recycle_check() checks that the skb is not shared or cloned, and
that it is linear and its head portion large enough (as determined by
the driver) to be recycled as a receive buffer.  If these conditions
are met, it does any necessary reference count dropping and cleans
up the skbuff as if it just came from __alloc_skb().

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 02:33:12 -07:00
Remi Denis-Courmont
6e50e8a213 phonet: Protect if_phonet.h against multiple inclusions.
From: Remi Denis-Courmont <remi.denis-courmont@nokia.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 01:30:19 -07:00
Paul Mundt
bbfbd8b151 sh: Move the shared INTC code out to drivers/sh/
The INTC code will be re-used across different architectures, so move
this out to drivers/sh/ and include/linux/sh_intc.h respectively.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-10-01 16:13:54 +09:00
Ingo Molnar
24268245d8 x86: add PCI IDs for AMD Barcelona PCI devices
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Cc: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-01 09:05:42 +02:00
Mathieu Desnoyers
1c50b728c3 rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched()
Add rcu_read_lock_sched() and rcu_read_unlock_sched() to rcupdate.h to match the
recently added write-side call_rcu_sched() and rcu_barrier_sched(). They also
match the no-so-recently-added synchronize_sched().

It will help following matching use of the update/read lock primitives. Those
new read lock will replace preempt_disable()/enable() used in pair with
RCU-classic synchronization.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-30 12:08:41 +02:00
Rémi Denis-Courmont
a57334e95e Phonet: declare headers
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-30 02:53:18 -07:00
Stephen Hemminger
cf04a4c764 netdev: use const for some name functions
dev_change_name and netdev_drivername should use const char on
parameters that are read-only input values. The strcpy to newname is
not needed since newname is not used later in function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-30 02:22:14 -07:00
Benny Halevy
d5b337b487 nfsd: use nfs client rpc callback program
since commit ff7d9756b5
"nfsd: use static memory for callback program and stats"
do_probe_callback uses a static callback program
(NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog
as passed in by the client in setclientid (4.0)
or create_session (4.1).

This patches introduces rpc_create_args.prognumber that allows
overriding program->number when creating rpc_clnt.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:40 -04:00
Chuck Lever
781b61a6f4 lockd: Teach nlm_cmp_addr() to support AF_INET6 addresses
Update the nlm_cmp_addr() helper to support AF_INET6 as well as AF_INET
addresses.  New version takes two "struct sockaddr *" arguments instead of
"struct sockaddr_in *" arguments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:39 -04:00
Chuck Lever
7e9d7746bf NSM: Use sockaddr_storage for sm_addr field
To store larger addresses in the nsm_handle structure, make sm_addr a
sockaddr_storage.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:39 -04:00
Chuck Lever
90151e6e4d lockd: Use sockaddr_storage for h_saddr field
To store larger addresses in the nlm_host structure, make h_saddr a
sockaddr_storage.  And let's call it something more self-explanatory:
"saddr" could easily be mistaken for "server address".

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:39 -04:00
Chuck Lever
b4ed58fd34 lockd: Use sockaddr_storage + length for h_addr field
To store larger addresses in the nlm_host structure, make h_addr a
sockaddr_storage, and add an address length field.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:39 -04:00
Chuck Lever
5344b12d4f SUNRPC: Make svc_addr's argument a constant
Clean up: Add extra type safety and squelch a few compiler complaints
in upcoming patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:38 -04:00
Chuck Lever
1b333c54a1 lockd: address-family independent printable addresses
Knowing which source address is used for communicating with remote NLM
services can be helpful for debugging configuration problems on hosts
with multiple addresses.

Keep the dprintk debugging here, but adapt it so it displays AF_INET6
addresses properly.  There are also a couple of dprintk clean-ups as
well.

At some point we will aggregate the helpers that display presentation
format addresses into a single set of shared helpers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:38 -04:00
Chuck Lever
a26cfad6e0 SUNRPC: Support IPv6 when registering kernel RPC services
In order to advertise NFS-related services on IPv6 interfaces via
rpcbind, the kernel RPC server implementation must use
rpcb_v4_register() instead of rpcb_register().

A new kernel build option allows distributions to use the legacy
v2 call until they integrate an appropriate user-space rpcbind
daemon that can support IPv6 RPC services.

I tried adding some automatic logic to fall back if registering
with a v4 protocol request failed, but there are too many corner
cases.  So I just made it a compile-time switch that distributions
can throw when they've replaced portmapper with rpcbind.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:38 -04:00
Chuck Lever
14aeb2118d SUNRPC: Simplify rpcb_register() API
Bruce suggested there's no need to expose the difference between an error
sending the PMAP_SET request and an error reply from the portmapper to
rpcb_register's callers.  The user space equivalent of rpcb_register() is
pmap_set(3), which returns a bool_t : either the PMAP set worked, or it
didn't.  Simple.

So let's remove the "*okay" argument from rpcb_register() and
rpcb_v4_register(), and simply return an error if any part of the call
didn't work.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 18:13:37 -04:00
Thomas Petazzoni
bfcd17a6c5 Configure out file locking features
This patch adds the CONFIG_FILE_LOCKING option which allows to remove
support for advisory locks. With this patch enabled, the flock()
system call, the F_GETLK, F_SETLK and F_SETLKW operations of fcntl()
and NFS support are disabled. These features are not necessarly needed
on embedded systems. It allows to save ~11 Kb of kernel code and data:

   text          data     bss     dec     hex filename
1125436        118764  212992 1457192  163c28 vmlinux.old
1114299        118564  212992 1445855  160fdf vmlinux
 -11137    -200       0  -11337   -2C49 +/-

This patch has originally been written by Matt Mackall
<mpm@selenic.com>, and is part of the Linux Tiny project.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: matthew@wil.cx
Cc: linux-fsdevel@vger.kernel.org
Cc: mpm@selenic.com
Cc: akpm@linux-foundation.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 17:56:57 -04:00
J. Bruce Fields
04716e6621 nfsd: permit unauthenticated stat of export root
RFC 2623 section 2.3.2 permits the server to bypass gss authentication
checks for certain operations that a client may perform when mounting.
In the case of a client that doesn't have some form of credentials
available to it on boot, this allows it to perform the mount unattended.
(Presumably real file access won't be needed until a user with
credentials logs in.)

Being slightly more lenient allows lots of old clients to access
krb5-only exports, with the only loss being a small amount of
information leaked about the root directory of the export.

This affects only v2 and v3; v4 still requires authentication for all
access.

Thanks to Peter Staubach testing against a Solaris client, which
suggesting addition of v3 getattr, to the list, and to Trond for noting
that doing so exposes no additional information.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
2008-09-29 17:56:56 -04:00
Chuck Lever
e851db5b05 SUNRPC: Add address family field to svc_serv data structure
Introduce and initialize an address family field in the svc_serv structure.

This field will determine what family to use for the service's listener
sockets and what families are advertised via the local rpcbind daemon.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29 17:56:56 -04:00
Thomas Gleixner
ccc7dadf73 hrtimer: prevent migration of per CPU hrtimers
Impact: per CPU hrtimers can be migrated from a dead CPU

The hrtimer code has no knowledge about per CPU timers, but we need to
prevent the migration of such timers and warn when such a timer is
active at migration time.

Explicitely mark the timers as per CPU and use a more understandable
mode descriptor for the interrupts safe unlocked callback mode, which
is used by hrtimer_sleeper and the scheduler code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-29 17:09:14 +02:00
Thomas Gleixner
b00c1a99e7 hrtimer: mark migration state
Impact: during migration active hrtimers can be seen as inactive

The migration code removes the hrtimers from the queues of the dead
CPU and sets the state temporary to INACTIVE. The enqueue code sets it
to ACTIVE/PENDING again.

Prevent that the wrong state can be seen by using a separate migration
state bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-29 17:09:14 +02:00
Richard Kennedy
6866e7bc83 libata: reorder ata_device to remove 8 bytes of padding on 64 bits
reduce size by 8 bytes from 1160 to 1152 allowing it to fit in 1 fewer
cachelines.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:28:36 -04:00
Elias Oltmanns
45fabbb77b libata: Implement disk shock protection support
On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
FEATURE as specified in ATA-7 is issued to the device and processing of
the request queue is stopped thereafter until the specified timeout
expires or user space asks to resume normal operation. This is supposed
to prevent the heads of a hard drive from accidentally crashing onto the
platter when a heavy shock is anticipated (like a falling laptop
expected to hit the floor). In fact, the whole port stops processing
commands until the timeout has expired in order to avoid any resets due
to failed commands on another device.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:27:54 -04:00
Elias Oltmanns
ea6ce53cd5 [libata] Introduce ata_id_has_unload()
Add a function to check an ATA device's id for head unload support as
specified in ATA-7.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:25:50 -04:00
Tejun Heo
b1c72916ab libata: implement slave_link
Explanation taken from the comment of ata_slave_link_init().

 In libata, a port contains links and a link contains devices.  There
 is single host link but if a PMP is attached to it, there can be
 multiple fan-out links.  On SATA, there's usually a single device
 connected to a link but PATA and SATA controllers emulating TF based
 interface can have two - master and slave.

 However, there are a few controllers which don't fit into this
 abstraction too well - SATA controllers which emulate TF interface
 with both master and slave devices but also have separate SCR
 register sets for each device.  These controllers need separate links
 for physical link handling (e.g. onlineness, link speed) but should
 be treated like a traditional M/S controller for everything else
 (e.g. command issue, softreset).

 slave_link is libata's way of handling this class of controllers
 without impacting core layer too much.  For anything other than
 physical link handling, the default host link is used for both master
 and slave.  For physical link handling, separate @ap->slave_link is
 used.  All dirty details are implemented inside libata core layer.
 From LLD's POV, the only difference is that prereset, hardreset and
 postreset are called once more for the slave link, so the reset
 sequence looks like the following.

 prereset(M) -> prereset(S) -> hardreset(M) -> hardreset(S) ->
 softreset(M) -> postreset(M) -> postreset(S)

 Note that softreset is called only for the master.  Softreset resets
 both M/S by definition, so SRST on master should handle both (the
 standard method will work just fine).

As slave_link excludes PMP support and only code paths which deal with
the attributes of physical link are affected, all the changes are
localized to libata.h, libata-core.c and libata-eh.c.

 * ata_is_host_link() updated so that slave_link is considered as host
   link too.

 * iterator extended to iterate over the slave_link when using the
   underbarred version.

 * force param handling updated such that devno 16 is mapped to the
   slave link/device.

 * ata_link_on/offline() updated to return the combined result from
   master and slave link.  ata_phys_link_on/offline() are the direct
   versions.

 * EH autopsy and report are performed separately for master slave
   links.  Reset is udpated to implement the above described reset
   sequence.

Except for reset update, most changes are minor, many of them just
modifying dev->link to ata_dev_phys_link(dev) or using phys online
test instead.

After this update, LLDs can take full advantage of per-dev SCR
registers by simply turning on slave link.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:25:28 -04:00
Tejun Heo
b5b3fa386b libata: misc updates to prepare for slave link
* Add ATA_EH_ALL_ACTIONS.

* Make sata_link_{on|off}_line() return bool instead of int.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:22:32 -04:00
Tejun Heo
aadffb682c libata: reimplement link iterator
Implement __ata_port_next_link() and reimplement
__ata_port_for_each_link() and ata_port_for_each_link() using it.
This removes relatively large inlined code and makes iteration easier
to extend.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:22:30 -04:00
Tejun Heo
82ef04fb4c libata: make SCR access ops per-link
Logically, SCR access ops should take @link; however, there was no
compelling reason to convert all SCR access ops when adding @link
abstraction as there's one-to-one mapping between a port and a non-PMP
link.  However, that assumption won't hold anymore with the scheduled
addition of slave link.

Make SCR access ops per-link.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-29 00:22:28 -04:00
Frank Mayhar
7086efe1c1 timers: fix itimer/many thread hang, v3
- fix UP lockup
- another set of UP/SMP cleanups and simplifications

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-27 20:04:45 +02:00
Alan Cox
e416de5e61 Export the ROM enable/disable helpers
.... so that they can be used by MTD map drivers. Lets us close #9420

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-09-26 18:59:05 -06:00
Suresh Siddha
379daf6290 IO resources, x86: ioremap sanity check to catch mapping requests exceeding the BAR sizes
Go through the iomem resource tree to check if any of the ioremap()
requests span more than any slot in the iomem resource tree and do
a WARN_ON() if we hit this check.

This will raise a red-flag, if some driver is mapping more than what
is needed. And hopefully identify possible corruptions much earlier.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-26 09:42:20 +02:00
David S. Miller
db4148da2c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-09-25 13:16:16 -07:00
Eric Miao
ff7a4c7130 [ARM] corgi_lcd: use GPIO API for BACKLIGHT_ON and BACKLIGHT_CONT
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-25 09:38:14 +01:00
Jeff Garzik
ae19161e28 Merge branch 'for-2.6.28' of git://git.marvell.com/mv643xx_eth into upstream-next 2008-09-24 20:40:52 -04:00
David Howells
b4f151ff89 MN10300: Move asm-arm/cnt32_to_63.h to include/linux/
Move asm-arm/cnt32_to_63.h to include/linux/ so that MN10300 can make
use of it too.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-24 16:38:17 -07:00
Dhananjay Phadke
040dec3b37 netxen: add pci ids
Define old and new pci vendor and device ids.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-24 18:54:46 -04:00
YanBo
79617deeeb mac80211: mesh portal functionality support
Currently the mesh code doesn't support bridging mesh point interfaces
with wired ethernet or AP to construct an MPP or MAP. This patch adds
code to support the "6 address frame format packet" functionality to
mesh point interfaces. Now the mesh network can be used as backhaul
for end to end communication.

Signed-off-by: Li YanBo <dreamfly281@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-24 16:18:02 -04:00
Roman Zippel
d40e944c25 ntp: improve adjtimex frequency rounding
Change PPM_SCALE_INV_SHIFT so that it doesn't throw away any input bits
(19 is the amount of the factor 2 in PPM_SCALE), the output frequency
can then be calculated back to its input value, as the inverse divide
produce a slightly larger value, which is then correctly rounded by the
final shift.

Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-24 17:33:13 +02:00
Richard Kennedy
1b02469088 hrtimer: reorder struct hrtimer to save 8 bytes on 64bit builds
reorder struct hrtimer to save 8 bytes on 64 bit builds when
CONFIG_TIMER_STATS selected.  (also removes 8 bytes from signal_struct)

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-24 15:58:47 +02:00
Oleg Nesterov
5a9fa73072 posix-timers: kill ->it_sigev_signo and ->it_sigev_value
With the recent changes ->it_sigev_signo and ->it_sigev_value are only
used in sys_timer_create(), kill them.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: mingo@elte.hu
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-24 15:45:48 +02:00
Ingo Molnar
07bbc16a86 Merge branch 'timers/urgent' into x86/xen
Conflicts:
	arch/x86/kernel/process_32.c
	arch/x86/kernel/process_64.c

Manual merge:

	arch/x86/kernel/smpboot.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-23 23:26:42 +02:00
Eric Miao
bfdcaa3b68 lcd: add corgibl_limit_intensity() to corgi_lcd
This is not generic enough, added here for backward compatibility.
And make this an individual commit so future revert will be a bit
easier.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-23 22:04:31 +01:00
Eric Miao
b18250a8f6 lcd: add SPI-based LCD and backlight driver for SHARP corgi/spitz
The driver is based on different source files including corgi_ssp.c,
corgi_lcd.c and corgi_bl.c, previously authored by Richard Purdie
and many others.

The LCD and Backlight device actually share the same SPI device, so
they are made into this single driver.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-23 22:04:30 +01:00
Eric Miao
faa312da9c lcd: allow lcd device to handle mode change events
Some LCD panels are capable of different resolutions, and is allowed
to change at run-time, so to make "struct lcd_device" to be able to
handle mode change events here.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-23 22:01:33 +01:00
Linus Torvalds
e002bcc2f8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: fix compiler warnings in pci_get_subsys()
  PCI: Fix pcie_aspm=force
2008-09-23 12:15:50 -07:00
Kirill A. Shutemov
c32a162fd4 smb.h: do not include linux/time.h in userspace
linux/time.h conflicts with time.h from glibc

It breaks building smbmount from samba.  It's regression introduced by
commit 76308da (" smb.h: uses struct timespec but didn't include
linux/time.h").

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: <stable@kernel.org>             [2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-23 08:09:13 -07:00
Ingo Molnar
63e5c39859 Merge branches 'sched/urgent' and 'sched/rt' into sched/devel 2008-09-23 16:23:05 +02:00
Frank Mayhar
bb34d92f64 timers: fix itimer/many thread hang, v2
This is the second resubmission of the posix timer rework patch, posted
a few days ago.

This includes the changes from the previous resubmittion, which addressed
Oleg Nesterov's comments, removing the RCU stuff from the patch and
un-inlining the thread_group_cputime() function for SMP.

In addition, per Ingo Molnar it simplifies the UP code, consolidating much
of it with the SMP version and depending on lower-level SMP/UP handling to
take care of the differences.

It also cleans up some UP compile errors, moves the scheduler stats-related
macros into kernel/sched_stats.h, cleans up a merge error in
kernel/fork.c and has a few other minor fixes and cleanups as suggested
by Oleg and Ingo. Thanks for the review, guys.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-23 13:38:44 +02:00
David S. Miller
1164f52a24 net: Add skb_queue_walk_from() and skb_queue_walk_from_safe().
These will be used by TCP write queue handling and elsewhere.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 00:49:44 -07:00
David S. Miller
249c8b42c7 net: Add skb_queue_next().
A lot of code wants to iterate over an SKB queue at the top level using
it's own control structure and iterator scheme.

Provide skb_queue_next(), which is only valid to invoke if
skb_queue_is_last() returns false.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 00:44:42 -07:00
David S. Miller
fc7ebb212d net: Add skb_queue_is_last().
Several bits of code want to know "is this the last SKB in
a queue", and all of them implement this by hand.

Provide an common interface to make this check.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 00:34:07 -07:00
David S. Miller
1d4a31dde9 net: Fix bus in SKB queue splicing interfaces.
Handle the case of head being non-empty, by adding list->qlen
to head->qlen instead of using direct assignment.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 21:57:21 -07:00
Stephen Hemminger
0b815a1a6d net: network device name ifalias support
This patch add support for keeping an additional character alias
associated with an network interface. This is useful for maintaining
the SNMP ifAlias value which is a user defined value. Routers use this
to hold information like which circuit or line it is connected to. It
is just an arbitrary text label on the network device.

There are two exposed interfaces with this patch, the value can be
read/written either via netlink or sysfs.

This could be maintained just by the snmp daemon, but it is more
generally useful for other management tools, and the kernel is good
place to act as an agreed upon interface to store it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 21:28:11 -07:00
Remi Denis-Courmont
be0c52bfed Phonet: emit errors when a packet cannot be delivered locally
When there is no listener socket for a received packet, send an error
back to the sender.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:09:13 -07:00
Remi Denis-Courmont
5f77076d75 Phonet: provide MAC header operations
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:08:04 -07:00
Remi Denis-Courmont
ba113a94b7 Phonet: common socket glue
This provides the socket API for the Phonet protocols family.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:05:19 -07:00
Remi Denis-Courmont
bce7b15426 Phonet: global definitions
Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 19:51:15 -07:00
Herbert Xu
5c1824587f ipsec: Fix xfrm_state_walk race
As discovered by Timo Teräs, the currently xfrm_state_walk scheme
is racy because if a second dump finishes before the first, we
may free xfrm states that the first dump would walk over later.

This patch fixes this by storing the dumps in a list in order
to calculate the correct completion counter which cures this
problem.

I've expanded netlink_cb in order to accomodate the extra state
related to this.  It shouldn't be a big deal since netlink_cb
is kmalloced for each dump and we're just increasing it by 4 or
8 bytes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 19:48:19 -07:00
Thomas Renninger
a0ad05c75a Introduce FW_BUG, FW_WARN and FW_INFO to consistenly tell users about BIOS bugs
The idea is to add this to printk after the severity:
printk(KERN_ERR FW_BUG "This is not our fault, BIOS developer: fix it by
simply add ...\n");

If a Firmware issue should be hidden, because it is
work-arounded, but you still want to see something popping up e.g.
for info only:
printk(KERN_INFO FW_INFO "This is done stupid, we can handle it,
but it should better be avoided in future\n");

or on the Linuxfirmwarekit to tell vendors that they did something
stupid or wrong without bothering the user:
printk(KERN_INFO FW_BUG "This is done stupid, we can handle it,
but it should better be avoided in future\n");

Some use cases:
  - If a user sees a [Firmware Bug] message in the kernel
    he should first update the BIOS before wasting time with
    debugging and submiting on old firmware code to mailing
    lists.

  - The linuxfirmwarekit (http://www.linuxfirmwarekit.org)
    tries to detect firmware bugs. It currently is doing that
    in userspace which results in:
        - Huge test scripts that could be a one liner in the kernel
        - A lot of BIOS bugs are already absorbed by the kernel

What do we need such a stupid linuxfirmwarekit for?

  - Vendors: Can test their BIOSes for Linux compatibility.
    There will be the time when vendors realize that the test utils
    on Linux are more strict and using them increases the qualitity
    and stability of their products.

  - Vendors: Can easily fix up their BIOSes and be more Linux
    compatible by:
    dmesg |grep "Firmware Bug"
    and send the result to their BIOS developer colleagues who should
    know what the messages are about and how to fix them, without
    the need of studying kernel code.

  - Distributions: can do a first automated HW/BIOS checks.
    This can then be done without the need of asking kernel developers
    who need to dig down the code and explain the details.
    Certification can/will just be rejected until
    dmesg |grep "Firmware Bug" is empty.

  - Thus this can be used as an instrument to enforce cleaner BIOS
    code. Currently every stupid Windows ACPI bug is
    re-implemented in Linux which is a rather unfortunate situation.
    We already have the power to avoid this in e.g. memory
    or cpu hot-plug ACPI implementations, because Linux certification
    is a must for most vendors in the server area.
    Working towards being able to do that in the laptop area
    (vendors are starting to look at Linux here also and will use this tool)
    is the goal. At least provide them a tool to make it as easy
    for this guys (e.g. not needing to browse kernel code) as possible.

  - The ordinary Linux user: can go into the next shop, boots the
    firmwarekit on his most preferred machines. He chooses one without
    BIOS bugs. Unsupported HW is ok, he likes to try out latest projects
    which might support them or likes to dig on it on his own, but he
    hates to workaround broken BIOSes like hell.

I double checked with the firmwarekit.
There they have:
So the mapping generally is (also depending on how likely the BIOS is
to blame, this could sometimes be difficult):
FW_INFO  = INFO
FW_WARN  = WARN
FW_BUG   = FAIL

For more info about the linuxfirmwarekit and why this is needed
can be found here:
http://www.linuxfirmwarekit.org

While severity matches with the firmwarekit, it might be tricky
to hide messages from the user.
E.g. we recently found out that on HP BIOSes negative temperatures
are returned, which seem to indicate that the thermal zone is
invalid.
We can work around that gracefully by ignoring the thermal zone
and we do not want to bother the ordinary user with a frightening
message: Firmware Bug: thermal management absolutely broken
but want to hide it from the user.

But in the linuxfirmwarekit this should be shown as a real
show stopper (the temperatures could really be wrong,
broken thermal management is one of the worst things
that can happen and the BIOS guys of the machine must
implement this properly).

It is intended to do that (hide it from the user with
KERN_INFO msg, but still print it as a BIOS bug) by:
printk(KERN_INFO FW_BUG "Negativ temperature values detected.
Try to workarounded, BIOS must get fixed\n");
Hope that works out..., no idea how to better hide it
as printk is the only way to easily provide this functionality.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-09-22 18:42:51 -04:00
FUJITA Tomonori
d26dbc5cf9 iommu: export iommu_area_reserve helper function
x86 has set_bit_string() that does the exact same thing that
set_bit_area() in lib/iommu-helper.c does.

This patch exports set_bit_area() in lib/iommu-helper.c as
iommu_area_reserve(), converts GART, Calgary, and AMD IOMMU to use it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 16:47:50 +02:00
Peter Zijlstra
15afe09bf4 sched: wakeup preempt when small overlap
Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.

The difference seems to come from different preemption agressiveness,
which affects the cache footprint of the workload and its effective
cache trashing.

Aggresively preempt a task if its avg overlap is very small, this should
avoid the task going to sleep and find it still running when we schedule
back to it - saving a wakeup.

Reported-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 16:28:32 +02:00
Mark McLoughlin
b91c4996df hrtimer: remove hrtimer_clock_base::reprogram()
hrtimer_clock_base::reprogram() also appears to never
have been used, so remove it.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 13:52:42 +02:00
Mark McLoughlin
d7cfb60c5c hrtimer: remove hrtimer_clock_base::get_softirq_time()
Peter Zijlstra noticed this 8 months ago and I just noticed
it again.

hrtimer_clock_base::get_softirq_time() is currently unused
in the entire tree. In fact, looking at the logs, it appears
as if it was never used. Remove it.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 13:52:42 +02:00
David S. Miller
38783e6713 isdn: isdn_ppp: Use SKB list facilities instead of home-grown implementation.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 01:15:02 -07:00
Steven Whitehouse
6d80c39f91 GFS2: Add UUID to GFS2 sb
This patch adds a UUID to the GFS2 sb structure. This field is not
actually referenced from kernel space at all, but is added for
completeness and due to the userland tools which get their on-disk
structure information from the gfs2_ondisk.h header file.

Since we have to be backwards compatible, we will assume that any GFS2
sb for which the UUID is all 0 does not have a UUID as such.

We should then be (after some userland changes) able to support the -U
mount option. This addresses Fedora bugzilla #242689

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-22 07:29:31 +01:00
David S. Miller
67fed45930 net: Add new interfaces for SKB list light-weight init and splicing.
This will be used by subsequent changesets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-21 22:36:24 -07:00
James Morris
ab2b49518e Merge branch 'master' into next
Conflicts:

	MAINTAINERS

Thanks for breaking my tree :-)

Signed-off-by: James Morris <jmorris@namei.org>
2008-09-21 17:41:56 -07:00
Ilpo Järvinen
0e1c54c2a4 tcp: reorganize retransmit code loops
Both loops are quite similar, so they can be combined
with little effort. As a result, forward_skb_hint becomes
obsolete as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:24:21 -07:00
Ilpo Järvinen
006f582c73 tcp: convert retransmit_cnt_hint to seqno
Main benefit in this is that we can then freely point
the retransmit_skb_hint to anywhere we want to because
there's no longer need to know what would be the count
changes involve, and since this is really used only as a
terminator, unnecessary work is one time walk at most,
and if some retransmissions are necessary after that
point later on, the walk is not full waste of time
anyway.

Since retransmit_high must be kept valid, all lost
markers must ensure that.

Now I also have learned how those "holes" in the
rexmittable skbs can appear, mtu probe does them. So
I removed the misleading comment as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:20:20 -07:00
Linus Torvalds
5a0cd4eb66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop()
  RDMA/nes: Fix client side QP destroy
  IB/mlx4: Fix up fast register page list format
  mlx4_core: Set RAE and init mtt_sz field in FRMR MPT entries
2008-09-19 16:18:21 -07:00
David S. Miller
d950f264ff Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-09-19 16:17:12 -07:00
FUJITA Tomonori
07a2c01a0c convert swiotlb to use dma_get_mask
swiotlb can use dma_get_mask() instead of the homegrown function.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 10:20:41 +02:00
Lennert Buytenhek
4fd5f812c2 phylib: allow incremental scanning of an mii bus
This patch splits the bus scanning code in mdiobus_register() off
into a separate function, and makes this function available for
calling from external code.  This allows incrementally scanning an
mii bus, e.g. as information about which addresses are 'safe' to
scan becomes available.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Acked-by: Andy Fleming <afleming@freescale.com>
2008-09-19 05:13:54 +02:00
Scott Feldman
01f2e4ead2 enic: add Cisco 10G Ethernet NIC driver
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-18 11:34:53 -04:00
Chris Snook
452c1ce218 atl2: add atl2 driver
Driver for Atheros L2 10/100 network device. Includes necessary
changes for Kconfig, Makefile, and pci_ids.h.

Signed-off-by: Chris Snook <csnook@redhat.com>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-18 11:34:52 -04:00
Eric Miao
6ae19b04ab Input: ads7846 - introduce .gpio_pendown to get pendown state
The GPIO connected to ADS7846 nPENIRQ signal is usually used to get
the pendown state as well. Introduce a .gpio_pendown, and use this
to decide the pendown state if .get_pendown_state is NULL.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-09-17 17:33:37 +01:00
David Vrabel
b60066c141 uwb: add symlinks in sysfs between radio controllers and PALs
Add a facility for PALs to have symlinks to their radio controller
(and vice-versa) and make WUSB host controllers use this.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:35 +01:00
Inaky Perez-Gonzalez
c7f736484f wusb: add the Wireless USB include files.
Common header files derived from the WUSB 1.0 specification.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:29 +01:00
David Vrabel
8f1b678ab9 uwb: add the driver to enumerate WHCI capabilities
This enumerates the capabilties of a WHCI device, adding a umc device for
each one.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:26 +01:00
David Vrabel
da389eac31 uwb: add the umc bus
The UMC bus is used for the capabilities exposed by a UWB Multi-interface
Controller as described in the WHCI specification.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:25 +01:00
Inaky Perez-Gonzalez
34e95e41f1 uwb: add the uwb include files
Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:23 +01:00
David Vrabel
ccbe329bcd bitmap: add bitmap_copy_le()
bitmap_copy_le() copies a bitmap, putting the bits into little-endian
order (i.e., each unsigned long word in the bitmap is put into
little-endian order).

The UWB stack used bitmaps to manage Medium Access Slot availability,
and these bitmaps need to be written to the hardware in LE order.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
2008-09-17 16:54:22 +01:00
David Miller
ef3d7714f6 Fix PNP build failure, bugzilla #11276
This fill fix the following regression list entry:

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11276
Subject		: build error: CONFIG_OPTIMIZE_INLINING=y causes gcc 4.2 to do stupid things
Submitter	: Randy Dunlap <randy.dunlap@oracle.com>
Date		: 2008-08-06 17:18 (38 days old)
References	: http://marc.info/?l=linux-kernel&m=121804329014332&w=4
		  http://lkml.org/lkml/2008/7/22/353
Handled-By	: Bjorn Helgaas <bjorn.helgaas@hp.com>
Patch		: http://lkml.org/lkml/2008/7/22/364

with what I believe is a better fix than the one referenced
in the regression entry above.

These PNP header interfaces try to work in such a way that
you can reference some of them even if PNP is not enabled,
and the compiler was expected to optimize everything away.

Which is mostly fine, except that there was one interface
for which there was not provided an inline "NOP" implementation.

Once we add that, all of these compile failures cannot handle
any more.

pnp: Provide NOP inline implementation of pnp_get_resource() when !PNP

Fixes kernel bugzilla #11276.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-16 19:35:05 -07:00
Greg KH
b08508c40a PCI: fix compiler warnings in pci_get_subsys()
pci_get_subsys() changed in 2.6.26 so that the from pointer is modified
when the call is being invoked, so fix up the 'const' marking of it that
the compiler is complaining about.

Reported-by: Rufus & Azrael <rufus-azrael@numericable.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-09-16 15:52:08 -07:00
David S. Miller
2e57572a50 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Conflicts:

	arch/sparc64/kernel/pci_psycho.c
2008-09-16 14:11:43 -07:00
Ingo Molnar
e3bbaa3cb6 Merge commit 'v2.6.27-rc6' into x86/memory-corruption-check 2008-09-16 09:34:23 +02:00
Vladimir Sokolovsky
29bdc88384 IB/mlx4: Fix up fast register page list format
Byte swap the addresses in the page list for fast register work requests
to big endian to match what the HCA expectx.  Also, the addresses must
have the "present" bit set so that the HCA knows it can access them.
Otherwise the HCA will fault the first time it accesses the memory
region.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-09-15 14:25:23 -07:00
Luis R. Rodriguez
b2e1b30290 cfg80211: Add new wireless regulatory infrastructure
This adds the new wireless regulatory infrastructure. The
main motiviation behind this was to centralize regulatory
code as each driver was implementing their own regulatory solution,
and to replace the initial centralized code we have where:

* only 3 regulatory domains are supported: US, JP and EU
* regulatory domains can only be changed through module parameter
* all rules were built statically in the kernel

We now have support for regulatory domains for many countries
and regulatory domains are now queried through a userspace agent
through udev allowing distributions to update regulatory rules
without updating the kernel.

Each driver can regulatory_hint() a regulatory domain
based on either their EEPROM mapped regulatory domain value to a
respective ISO/IEC 3166-1 country code or pass an internally built
regulatory domain. We also add support to let the user set the
regulatory domain through userspace in case of faulty EEPROMs to
further help compliance.

Support for world roaming will be added soon for cards capable of
this.

For more information see:

http://wireless.kernel.org/en/developers/Regulatory/CRDA

For now we leave an option to enable the old module parameter,
ieee80211_regdom, and to build the 3 old regdomains statically
(US, JP and EU). This option is CONFIG_WIRELESS_OLD_REGULATORY.
These old static definitions and the module parameter is being
scheduled for removal for 2.6.29. Note that if you use this
you won't make use of a world regulatory domain as its pointless.
If you leave this option enabled and if CRDA is present and you
use US or JP we will try to ask CRDA to update us a regulatory
domain for us.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:19 -04:00
Ingo Molnar
83bd6998b0 Merge commit 'v2.6.27-rc6' into timers/hpet 2008-09-14 18:24:00 +02:00
Jeremy Fitzhardinge
8308c54d7e generic: redefine resource_size_t as phys_addr_t
There's no good reason why a resource_size_t shouldn't just be a
physical address, so simply redefine it in terms of phys_addr_t.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 17:24:27 +02:00
Jeremy Fitzhardinge
947d0496cf generic: make PFN_PHYS explicitly return phys_addr_t
PFN_PHYS, as its name suggests, turns a pfn into a physical address.
However, it is a macro which just operates on its argument without
modifying its type.  pfns are typed unsigned long, but an unsigned
long may not be long enough to hold a physical address (32-bit systems
with more than 32 bits of physcial address).

Make sure we cast to phys_addr_t to return a complete result.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 17:24:26 +02:00
Jeremy Fitzhardinge
600715dcdf generic: add phys_addr_t for holding physical addresses
Add a kernel-wide "phys_addr_t" which is guaranteed to be able to hold
any physical address.  By default it equals the word size of the
architecture, but a 32-bit architecture can set ARCH_PHYS_ADDR_T_64BIT
if it needs a 64-bit phys_addr_t.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 17:24:25 +02:00
Ingo Molnar
9dfed08eb4 Merge commit 'v2.6.27-rc6' into core/resources 2008-09-14 17:23:29 +02:00
Ingo Molnar
5ce73a4a5a timers: fix itimer/many thread hang, cleanups
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 17:11:46 +02:00
Ingo Molnar
0a8eaa4f9b timers: fix itimer/many thread hang, fix #2
fix the UP build:

In file included from arch/x86/kernel/asm-offsets_32.c:9,
                 from arch/x86/kernel/asm-offsets.c:3:
include/linux/sched.h: In function ‘thread_group_cputime_clone_thread’:
include/linux/sched.h:2272: warning: no return statement in function returning non-void
include/linux/sched.h: In function ‘thread_group_cputime_account_user’:
include/linux/sched.h:2284: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h:2284: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h: In function ‘thread_group_cputime_account_system’:
include/linux/sched.h:2291: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h:2291: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
include/linux/sched.h: In function ‘thread_group_cputime_account_exec_runtime’:
include/linux/sched.h:2298: error: invalid type argument of ‘->’ (have ‘struct task_cputime’)
distcc[14501] ERROR: compile arch/x86/kernel/asm-offsets.c on a/30 failed
make[1]: *** [arch/x86/kernel/asm-offsets.s] Error 1

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 17:02:43 +02:00
FUJITA Tomonori
589fc9a6e2 iommu: add dma_get_mask helper function
Several IOMMUs do the same thing to get the dma_mask of a device. This
adds a helper function to do the same thing to sweep them.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 16:42:37 +02:00
FUJITA Tomonori
eecfffc154 iommu: add iommu_device_max_index IOMMU helper function
This function helps IOMMUs to know the highest address that a device
can access to.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 16:42:36 +02:00
Frank Mayhar
f06febc96b timers: fix itimer/many thread hang
Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads.  It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."

Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine.  (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.)  To do this, at each tick we now update fields in
signal_struct as well as task_struct.  The run_posix_cpu_timers() function
uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
	cputime_t utime;
	cputime_t stime;
	unsigned long long sum_exec_runtime;
};

This is included in the structure "thread_group_cputime," which is a new
substructure of signal_struct and which varies for uniprocessor versus
multiprocessor kernels.  For uniprocessor kernels, it uses "task_cputime" as
a simple substructure, while for multiprocessor kernels it is a pointer:

struct thread_group_cputime {
	struct task_cputime totals;
};

struct thread_group_cputime {
	struct task_cputime *totals;
};

We also add a new task_cputime substructure directly to signal_struct, to
cache the earliest expiration of process-wide timers, and task_cputime also
replaces the it_*_expires fields of task_struct (used for earliest expiration
of thread timers).  The "thread_group_cputime" structure contains process-wide
timers that are updated via account_user_time() and friends.  In the non-SMP
case the structure is a simple aggregator; unfortunately in the SMP case that
simplicity was not achievable due to cache-line contention between CPUs (in
one measured case performance was actually _worse_ on a 16-cpu system than
the same test on a 4-cpu system, due to this contention).  For SMP, the
thread_group_cputime counters are maintained as a per-cpu structure allocated
using alloc_percpu().  The timer functions update only the timer field in
the structure corresponding to the running CPU, obtained using per_cpu_ptr().

We define a set of inline functions in sched.h that we use to maintain the
thread_group_cputime structure and hide the differences between UP and SMP
implementations from the rest of the kernel.  The thread_group_cputime_init()
function initializes the thread_group_cputime structure for the given task.
The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the
out-of-line function thread_group_cputime_alloc_smp() to allocate and fill
in the per-cpu structures and fields.  The thread_group_cputime_free()
function, also a no-op for UP, in SMP frees the per-cpu structures.  The
thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls
thread_group_cputime_alloc() if the per-cpu structures haven't yet been
allocated.  The thread_group_cputime() function fills the task_cputime
structure it is passed with the contents of the thread_group_cputime fields;
in UP it's that simple but in SMP it must also safely check that tsk->signal
is non-NULL (if it is it just uses the appropriate fields of task_struct) and,
if so, sums the per-cpu values for each online CPU.  Finally, the three
functions account_group_user_time(), account_group_system_time() and
account_group_exec_runtime() are used by timer functions to update the
respective fields of the thread_group_cputime structure.

Non-SMP operation is trivial and will not be mentioned further.

The per-cpu structure is always allocated when a task creates its first new
thread, via a call to thread_group_cputime_clone_thread() from copy_signal().
It is freed at process exit via a call to thread_group_cputime_free() from
cleanup_signal().

All functions that formerly summed utime/stime/sum_sched_runtime values from
from all threads in the thread group now use thread_group_cputime() to
snapshot the values in the thread_group_cputime structure or the values in
the task structure itself if the per-cpu structure hasn't been allocated.

Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit.
The run_posix_cpu_timers() function has been split into a fast path and a
slow path; the former safely checks whether there are any expired thread
timers and, if not, just returns, while the slow path does the heavy lifting.
With the dedicated thread group fields, timers are no longer "rebalanced" and
the process_timer_rebalance() function and related code has gone away.  All
summing loops are gone and all code that used them now uses the
thread_group_cputime() inline.  When process-wide timers are set, the new
task_cputime structure in signal_struct is used to cache the earliest
expiration; this is checked in the fast path.

Performance

The fix appears not to add significant overhead to existing operations.  It
generally performs the same as the current code except in two cases, one in
which it performs slightly worse (Case 5 below) and one in which it performs
very significantly better (Case 2 below).  Overall it's a wash except in those
two cases.

I've since done somewhat more involved testing on a dual-core Opteron system.

Case 1: With no itimer running, for a test with 100,000 threads, the fixed
	kernel took 1428.5 seconds, 513 seconds more than the unfixed system,
	all of which was spent in the system.  There were twice as many
	voluntary context switches with the fix as without it.

Case 2: With an itimer running at .01 second ticks and 4000 threads (the most
	an unmodified kernel can handle), the fixed kernel ran the test in
	eight percent of the time (5.8 seconds as opposed to 70 seconds) and
	had better tick accuracy (.012 seconds per tick as opposed to .023
	seconds per tick).

Case 3: A 4000-thread test with an initial timer tick of .01 second and an
	interval of 10,000 seconds (i.e. a timer that ticks only once) had
	very nearly the same performance in both cases:  6.3 seconds elapsed
	for the fixed kernel versus 5.5 seconds for the unfixed kernel.

With fewer threads (eight in these tests), the Case 1 test ran in essentially
the same time on both the modified and unmodified kernels (5.2 seconds versus
5.8 seconds).  The Case 2 test ran in about the same time as well, 5.9 seconds
versus 5.4 seconds but again with much better tick accuracy, .013 seconds per
tick versus .025 seconds per tick for the unmodified kernel.

Since the fix affected the rlimit code, I also tested soft and hard CPU limits.

Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer
	running), the modified kernel was very slightly favored in that while
	it killed the process in 19.997 seconds of CPU time (5.002 seconds of
	wall time), only .003 seconds of that was system time, the rest was
	user time.  The unmodified kernel killed the process in 20.001 seconds
	of CPU (5.014 seconds of wall time) of which .016 seconds was system
	time.  Really, though, the results were too close to call.  The results
	were essentially the same with no itimer running.

Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds
	(where the hard limit would never be reached) and an itimer running,
	the modified kernel exhibited worse tick accuracy than the unmodified
	kernel: .050 seconds/tick versus .028 seconds/tick.  Otherwise,
	performance was almost indistinguishable.  With no itimer running this
	test exhibited virtually identical behavior and times in both cases.

In times past I did some limited performance testing.  those results are below.

On a four-cpu Opteron system without this fix, a sixteen-thread test executed
in 3569.991 seconds, of which user was 3568.435s and system was 1.556s.  On
the same system with the fix, user and elapsed time were about the same, but
system time dropped to 0.007 seconds.  Performance with eight, four and one
thread were comparable.  Interestingly, the timer ticks with the fix seemed
more accurate:  The sixteen-thread test with the fix received 149543 ticks
for 0.024 seconds per tick, while the same test without the fix received 58720
for 0.061 seconds per tick.  Both cases were configured for an interval of
0.01 seconds.  Again, the other tests were comparable.  Each thread in this
test computed the primes up to 25,000,000.

I also did a test with a large number of threads, 100,000 threads, which is
impossible without the fix.  In this case each thread computed the primes only
up to 10,000 (to make the runtime manageable).  System time dominated, at
1546.968 seconds out of a total 2176.906 seconds (giving a user time of
629.938s).  It received 147651 ticks for 0.015 seconds per tick, still quite
accurate.  There is obviously no comparable test without the fix.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 16:25:35 +02:00
Ingo Molnar
6e03f99803 Merge branch 'linus' into x86/iommu
Conflicts:
	lib/swiotlb.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 14:07:00 +02:00
Linus Torvalds
7c22a3d853 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] LBA28/LBA48 off-by-one bug in ata.h
  sata_inic162x: enable LED blinking
  ata: duplicate variable sparse warning
2008-09-13 14:48:14 -07:00
Alex Dubov
8e82f8c34b memstick: fix MSProHG 8-bit interface mode support
- 8-bit interface mode never worked properly.  The only adapter I have
  which supports the 8b mode (the Jmicron) had some problems with its
  clock wiring and they discovered it only now.  We also discovered that
  ProHG media is more sensitive to the ordering of initialization
  commands.

- Make the driver fall back to highest supported mode instead of always
  falling back to serial.  The driver will attempt the switch to 8b mode
  for any new MSPro card, but not all of them support it.  Previously,
  these new cards ended up in serial mode, which is not the best idea
  (they work fine with 4b, after all).

- Edit some macros for better conformance to Sony documentation

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-13 14:41:52 -07:00
Mel Gorman
5bead2a068 mm: mark the correct zone as full when scanning zonelists
The iterator for_each_zone_zonelist() uses a struct zoneref *z cursor when
scanning zonelists to keep track of where in the zonelist it is.  The
zoneref that is returned corresponds to the the next zone that is to be
scanned, not the current one.  It was intended to be treated as an opaque
list.

When the page allocator is scanning a zonelist, it marks elements in the
zonelist corresponding to zones that are temporarily full.  As the
zonelist is being updated, it uses the cursor here;

  if (NUMA_BUILD)
        zlc_mark_zone_full(zonelist, z);

This is intended to prevent rescanning in the near future but the zoneref
cursor does not correspond to the zone that has been found to be full.
This is an easy misunderstanding to make so this patch corrects the
problem by changing zoneref cursor to be the current zone being scanned
instead of the next one.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-13 14:41:52 -07:00
Hiroshi DOYU
dea420ce0e include/linux/ioport.h: add missing macro argument for devm_release_* family
akpm: these have no callers at this time, but they shall soon, so let's
get them right.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-13 14:41:50 -07:00
Taisuke Yamada
97b697a11b [libata] LBA28/LBA48 off-by-one bug in ata.h
I recently bought 3 HGST P7K500-series 500GB SATA drives and
had trouble accessing the block right on the LBA28-LBA48 border.
Here's how it fails (same for all 3 drives):

  # dd if=/dev/sdc bs=512 count=1 skip=268435455 > /dev/null
  dd: reading `/dev/sdc': Input/output error
  0+0 records in
  0+0 records out
  0 bytes (0 B) copied, 0.288033 seconds, 0.0 kB/s
  # dmesg
  ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
  ata1.00: BMDMA stat 0x25
  ata1.00: cmd c8/00:08:f8:ff:ff/00:00:00:00:00/ef tag 0 dma 4096 in
  res 51/04:08:f8:ff:ff/00:00:00:00:00/ef Emask 0x1 (device error)
  ata1.00: status: { DRDY ERR }
  ata1.00: error: { ABRT }
  ata1.00: configured for UDMA/33
  ata1: EH complete
  ...

After some investigations, it turned out this seems to be caused
by misinterpretation of the ATA specification on LBA28 access.
Following part is the code in question:

  === include/linux/ata.h ===
  static inline int lba_28_ok(u64 block, u32 n_block)
  {
    /* check the ending block number */
    return ((block + n_block - 1) < ((u64)1 << 28)) && (n_block <= 256);
  }

HGST drive (sometimes) fails with LBA28 access of {block = 0xfffffff,
n_block = 1}, and this behavior seems to be comformant. Other drives,
including other HGST drives are not that strict, through.

>From the ATA specification:
(http://www.t13.org/Documents/UploadedDocuments/project/d1410r3b-ATA-ATAPI-6.pdf)

  8.15.29  Word (61:60): Total number of user addressable sectors
  This field contains a value that is one greater than the total number
  of user addressable sectors (see 6.2). The maximum value that shall
  be placed in this field is 0FFFFFFFh.

So the driver shouldn't use the value of 0xfffffff for LBA28 request
as this exceeds maximum user addressable sector. The logical maximum
value for LBA28 is 0xffffffe.

The obvious fix is to cut "- 1" part, and the patch attached just do
that. I've been using the patched kernel for about a month now, and
the same fix is also floating on the net for some time. So I believe
this fix works reliably.

Just FYI, many Windows/Intel platform users also seems to be struck
by this, and HGST has issued a note pointing to Intel ICH8/9 driver.

  "28-bit LBA command is being used to access LBAs 29-bits in length"
http://www.hitachigst.com/hddt/knowtree.nsf/cffe836ed7c12018862565b000530c74/b531b8bce8745fb78825740f00580e23

Also, *BSDs seems to have similar fix included sometime around ~2004,
through I have not checked out exact portion of the code.

Signed-off-by: Taisuke Yamada <tai@rakugaki.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-09-13 16:46:15 -04:00
Alexander Duyck
ca9b0e27e0 pkt_action: add new action skbedit
This new action will have the ability to change the priority and/or
queue_mapping fields on an sk_buff.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 16:30:20 -07:00
Alexander Duyck
92651940ab pkt_sched: Add multiqueue scheduler support
This patch is intended to add a qdisc to support the new tx multiqueue
architecture by providing a band for each hardware queue.  By doing
this it is possible to support a different qdisc per physical hardware
queue.

This qdisc uses the skb->queue_mapping to select which band to place
the traffic onto.  It then uses a round robin w/ a check to see if the
subqueue is stopped to determine which band to dequeue the packet from.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 16:29:34 -07:00
David S. Miller
c655705037 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-09-11 15:46:02 -07:00
Johannes Berg
44d414dbff mac80211: move some HT code out of mlme.c
Some of the HT code in mlme.c is misplaced:
 * constants/definitions belong to the ieee80211.h header
 * code being used in other modes as well shouldn't be there

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-11 15:53:37 -04:00
Tomas Winkler
00c5ae2fa0 mac80211: change MIMO_PS to SM_PS
This patch follows 11n spec naming more rigorously replacing MIMO_PS
with SM_PS (Spatial Multiplexing Power Save).

(Originally submitted as 4 patches, "mac80211: change MIMO_PS to SM_PS",
"iwlwifi: change MIMO_PS to SM_PS", "ath9k: change MIMO_PS to SM_PS",
and "iwlwifi: remove double definition of SM PS". -- JWL)

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-11 15:53:31 -04:00
Arjan van de Ven
2e94d1f71f hrtimer: peek at the timer queue just before going idle
As part of going idle, we already look at the time of the next timer event to determine
which C-state to select etc.

This patch adds functionality that causes the timers that are past their
soft expire time, to fire at this time, before we calculate the next wakeup
time. This functionality will thus avoid wakeups by running timers before
going idle rather than specially waking up for it.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-11 07:17:49 -07:00
Jens Axboe
2dc75d3c3b block: disable sysfs parts of the disk command filter
We still have life time issues with the sysfs command filter kobject,
so disable it for 2.6.27 release. We can revisit this and make it work
properly for 2.6.28, for 2.6.27 release it's too risky.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-09-11 14:20:23 +02:00
David S. Miller
a40c24a133 net: Add SKB DMA mapping helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 04:51:14 -07:00
David S. Miller
271bff7afb net: Add DMA mapping tokens to skb_shared_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 04:48:58 -07:00
Ingo Molnar
09b22a2f67 Merge commit 'v2.6.27-rc6' into sched/devel 2008-09-11 13:37:28 +02:00
Eric Miao
4d5975e501 Input: ads7846 - introduce .gpio_pendown to get pendown state
The GPIO connected to ADS7846 nPENIRQ signal is usually used to get
the pendown state as well. Introduce a .gpio_pendown, and use this
to decide the pendown state if .get_pendown_state is NULL.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-09-10 12:13:28 -04:00
Ingo Molnar
3ce9bcb583 Merge branch 'core/xen' into x86/xen 2008-09-10 14:05:45 +02:00
Jeremy Fitzhardinge
f7d0b926ac mm: define USE_SPLIT_PTLOCKS rather than repeating expression
Define USE_SPLIT_PTLOCKS as a constant expression rather than repeating
"NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS" all over the place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 14:04:59 +02:00
Ingo Molnar
59c37bf892 Merge commit 'v2.6.27-rc6' into x86/unify-cpu-detect
Conflicts:
	arch/x86/kernel/cpu/amd.c
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/cpu/common_64.c
	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 14:00:45 +02:00
FUJITA Tomonori
636dc67cbf add is_buffer_dma_capable helper function
is_buffer_dma_capable helper function is to see if a memory region is
DMA-capable or not. The arugments are the dma_mask (or
coherent_dma_mask) of a device and the address and size of a memory
region.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 11:33:43 +02:00
Ingo Molnar
e92b4fdacc Merge commit 'v2.6.27-rc6' into x86/iommu 2008-09-10 11:32:52 +02:00
Ingo Molnar
429b022af4 Merge commit 'v2.6.27-rc6' into core/rcu 2008-09-10 08:35:40 +02:00
Marc Zyngier
fb683f1627 Export smc91x led definitions
Now that we can configure smc91x leds from its platform data,
it seems rather useful to move the led definitions to the
externally visible header file.

Signed-off-by: Marc Zyngier <marc.zyngier@altran.com>
2008-09-09 17:41:42 +02:00
Gerrit Renker
410e27a49b This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"
as it accentally contained the wrong set of patches. These will be
submitted separately.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-09 13:27:22 +02:00
David S. Miller
0a68a20cc3 Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp
Conflicts:

	net/dccp/input.c
	net/dccp/options.c
2008-09-08 17:28:59 -07:00
David S. Miller
17dce5dfe3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	net/mac80211/mlme.c
2008-09-08 16:59:05 -07:00
Linus Torvalds
e1d7bf1499 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: arch_reinit_sched_domains() must destroy domains to force rebuild
  sched, cpuset: rework sched domains and CPU hotplug handling (v4)
2008-09-08 15:47:21 -07:00
Jeremy Fitzhardinge
9c3254ad42 x86: fix compile error with corruption checking disabled
Fix compile error:

 arch/x86/mm/init_32.c: In function 'mem_init':
 arch/x86/mm/init_32.c:908: error: implicit declaration of function 'start_periodic_check_for_corruption'

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 20:03:06 +02:00
Manfred Spraul
e545a6140b kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
Right now, there is no notifier that is called on a new cpu, before the new
cpu begins processing interrupts/softirqs.
Various kernel function would need that notification, e.g. kvm works around
by calling smp_call_function_single(), rcu polls cpu_online_map.

The patch adds a CPU_STARTING notification. It also adds a helper function
that sends the message to all cpu_chain handlers.

Tested on x86-64.
All other archs are untested. Especially on sparc, I'm not sure if I got
it right.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 19:25:24 +02:00
Theodore Ts'o
05496769e5 jbd2: clean up how the journal device name is printed
Calculate the journal device name once and stash it away in the
journal_s structure.  This avoids needing to call bdevname()
everywhere and reduces stack usage by not needing to allocate an
on-stack buffer.  In addition, we eliminate the '/' that can appear in
device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that
can cause problems when creating proc directory names, and include the
inode number to support ocfs2 which creates multiple journals with
different inode numbers.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-16 14:36:17 -04:00
Arjan van de Ven
4ce105d30e hrtimer: incorporate feedback from Peter Zijlstra
(based on  lkml review)
* use rt_task()
* task_nice() has a sign

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-07 15:31:39 -07:00
Arjan van de Ven
da8f2e170e hrtimer: add a hrtimer_start_range() function
this patch adds a _range version of hrtimer_start() so that range timers
can be created; the hrtimer_start() function is just a wrapper around this.

In addition, hrtimer_start_expires() will now preserve existing ranges.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-07 10:58:01 -07:00
Russell King
b0dbcf511c [NET] smc91x: provide configurable leds
This patch provides a mechanism for platforms to be able to supply the
LED configuration via platform data, rather than having to hard code
it in smc91x.h.

Acked-by: Eric Miao <eric.y.miao@gmail.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-07 17:32:58 +01:00
Hugh Dickins
bb577f980e x86: add periodic corruption check
Perodically check for corruption in low phusical memory.  Don't bother
checking at fault time, since it won't show anything useful.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-07 17:40:00 +02:00
Jeremy Fitzhardinge
5394f80f92 x86: check for and defend against BIOS memory corruption
Some BIOSes have been observed to corrupt memory in the low 64k.  This
change:
 - Reserves all memory which does not have to be in that area, to
   prevent it from being used as general memory by the kernel.  Things
   like the SMP trampoline are still in the memory, however.
 - Clears the reserved memory so we can observe changes to it.
 - Adds a function check_for_bios_corruption() which checks and reports on
   memory becoming unexpectedly non-zero.  Currently it's called in the
   x86 fault handler, and the powermanagement debug output.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-07 17:39:59 +02:00
Linus Torvalds
f532522565 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  clocksource, acpi_pm.c: check for monotonicity
  clocksource, acpi_pm.c: use proper read function also in errata mode
  ntp: fix calculation of the next jiffie to trigger RTC sync
  x86: HPET: read back compare register before reading counter
  x86: HPET fix moronic 32/64bit thinko
  clockevents: broadcast fixup possible waiters
  HPET: make minimum reprogramming delta useful
  clockevents: prevent endless loop lockup
  clockevents: prevent multiple init/shutdown
  clockevents: enforce reprogram in oneshot setup
  clockevents: prevent endless loop in periodic broadcast handler
  clockevents: prevent clockevent event_handler ending up handler_noop
2008-09-06 19:33:26 -07:00
Ingo Molnar
291c54ff76 Merge branch 'sched/cpuset' into sched/urgent 2008-09-06 21:03:16 +02:00
Alexey Dobriyan
978b0116cd softirq: allocate less vectors
We don't need whole 32 of them, only NR_SOFTIRQS.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:04:36 +02:00
Max Krasnyansky
dfb512ec48 sched: arch_reinit_sched_domains() must destroy domains to force rebuild
What I realized recently is that calling rebuild_sched_domains() in
arch_reinit_sched_domains() by itself is not enough when cpusets are enabled.
partition_sched_domains() code is trying to avoid unnecessary domain rebuilds
and will not actually rebuild anything if new domain masks match the old ones.

What this means is that doing
     echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
on a system with cpusets enabled will not take affect untill something changes
in the cpuset setup (ie new sets created or deleted).

This patch fixes restore correct behaviour where domains must be rebuilt in
order to enable MC powersaving flags.

Test on quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.
Also tested on dual-core Core2 laptop. Lockdep is happy and things are working
as expected.

Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 19:22:15 +02:00
Arjan van de Ven
2ec02270c0 hrtimer: another build fix
More randconfig testing

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-06 09:36:56 -07:00
Arjan van de Ven
584fb4a764 hrtimer: fix build bug found by Ingo
in some randconfig configurations, hrtimers are used even though
the hrtimer config if off; and it broke the build due to some of
the new functions being on the wrong side of the ifdef.

This patch moves the functions to the other side of the ifdef, fixing
the build bug.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-06 08:32:57 -07:00
Ingo Molnar
7f79d852ed Merge branch 'linus' into sched/devel 2008-09-06 16:51:57 +02:00
Ingo Molnar
77dd3b3bd2 Merge branch 'linus' into timers/ntp 2008-09-06 15:31:03 +02:00
Arjan van de Ven
6976675d94 hrtimer: create a "timer_slack" field in the task struct
We want to be able to control the default "rounding" that is used by
select() and poll() and friends. This is a per process property
(so that we can have a "nice" like program to start certain programs with
a looser or stricter rounding) that can be set/get via a prctl().

For this purpose, a field called "timer_slack_ns" is added to the task
struct. In addition, a field called "default_timer_slack"ns" is added
so that tasks easily can temporarily to a more/less accurate slack and then
back to the default.

The default value of the slack is set to 50 usec; this is significantly less
than 2.6.27's average select() and poll() timing error but still allows
the kernel to group timers somewhat to preserve power behavior. Applications
and admins can override this via the prctl()

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:35:30 -07:00
Arjan van de Ven
654c8e0b1c hrtimer: turn hrtimers into range timers
this patch turns hrtimers into range timers; they have 2 expire points
1) the soft expire point
2) the hard expire point

the kernel will do it's regular best effort attempt to get the timer run
at the hard expire point. However, if some other time fires after the soft
expire point, the kernel now has the freedom to fire this timer at this point,
and thus grouping the events and preventing a power-expensive wakeup in the
future.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:35:27 -07:00
Arjan van de Ven
799b64de25 hrtimer: rename the "expires" struct member to avoid accidental usage
To catch code that still touches the "expires" memory directly, rename it
to have the compiler complain rather than get nasty, hard to explain,
runtime behavior

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:35:25 -07:00
Arjan van de Ven
63ca243b27 hrtimer: add abstraction functions for accessing the "expires" member
In order to be able to turn hrtimers into range based, we need to provide
accessor functions for getting to the "expires" ktime_t member of the
struct hrtimer.

This patch adds a set of accessors for this purpose:
* hrtimer_set_expires
* hrtimer_set_expires_tv64
* hrtimer_add_expires
* hrtimer_add_expires_ns
* hrtimer_get_expires
* hrtimer_get_expires_tv64
* hrtimer_get_expires_ns
* hrtimer_expires_remaining
* hrtimer_start_expires

No users of these new accessors are added yet; these follow in later patches.
Hopefully this patch can even go into 2.6.27-rc so that the conversions will
not have a bottleneck in -next

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:35:05 -07:00
Arjan van de Ven
8ff3e8e85f select: switch select() and poll() over to hrtimers
With lots of help, input and cleanups from Thomas Gleixner

This patch switches select() and poll() over to hrtimers.

The core of the patch is replacing the "s64 timeout" with a
"struct timespec end_time" in all the plumbing.

But most of the diffstat comes from using the just introduced helpers:
	poll_select_set_timeout
	poll_select_copy_remaining
	timespec_add_safe
which make manipulating the timespec easier and less error-prone.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-05 21:35:03 -07:00
Thomas Gleixner
be5dad20a5 select: add a poll specific struct to the restart_block union
with hrtimer poll/select, the signal restart data no longer is a single
long representing a jiffies count, but it becomes a second/nanosecond pair
that also needs to encode if there was a timeout at all or not.

This patch adds a struct to the restart_block union for this purpose

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:35:01 -07:00
Thomas Gleixner
b773ad40ac select: add poll_select_set_timeout() and poll_select_copy_remaining() helpers
This patch adds 2 helpers that will be used for the hrtimer based select/poll:

poll_select_set_timeout() is a helper that takes a timeout (as a second, nanosecond
pair) and turns that into a "struct timespec" that represents the absolute end time.
This is a common operation in the many select() and poll() variants and needs various,
common, sanity checks.

poll_select_copy_remaining() is a helper that takes care of copying the remaining
time to userspace, as select(), pselect() and ppoll() do. This function comes in
both a natural and a compat implementation (due to datastructure differences).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:34:59 -07:00
Thomas Gleixner
df0cc0539b select: add a timespec_add_safe() function
For the select() rework, it's important to be able to add timespec
structures in an overflow-safe manner.

This patch adds a timespec_add_safe() function for this which is similar in
operation to ktime_add_safe(), but works on a struct timespec.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-09-05 21:34:57 -07:00
Arjan van de Ven
7bb67439bf select: Introduce a hrtimeout function
This patch adds a schedule_hrtimeout() function, to be used by select() and
poll() in a later patch. This function works similar to schedule_timeout()
in most ways, but takes a timespec rather than jiffies.

With a lot of contributions/fixes from Thomas

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-05 21:34:53 -07:00
Roland McGrath
22f30168d2 tracehook: comment pasto fixes
Fix some pasto's in comments in the new linux/tracehook.h and
asm-generic/syscall.h files.

Reported-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-05 14:39:38 -07:00
Li Zefan
11d55d2cba res_counter: fix off-by-one bug in setting limit
I found we can no longer set limit to 0 with 2.6.27-rcX:
 # mount -t cgroup -omemory xxx /mnt
 # mkdir /mnt/0
 # echo 0 > /mnt/0/memory.limit_in_bytes
 bash: echo: write error: Device or resource busy

It turned out 'limit' can't be set to 'usage', which is wrong IMO.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-05 14:39:37 -07:00
Linus Torvalds
7f621861fb Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: fix process time monotonicity
  sched_clock: fix NOHZ interaction
2008-09-05 14:37:15 -07:00
Linus Torvalds
6f74b1849b Merge git://git.infradead.org/~dwmw2/dwmw2-2.6.27
* git://git.infradead.org/~dwmw2/dwmw2-2.6.27:
  Revert "[ARM] use the new byteorder headers"
  Fix conditional export of kvh.h and a.out.h to userspace.
  [MTD] [NAND] tmio_nand: fix base address programming
2008-09-05 14:31:54 -07:00
Linus Torvalds
14408c4f41 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (98 commits)
  V4L/DVB (8881): gspca: After 'while (retry--) {...}', retry will be -1 but not 0.
  V4L/DVB (8880): PATCH: Fix parents on some webcam drivers
  V4L/DVB (8877): b2c2 and bt8xx: udelay to mdelay
  V4L/DVB (8876): budget: udelay changed to mdelay
  V4L/DVB (8874): gspca: Adjust hstart for sn9c103/ov7630 and update usb-id's.
  V4L/DVB (8873): gspca: Bad image offset with rev012a of spca561 and adjust exposure.
  V4L/DVB (8872): gspca: Bad image format and offset with rev072a of spca561.
  V4L/DVB (8870): gspca: Fix dark room problem with sonixb.
  V4L/DVB (8869): gspca: Move the Sonix webcams with TAS5110C1B from sn9c102 to gspca.
  V4L/DVB (8868): gspca: Support for vga modes with sif sensors in sonixb.
  V4L/DVB (8844): dabusb_fpga_download(): fix a memory leak
  V4L/DVB (8843): tda10048_firmware_upload(): fix a memory leak
  V4L/DVB (8842): vivi_release(): fix use-after-free
  V4L/DVB (8840): dib0700: add basic support for Hauppauge Nova-TD-500 (84xxx)
  V4L/DVB (8839): dib0700: add comment to identify 35th USB id pair
  V4L/DVB (8837): dvb: fix I2C adapters name size
  V4L/DVB (8835): gspca: Same pixfmt as the sn9c102 driver and raw Bayer added in sonixb.
  V4L/DVB (8834): gspca: Have a bigger buffer for sn9c10x compressed images.
  V4L/DVB (8833): gspca: Cleanup the sonixb code.
  V4L/DVB (8832): gspca: Bad pixelformat of vc0321 webcams.
  ...
2008-09-05 14:29:50 -07:00
Linus Torvalds
54e2a3270f Merge branch 'core/debugobjects' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core/debugobjects' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: fix lockdep warning
2008-09-05 14:28:19 -07:00
Luis R. Rodriguez
f59ac04816 cfg80211: keep track of supported interface modes
It is obviously good for userspace to know up front which
interface modes a given piece of hardware might support (even
if adding such an interface might fail later because of
concurrency issues), so let's make cfg80211 aware of that.
For good measure, disallow adding interfaces in all other
modes so drivers don't forget to announce support for one mode
when they add it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stephen Blackheath <tramp.enshrine.stephen@blacksapphire.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-05 16:17:42 -04:00
Joerg Roedel
ca1af29a73 x86, pci: add northbridge pci ids for fam 0x11 processors
The PCI device ids for AMD family 0x11 processors are missing in pci_ids.h.
This patch adds them.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 19:11:43 +02:00
Balbir Singh
49048622ea sched: fix process time monotonicity
Spencer reported a problem where utime and stime were going negative despite
the fixes in commit b27f03d4bd. The suspected
reason for the problem is that signal_struct maintains it's own utime and
stime (of exited tasks), these are not updated using the new task_utime()
routine, hence sig->utime can go backwards and cause the same problem
to occur (sig->utime, adds tsk->utime and not task_utime()). This patch
fixes the problem

TODO: using max(task->prev_utime, derived utime) works for now, but a more
generic solution is to implement cputime_max() and use the cputime_gt()
function for comparison.

Reported-by: spencer@bluehost.com
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 18:14:35 +02:00
Khem Raj
afbc8d8e72 Fix conditional export of kvh.h and a.out.h to userspace.
Some architectures have moved the asm/ into arch/ and some have not.
This patch checks for a.out.h and kvh.h in both places before exporting
the corresponding file from linux/

[dwmw2: simplified a little]
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-09-05 15:44:31 +01:00
Venkatesh Pallipadi
7c1e768974 clockevents: prevent clockevent event_handler ending up handler_noop
There is a ordering related problem with clockevents code, due to which
clockevents_register_device() called after tickless/highres switch
will not work. The new clockevent ends up with clockevents_handle_noop as
event handler, resulting in no timer activity.

The problematic path seems to be

* old device already has hrtimer_interrupt as the event_handler
* new clockevent device registers with a higher rating
* tick_check_new_device() is called
  * clockevents_exchange_device() gets called
    * old->event_handler is set to clockevents_handle_noop
  * tick_setup_device() is called for the new device
    * which sets new->event_handler using the old->event_handler which is noop.

Change the ordering so that new device inherits the proper handler.

This does not have any issue in normal case as most likely all the clockevent
devices are setup before the highres switch. But, can potentially be affecting
some corner case where HPET force detect happens after the highres switch.
This was a problem with HPET in MSI mode code that we have been experimenting
with.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 11:11:51 +02:00
Ingo Molnar
d3d0ba7b8f Merge commit '63cc8c75156462d4b42cbdd76c293b7eee7ddbfe':
"percpu: introduce DEFINE_PER_CPU_PAGE_ALIGNED() macro"

into x86/core

Conflicts:
	arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:24:30 +02:00
Lennert Buytenhek
ac840605f3 mv643xx_eth: remove force_phy_addr field
Currently, there are two different fields in the
mv643xx_eth_platform_data struct that together describe the PHY
address -- one field (phy_addr) has the address of the PHY, but if
that address is zero, a second field (force_phy_addr) needs to be
set to distinguish the actual address zero from a zero due to not
having filled in the PHY address explicitly (which should mean
'use the default PHY address').

If we are a bit smarter about the encoding of the phy_addr field,
we can avoid the need for a second field -- this patch does that.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05 06:33:59 +02:00
Lennert Buytenhek
fc0eb9f226 mv643xx_eth: smi sharing is a per-unit property, not a per-port one
Which top-level unit's SMI interface to use should be a property of
the top-level unit, not of the individual ports.  This patch moves the
->shared_smi pointer from the per-port platform data to the global
platform data.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05 06:33:58 +02:00
Lennert Buytenhek
f7981c1c67 mv643xx_eth: require contiguous receive and transmit queue numbering
Simplify receive and transmit queue handling by requiring the set
of queue numbers to be contiguous starting from zero.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05 06:33:58 +02:00
Mauro Carvalho Chehab
db0a2901a3 Merge branch 'fixes_stg' of ../git_old into fixes 2008-09-04 16:24:02 -03:00
Ingo Molnar
8040d77688 Merge branch 'core/resources' into x86/core 2008-09-04 21:04:04 +02:00
Yinghai Lu
268364a0f4 IO resources: add reserve_region_with_split()
add reserve_region_with_split() to not lose e820 reserved entries if
they overlap with existing IO regions:

with test case by extend 0xe0000000 - 0xeffffff to 0xdd800000 -
we get:
	e0000000-efffffff : PCI MMCONFIG 0
		 e0000000-efffffff : reserved

and in /proc/iomem we get:
	found conflict for reserved [dd800000, efffffff], try to reserve with split
	    __reserve_region_with_split: (PCI Bus #80) [dd000000, ddffffff], res: (reserved) [dd800000, efffffff]
	    __reserve_region_with_split: (PCI Bus #00) [de000000, dfffffff], res: (reserved) [de000000, efffffff]
	initcall pci_subsys_init+0x0/0x121 returned 0 after 381 msecs
in dmesg

various fixes and improvements suggested by Linus.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:02:44 +02:00
H. Peter Anvin
0ccd8c39bc Merge branch 'linus' into x86/core 2008-09-04 08:09:09 -07:00
H. Peter Anvin
7203781c98 Merge branch 'x86/cpu' into x86/core
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 08:08:42 -07:00
David Woodhouse
aa7a7fb399 Define and use PCI_DEVICE_ID_MARVELL_88ALP01_CCIC for CAFÉ camera driver
Also, stop looking at the NAND controller (0x4100) and checking the
device class. For a while during development, all three functions on the
chip had the same ID. We made them fix that fairly promptly, and we can
forget about it now.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
2008-09-04 09:46:00 +01:00
David Woodhouse
514fca4373 [MTD] [NAND] Define and use PCI_DEVICE_ID_MARVELL_88ALP01_NAND for CAFÉ
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-09-04 09:45:38 +01:00
David Woodhouse
8c5eb88058 Use PCI_DEVICE_ID_88ALP01 for CAFÉ chip, rather than PCI_DEVICE_ID_CAFE.
Probably better to use the official designation.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
2008-09-04 09:45:34 +01:00
Tomasz Grobelny
d6da3511d6 dccp: Policy-based packet dequeueing infrastructure
This patch adds a generic infrastructure for policy-based dequeueing of 
TX packets and provides two policies:
 * a simple FIFO policy (which is the default) and
 * a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options). 

The priority policy uses skb->priority internally to assign an u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb->priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3), the patch also provides the requisite parsing routines.

Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:39 +02:00
Gerrit Renker
e7937772d7 dccp: Extend CCID packet dequeueing interface
This extends the packet dequeuing interface of dccp_write_xmit() to allow
 1. CCIDs to take care of timing when the next packet may be sent;
 2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).

The main purpose is to take CCID2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).
The interface can also be used to support other CCIDs.

The mode of operation for (2) is as follows:
 * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
 * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full), 
 * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
 * dccp_write_xmit() returns without further action;
 * after some time the wait-condition for CCID becomes true,
 * that CCID schedules the tasklet,
 * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
 * since the wait-condition is now true, ccid_hc_tx_packet() returns "send now",
 * packet is sent, and possibly more (since dccp_write_xmit() loops).

Code reuse: the taskled function calls dccp_write_xmit(), the timer function
            reduces to a wrapper around the same code.

If the tasklet finds that the socket is locked, it re-schedules the tasklet
function (not the tasklet) after one jiffy.

Changed DCCP_BUG to dccp_pr_debug when transmit_skb returns an error (e.g. when a
local qdisc is used, NET_XMIT_DROP=1 can be returned for many packets).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:38 +02:00
Gerrit Renker
c2f42077bd dccp ccid-2: Schedule Sync as out-of-band mechanism
The problem with Ack Vectors is that 

  i) their length is variable and can in principle grow quite large,
 ii) it is hard to predict exactly how large they will be.

Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.

The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS (previous patch), and to defer any larger Ack Vectors onto
a separate Sync - but only if indeed there is no space left on the skb.

This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.

It can thus serve other parts of the DCCP code as well.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:37 +02:00
Gerrit Renker
f10ecaee6d dccp: Replace magic CCID-specific numbers by symbolic constants
The constants DCCPO_{MIN,MAX}_CCID_SPECIFIC are nowhere used in the code, but
instead for the CCID-specific options numbers are used.

This patch unifies the use of CCID-specific option numbers, by adding symbolic
names reflecting the definitions in RFC 4340, 10.3.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:34 +02:00
Gerrit Renker
0a4822679d dccp: Initialisation and type-checking of feature sysctls
This patch takes care of initialising and type-checking sysctls related to
feature negotiation. Type checking is important since some of the sysctls
now directly act on the feature-negotiation process.

The sysctls are initialised with the known default values for each feature.
For the type-checking the value constraints from RFC 4340 are used:

 * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes),
   tested and confirmed that it works up to 4294967295 - for Gbps speed;
 * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer);
 * CCIDs are between 0 .. 255;
 * request_retries, retries1, retries2 also between 0..255 for good measure;
 * tx_qlen is checked to be non-negative;
 * sync_ratelimit remains as before.

Further changes:
----------------
Performed s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:32 +02:00
Gerrit Renker
51c7d4fa26 dccp: Implement both feature-local and feature-remote Sequence Window feature
This adds full support for local/remote Sequence Window feature, from which the 
  * sequence-number-validity (W) and 
  * acknowledgment-number-validity (W') windows 
derive as specified in RFC 4340, 7.5.3. 

Specifically, the following changes are introduced:
  * integrated new socket fields into dccp_sk;
  * updated the update_gsr/gss routines with regard to these fields;
  * updated handler code: the Sequence Window feature is located at the TX side,
    so the local feature is meant if the handler-rx flag is false;
  * the initialisation of `rcv_wnd' in reqsk is removed, since
    - rcv_wnd is not used by the code anywhere;
    - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
    - dccp_check_req checks the Ack number validity more rigorously;
  * the `struct dccp_minisock' became empty and is now removed.

Until the handshake completes with activating negotiated values, the local/remote
Sequence-Window values are undefined and thus can not reliably be estimated.
This issue is addressed in a separate patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:32 +02:00
Gerrit Renker
5d3dac267a dccp: Initialisation framework for feature negotiation
This initialises feature negotiation from two tables, which are initialised
from sysctls. 

As a novel feature, specifics of the implementation (e.g. currently short
seqnos and ECN are not supported) are advertised for robustness.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
b235dc4abb dccp ccid-2: Phase out the use of boolean Ack Vector sysctl
This removes the use of the sysctl and the minisock variable for the Send Ack
Vector feature, which is now handled fully dynamically via feature negotiation;
i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
RFC 4341, 4.).

Using a sysctl in parallel to this implementation would open the door to
crashes, since much of the code relies on tests of the boolean minisock /
sysctl variable. Thus, this patch replaces all tests of type

	if (dccp_msk(sk)->dccpms_send_ack_vector)
		/* ... */
with
	if (dp->dccps_hc_rx_ackvec != NULL)
		/* ... */

The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
negotiation concluded that Ack Vectors are to be used on the half-connection.
Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
so that the test is a valid one.

The activation handler for Ack Vectors is called as soon as the feature
negotiation has concluded at the
 * server when the Ack marking the transition RESPOND => OPEN arrives;
 * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.

Adding the sequence number of the Response packet to the Ack Vector has been 
removed, since
 (a) connection establishment implies that the Response has been received;
 (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
     this entry will always be ignored;
 (c) it can not be used for anything useful - to detect loss for instance, only
     packets received after the loss can serve as pseudo-dupacks.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
68e074bfce dccp: Remove manual influence on NDP Count feature
Updating the NDP count feature is handled automatically now:
 * for CCID-2 it is disabled, since the code does not use NDP counts;
 * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths.

Allowing the user to change NDP values leads to unpredictable and failing
behaviour, since it is then possible to disable NDP counts even when they
are needed (e.g. in CCID-3).

This means that only those user settings are sensible that agree with the
values for Send NDP Count implied by the choice of CCID. But those settings
are already activated by the feature negotiation (CCID dependency tracking),
hence this form of support is redundant.

At startup the initialisation of the NDP count feature is with the default
value of 0, which is done implicitly by the zeroing-out of the socket when
it is allocated. If the choice of CCID or feature negotiation enables NDP
count, this will then be updated via the NDP activation handler.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
78673e24df dccp: Remove obsolete parts of the old CCID interface
The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector
case, their value equals initially that of the sysctl, but at the end of
feature negotiation may be something different.

The old interface removed by this patch thus has been replaced by the newer
interface to dynamically query the currently loaded CCIDs earlier in this
patch set.

Also removed the constructors for the TX CCID and the RX CCID, since the
switch rx/non-rx is done by the handler in minisocks.c (and the handler is
the only place in the code where CCIDs are loaded).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:31 +02:00
Gerrit Renker
fade756f18 dccp: Set per-connection CCIDs via socket options
With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which
overrides the defaults set by the global sysctl variables for TX/RX CCIDs.

To make full use of this facility, the remaining patches of this patch set are
needed, which track dependencies and activate negotiated feature values.

Note on the maximum number of CCIDs that can be registered:
-----------------------------------------------------------
The maximum number of CCIDs that can be registered on the socket is constrained
by the space in a Confirm/Change feature negotiation option. 

The space in these in turn depends on the size of header options as defined
in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from
ackvec.h into linux/dccp.h, clarifying its purpose.

Relative to this size, the maximum number of CCID identifiers that can be 
present in a Confirm option (which always consumes 1 byte more than a Change
option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the
CCID-feature-type and one for the selected value.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:28 +02:00
Gerrit Renker
17c30b40ed dccp: Deprecate Ack Ratio sysctl
This patch deprecates the Ack Ratio sysctl, since
 * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
 * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
 * even if it would work in CCID-2, there is no point for a user to change it:
   - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
   - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts 
     (since waiting for Acks which will never arrive in this window),
   - cwnd is not a user-configurable value.	

The only reasonable place for Ack Ratio is to print it for debugging. It is
planned to do this later on, as part of e.g. dccp_probe.

With this patch Ack Ratio is now under full control of feature negotiation:
 * Ack Ratio is resolved as a dependency of the selected CCID;
 * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
   the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
   Ratio 2 for both endpoints";
 * what happens then is part of another patch set, since it concerns the 
   dynamic update of Ack Ratio while the connection is in full flight.

Thanks to Tomasz Grobelny for discussion leading up to this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2008-09-04 07:45:28 +02:00
Gerrit Renker
20f41eee82 dccp: Feature negotiation for minimum-checksum-coverage
This provides feature negotiation for server minimum checksum coverage
which so far has been missing.

Since sender/receiver coverage values range only from 0...15, their
type has also been reduced in size from u16 to u4.

Feature-negotiation options are now generated for both sender and receiver
coverage, i.e. when the peer has `forgotten' to enable partial coverage
then feature negotiation will automatically enable (negotiate) the partial
coverage value for this connection.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:27 +02:00
Gerrit Renker
668144f7b4 dccp: Deprecate old setsockopt framework
The previous setsockopt interface, which passed socket options via struct 
dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
ugly code since the old approach did not distinguish between NN and SP values.

This patch removes the old setsockopt interface and replaces it with two new
functions to register NN/SP values for feature negotiation. These are 
essentially wrappers around the internal __feat_register functions, with 
checking added to avoid
 * wrong usage (type);
 * changing values while the connection is in progress.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:27 +02:00
Gerrit Renker
71bb49596b dccp: Query supported CCIDs
This provides a data structure to record which CCIDs are locally supported
and three accessor functions:
 - a test function for internal use which is used to validate CCID requests
   made by the user;
 - a copy function so that the list can be used for feature-negotiation;   
 - documented getsockopt() support so that the user can query capabilities.

The data structure is a table which is filled in at compile-time with the
list of available CCIDs (which in turn depends on the Kconfig choices).

Using the copy function for cloning the list of supported CCIDs is useful for
feature negotiation, since the negotiation is now with the full list of available
CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
will not fail if the peer requests to use CCID3 instead of CCID2. 

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:27 +02:00
Gerrit Renker
828755cee0 dccp: Per-socket initialisation of feature negotiation
This provides feature-negotiation initialisation for both DCCP sockets and
DCCP request_sockets, to support feature negotiation during connection setup.

It also resolves a FIXME regarding the congestion control initialisation.

Thanks to Wei Yongjun for help with the IPv6 side of this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:26 +02:00
Gerrit Renker
b4eec20637 dccp: Implement lookup table for feature-negotiation information
A lookup table for feature-negotiation information, extracted from RFC 4340/42,
is provided by this patch. All currently known features can be found in this 
table, along with their feature location, their default value, and type.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 07:45:26 +02:00
Krzysztof Helt
64151ad5b3 rtc-m48t59: allow externally mapped ioaddr
Add support for externally mapped ioaddr.  This is required on sparc32
as the ioaddr must be mapped with of_ioremap().

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-03 15:41:57 -07:00
Krzysztof Helt
94fe7424a4 rtc-m48t59: add support for M48T02 and M48T59 chips
Add support for two compatible RTC:
- M48T08 which does not have alarm part,
- M48T08 which does not have alarm part and has
  only 2KB of NVRAM

These types covers all Mostek's RTC used in Sun UltraSparc workstations.

Tested on Sun Ultra60 with M48T59 RTC.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-03 15:39:11 -07:00
Jean-Francois Moine
20122542df V4L/DVB (8832): gspca: Bad pixelformat of vc0321 webcams.
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2008-09-03 18:37:45 -03:00
Hans de Goede
d6db35e89c V4L/DVB (8809): gspca: Revert commit 9a9335776548d01525141c6e8f0c12e86bbde982
the previous patch (sensor upside down).

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2008-09-03 18:37:22 -03:00
Hans de Goede
89a44b8a69 V4L/DVB (8720): gspca: V4L2_CAP_SENSOR_UPSIDE_DOWN added as a cap for some webcams.
This patch adds a V4L2_CAP_SENSOR_UPSIDE_DOWN flag to the capabilities flags,
and sets this flag for the Philips SPC200NC cam (which has its sensor installed
upside down). The same flag is also needed and added for the Philips SPC300NC.

Together with a patch to libv4l which adds flipping the image in software this
fixes the upside down display with the SPC200NC cam.

Signed-of-by: Hans de Goede <j.w.r.degoede@hhs.nl>

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2008-09-03 18:37:00 -03:00
Jean-Francois Moine
f75c4950bb V4L/DVB (8675): gspca: Pixmap PJPG (Pixart 73xx JPEG) added, generated by pac7311.
The JPEG frames generated by the Pixart 73xx have:
- special markers 'ff ff ff xx' every 1024/512 bytes,
- unused 8 bits at end of JPEG blocks,
and then ask for a new pixel format.

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-09-03 18:36:39 -03:00
Linus Torvalds
d26acd92fa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipsec: Fix deadlock in xfrm_state management.
  ipv: Re-enable IP when MTU > 68
  net/xfrm: Use an IS_ERR test rather than a NULL test
  ath9: Fix ath_rx_flush_tid() for IRQs disabled kernel warning message.
  ath9k: Incorrect key used when group and pairwise ciphers are different.
  rt2x00: Compiler warning unmasked by fix of BUILD_BUG_ON
  mac80211: Fix debugfs union misuse and pointer corruption
  wireless/libertas/if_cs.c: fix memory leaks
  orinoco: Multicast to the specified addresses
  iwlwifi: fix 64bit platform firmware loading
  iwlwifi: fix apm_stop (wrong bit polarity for FLAG_INIT_DONE)
  iwlwifi: workaround interrupt handling no some platforms
  iwlwifi: do not use GFP_DMA in iwl_tx_queue_init
  net/wireless/Kconfig: clarify the description for CONFIG_WIRELESS_EXT_SYSFS
  net: Unbreak userspace usage of linux/mroute.h
  pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock()
  ipv6: When we droped a packet, we should return NET_RX_DROP instead of 0
2008-09-02 21:02:14 -07:00
KOSAKI Motohiro
4b8561521d mm: show quicklist usage in /proc/meminfo
Quicklists can consume several GB of memory.  We should provide a means of
monitoring this.

After this patch is applied, /proc/meminfo will output the following:

% cat /proc/meminfo

MemTotal:      7715392 kB
MemFree:       5401600 kB
Buffers:         80384 kB
Cached:         300800 kB
SwapCached:          0 kB
Active:         235584 kB
Inactive:       262656 kB
SwapTotal:     2031488 kB
SwapFree:      2031488 kB
Dirty:            3520 kB
Writeback:           0 kB
AnonPages:      117696 kB
Mapped:          38528 kB
Slab:          1589952 kB
SReclaimable:    23104 kB
SUnreclaim:    1566848 kB
PageTables:      14656 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
WritebackTmp:        0 kB
CommitLimit:   5889152 kB
Committed_AS:   393152 kB
VmallocTotal: 17592177655808 kB
VmallocUsed:     29056 kB
VmallocChunk: 17592177626432 kB
Quicklists:     130944 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:    262144 kB

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-02 19:21:38 -07:00
Linus Torvalds
afa153fd7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide/Kconfig: mark ide-scsi as deprecated
  ide-disk: remove stale init_idedisk_capacity() documentation
  palm_bk3710: improve IDE registration
  ide: fix hwif_to_node()
  IDE: palm_bk3710: fix compile warning for unused variable
  IDE: compile fix for sff_dma_ops
2008-09-02 11:44:11 -07:00
Bartlomiej Zolnierkiewicz
96f80219b7 ide: fix hwif_to_node()
hwif_to_node() incorrectly assumes that hwif->dev always belongs to
a PCI device.  This results in ide-cs oopsing in init_irq() after
commit c56c5648a3 accidentally fixed
device tree registration for ide-cs.  Fix it by using dev_to_node().

Thanks to Martin Michlmayr and Larry Finger for help with debugging
the issue.

Reported-by: Martin Michlmayr <tbm@cyrius.com>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-09-02 20:18:47 +02:00
Kevin Hilman
71fc9fcc70 IDE: compile fix for sff_dma_ops
The sff_dma_ops struct should be wrapped by BLK_DEV_IDEDMA_SFF instead
of BLK_DEV_IDEDMA_PCI.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-09-02 20:18:46 +02:00
Linus Torvalds
e77295dc9e Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix buffer overrun decoding NFSv4 acl
  sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports
  nfsd: fix compound state allocation error handling
  svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet
2008-09-02 10:58:11 -07:00
David Woodhouse
9d7548d4ca Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-09-01 11:32:13 +01:00
Vegard Nossum
673d62cc5e debugobjects: fix lockdep warning
Daniel J. Blueman reported:
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.27-rc4-224c #1
> -------------------------------------------------------
> hald/4680 is trying to acquire lock:
>  (&n->list_lock){++..}, at: [<ffffffff802bfa26>] add_partial+0x26/0x80
>
> but task is already holding lock:
>  (&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>]
> debug_object_free+0x5c/0x120

We fix it by moving the actual freeing to outside the lock (the lock
now only protects the list).

The pool lock is also promoted to irq-safe (suggested by Dan). It's
necessary because free_pool is now called outside the irq disabled
region. So we need to protect against an interrupt handler which calls
debug_object_init().

[tglx@linutronix.de: added hlist_move_list helper to avoid looping
		     through the list twice]

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-09-01 09:47:16 +02:00
Matthew Garrett
942ed16194 power_supply: Add function to return system-wide power state
Certain drivers benefit from knowing whether the system is on ac or
battery, for instance when determining which backlight registers to
read. This adds a simple call to determine whether there's an online
power supply other than any batteries.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
2008-09-01 02:42:54 +04:00
David S. Miller
b171e19ed0 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/mlme.c
2008-08-29 23:06:00 -07:00
Linus Torvalds
bef69ea0dc Resource handling: add 'insert_resource_expand_to_fit()' function
Not used anywhere yet, but this complements the existing plain
'insert_resource()' functionality with a version that can expand the
resource we are adding in order to fix up any conflicts it has with
existing resources.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-29 20:25:20 -07:00
David S. Miller
7c19a3d280 net: Unbreak userspace usage of linux/mroute.h
Nothing in linux/pim.h should be exported to userspace.

This should fix the XORP build failure reported by
Jose Calhariz, the debain package maintainer.

Nothing originally in linux/mroute.h was exported to userspace
ever, but some of this stuff started to be when it was moved into
this new linux/pim.h, and that was wrong.  If we didn't provide these
definitions for 10 years we can reasonably expect that applications
defined this stuff locally or used GLIBC headers providing the
protocol definitions.  And as such the only result of this can
be conflict and userland build breakage.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-29 14:37:23 -07:00
Jouni Malinen
36aedc903e mac80211/cfg80211: HT capabilities for NEW_STA
Allow userspace (e.g., hostapd) to set HT capabilities for associated
STAs. This is based on a patch from Zhu Yi <yi.zhu@intel.com> (only
the NL80211_ATTR_HT_CAPABILITY for NEW_STA part is included here).

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:24:09 -04:00
Larry Finger
31ce12fb3e ssb: Clean up extraction of MAC addresses from SPROM
Only rev 1 and 2 ssb SPROMs have fields named et0mac and et1mac;
however, all of the extraction routines extract pseudo data for these
fields from regions that are all 1's resulting in a hardware address
of FF:FF:FF:FF:FF:FF. This patch forces such a fill at the beginning of
the data extraction process, and only does the formal extraction if the
SPROM rev is 1 or 2.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:24:07 -04:00
Larry Finger
095f695cbb ssb: Update for Rev. 5 SPROM
Although a revision 5 SPROM has not been seen in the wild, the
open-source portion of the MIPS driver 4.150.10.5 describes its
layout, which is mostly inherited from revision 4. This patch
implements the differences.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:24:05 -04:00
Sujith
18d7c65ba6 mac80211: Add an 802.11n definition
This patch adds a HT Capability (DSSS/CCK Mode in 40MHz BSS)
definition to ieee80211.h

Signed-off-by: Sujith Manoharan <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:24:04 -04:00
Jouni Malinen
9f1ba9062e mac80211/cfg80211: Add BSS configuration options for AP mode
This change adds a new cfg80211 command, NL80211_CMD_SET_BSS, to allow
AP mode BSS parameters to be changed from user space (e.g., hostapd).
The drivers using mac80211 are expected to be modified with separate
changes to use the new BSS info parameter for short slot time in the
bss_info_changed() handler.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:23:55 -04:00
Neil Horman
17f0f4a47d crypto: rng - RNG interface and implementation
This patch adds a random number generator interface as well as a
cryptographic pseudo-random number generator based on AES.  It is
meant to be used in cases where a deterministic CPRNG is required.

One of the first applications will be as an input in the IPsec IV
generation process.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-29 15:50:04 +10:00
Herbert Xu
73d3864a48 crypto: api - Use test infrastructure
This patch makes use of the new testing infrastructure by requiring
algorithms to pass a run-time test before they're made available to
users.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-29 15:49:57 +10:00
Herbert Xu
da7f033ddc crypto: cryptomgr - Add test infrastructure
This patch moves the newly created alg_test infrastructure into
cryptomgr.  This shall allow us to use it for testing at algorithm
registrations.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-29 15:49:55 +10:00
Linus Torvalds
9c2bdac40e Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  i2c: Prevent log spam on some DVB adapters
  i2c: Add missing kerneldoc descriptions
  i2c: Fix device_init_wakeup place
2008-08-28 12:28:50 -07:00
David Teigland
0f8e0d9a31 dlm: allow multiple lockspace creates
Add a count for lockspace create and release so that create can
be called multiple times to use the lockspace from different places.
Also add the new flag DLM_LSFL_NEWEXCL to create a lockspace with
the previous behavior of returning -EEXIST if the lockspace already
exists.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-08-28 11:49:15 -05:00
Jean Delvare
96e21e4fbc i2c: Add missing kerneldoc descriptions
Add missing kernel descriptions of struct i2c_driver members.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Brownell <david-b@pacbell.net>
2008-08-28 08:33:23 +02:00
Eric Paris
da31894ed7 securityfs: do not depend on CONFIG_SECURITY
Add a new Kconfig option SECURITYFS which will build securityfs support
but does not require CONFIG_SECURITY.  The only current user of
securityfs does not depend on CONFIG_SECURITY and there is no reason the
full LSM needs to be built to build this fs.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-28 10:47:42 +10:00
Linus Torvalds
5b51a7e9d8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] deal with the first call of ->show() generating no output
  [PATCH] fix ->llseek() for a bunch of directories
  [PATCH] fix regular readdir() and friends
  [PATCH] fix hpux_getdents()
  [PATCH] fix osf_getdirents()
  [PATCH] ntfs: use d_add_ci
  [PATCH] change d_add_ci argument ordering
  [PATCH] fix efs_lookup()
  [PATCH] proc: inode number fixlet
2008-08-27 14:31:44 -07:00
Jens Axboe
5168c47b4c block: remove blk_queue_tag_depth() and blk_queue_tag_queue()
They are unused and ->busy doesn't exist anymore.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27 09:50:20 +02:00
FUJITA Tomonori
4beab5c623 block: rename blk_scsi_cmd_filter to blk_cmd_filter
Technically, the cmd_filter would be applied to other protocols though
it's unlikely to happen. Putting SCSI stuff to request_queue is kinda
layer violation. So let's rename it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27 09:50:19 +02:00
FUJITA Tomonori
abf5439370 block: move cmdfilter from gendisk to request_queue
cmd_filter works only for the block layer SG_IO with SCSI block
devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
character devices (such as st). We hit a kernel crash with them.

The problem is that cmd_filter code accesses to gendisk (having struct
blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. It works for only
SCSI block device files. With character device files, inode->i_bdev
leads you to struct cdev. inode->i_bdev->bd_disk->blk_scsi_cmd_filter
isn't safe.

SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
independent on any protocols. We shouldn't change ULDs to expose their
gendisk.

This patch moves struct blk_scsi_cmd_filter from gendisk to
request_queue, a common object, which eveyone can access to.

The user interface doesn't change; users can change the filters via
/sys/block/. gendisk has a pointer to request_queue so the cmd_filter
code accesses to struct blk_scsi_cmd_filter.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27 09:50:19 +02:00
Simon Horman
7fd1067851 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 into lvs-next-2.6 2008-08-27 15:11:37 +10:00
David Woodhouse
5770a3fb5f Fix userspace export of <linux/net.h>
Including <linux/fcntl.h> in the user-visible part of this header has
caused build regressions with headers from 2.6.27-rc. Move it down to
the #ifdef __KERNEL__ part, which is the only place it's needed. Move
some other kernel-only things down there too, while we're at it.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-26 10:37:20 -07:00
Kevin Diggs
65eb3dc609 sched: add kernel doc for the completion, fix kernel-doc-nano-HOWTO.txt
This patch adds kernel doc for the completion feature.

An error in the split-man.pl PERL snippet in kernel-doc-nano-HOWTO.txt is
also fixed.

Signed-off-by: Kevin Diggs <kevdig@hypersurf.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-26 10:26:54 +02:00
Ingo Molnar
3cf430b063 Merge branch 'linus' into sched/devel 2008-08-26 10:25:59 +02:00
Linus Torvalds
087713f454 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: fix userspace ABI breakage
  KVM: MMU: Fix torn shadow pte
  KVM: Use .fixup instead of .text.fixup on __kvm_handle_fault_on_reboot
2008-08-25 11:19:53 -07:00
Adrian Bunk
1327138e29 KVM: fix userspace ABI breakage
The following part of commit 9ef621d3be
(KVM: Support mixed endian machines) changed on the size of a struct
that is exported to userspace:

include/linux/kvm.h:

@@ -318,14 +318,14 @@ struct kvm_trace_rec {
 	__u32 vcpu_id;
 	union {
 		struct {
-			__u32 cycle_lo, cycle_hi;
+			__u64 cycle_u64;
 			__u32 extra_u32[KVM_TRC_EXTRA_MAX];
 		} cycle;
 		struct {
 			__u32 extra_u32[KVM_TRC_EXTRA_MAX];
 		} nocycle;
 	} u;
-};
+} __attribute__((packed));

Packing a struct was the correct idea, but it packed the wrong struct.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-08-25 17:28:25 +03:00
Rusty Russell
bf20029677 stop_machine: Remove deprecated stop_machine_run
Everyone should be using stop_machine() now.  The staged API
transition helped life in linux-next.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-08-26 00:19:27 +10:00
Ingo Molnar
e4f807c2b4 Merge branch 'linus' into x86/xen
Conflicts:
	arch/x86/kernel/paravirt.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:54:07 +02:00
Christoph Hellwig
e45b590b97 [PATCH] change d_add_ci argument ordering
As pointed out during review d_add_ci argument order should match d_add,
so switch the dentry and inode arguments.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25 01:18:05 -04:00
Adrian Bunk
7a8fc9b248 removed unused #include <linux/version.h>'s
This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23 12:14:12 -07:00
Jasper Bryant-Greene
fef1643bf0 move ETH_P_PAE from ieee80211_i.h to if_ether.h
ETH_P_PAE belongs in if_ether.h with the other ETH_P_* definitions. This
patch moves it there.

Signed-off-by: Jasper Bryant-Greene <jasper@amiton.co.nz>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:57 -04:00
Henrique de Moraes Holschuh
96c87607ac rfkill: introduce RFKILL_STATE_MAX
While it is interesting to not add last-enum-markers because it allows gcc
to warn us of switch() statements missing a valid state, we really should
be handling memory corruption on a rfkill state with default clauses,
anyway.

So add RFKILL_STATE_MAX and use it where applicable.  It makes for safer
code in the long run.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:57 -04:00
Henrique de Moraes Holschuh
77fba13ccc rfkill: add __must_check annotations
rfkill is not a small, mere detail in wireless support.  Once it starts
supporting rfkill and users start counting on that support, a wireless
device is at risk of operating in dangerous conditions should rfkill
support fail to properly activate.

Therefore, add the required __must_check annotations on some key functions
of the rfkill API, for which the wireless drivers absolutely MUST handle
the failure mode safely in order to avoid a potentially dangerous situation
where the wireless transmitter is left enabled when the user don't want it
to.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:57 -04:00
Henrique de Moraes Holschuh
9961920199 rfkill: add default global states (v2)
Add a second set of global states, "rfkill_default_states", to track the
state that will be used when the first rfkill class of a given type is
registered, and also to save "undo" information when rfkill_epo is called.

Add a new exported function, rfkill_set_default(), which can be used by
platform drivers to restore radio state saved by the platform across
reboots or shutdown.

Also, fix rfkill_epo to properly update rfkill_states, but still preserve a
copy of the state so that we can undo the effect of rfkill_epo later if we
want to.  Add rfkill_restore_states() to restore rfkill_states from the
copy.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:56 -04:00
Linus Torvalds
3ffc3f947d Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  nohz: fix wrong event handler after online an offlined cpu
2008-08-22 08:36:20 -07:00
Linus Torvalds
a7b354e868 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] pata_it821x: fix warning
  libata: Fix a large collection of DMA mode mismatches
  ahci: sis controllers actually can do PMP
  pata_via: clean up recent tf_load changes
  libata: restore SControl on detach
  libata: use ata_link_printk() when printing SError
  libata: always do follow-up SRST if hardreset returned -EAGAIN
  libata: fix EH action overwriting in ata_eh_reset()
  sata_mv: add the Gen IIE flag to the SoC devices.
  ata_piix: IDE Mode SATA patch for Intel Ibex Peak DeviceIDs
  ahci: RAID mode SATA patch for Intel Ibex Peak DeviceIDs
  sata_mv: don't issue two DMA commands concurrently
  libata: implement no[hs]rst force params
2008-08-22 08:22:33 -07:00
Alan Cox
b15b3ebae1 libata: Fix a large collection of DMA mode mismatches
Dave Müller sent a diff for the pata_oldpiix that highlighted a problem
where a lot of the ATA drivers assume dma_mode == 0 means "no DMA" while
the core code uses 0xFF.

This turns out to have other consequences such as code doing >= XFER_UDMA_0
also catching 0xFF as UDMAlots. Fortunately it doesn't generally affect
set_dma_mode, although some drivers call back into their own set mode code
from other points.

Having been through the drivers I've added helpers for using_udma/using_mwdma
dma_enabled so that people don't open code ranges that may change (eg if UDMA8
appears somewhere)

Thanks to David for the initial bits
[and added fix for pata_oldpiix from and signed-off-by Dave Mueller
 <dave.mueller@gmx.ch>  -jg]

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:27:49 -04:00
Tejun Heo
d127ea7b86 libata: restore SControl on detach
Save SControl during probing and restore it on detach.  This prevents
adjustments made by libata drivers to seep into the next driver which
gets attached (be it a libata one or not).

It's not clear whether SControl also needs to be restored on suspend.
The next system to have control (ACPI or kexec'd kernel) would
probably like to see the original SControl value but there's no
guarantee that a link is gonna keep working after SControl is adjusted
without a reset and adding a reset and modified recovery cycle soley
for this is an overkill.  For now, do it only for detach.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:19:46 -04:00
Tejun Heo
05944bdf6f libata: implement no[hs]rst force params
Implement force params nohrst, nosrst and norst.  This is to work
around reset related problems and ease debugging.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:07:43 -04:00
Roman Zippel
916c7a8551 ntp: fix ADJ_OFFSET_SS_READ bug and do_adjtimex() cleanup
Thanks to the review by Michael Kerrisk a bug in the recent
ADJ_OFFSET_SS_READ option was discovered, where the ntp time_offset was
inadvertently set by it.  This fixes this by making the adjtime code
more separate from the ntp_adjtime code (both of which really want to
be separate syscalls).

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 06:40:18 +02:00
Linus Torvalds
61311e1bbc Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  pnp: fix "add acpi:* modalias entries"
  UIO: generic irq handling for some uio platform devices
  UIO: uio_pdrv: fix license specification
  UIO: uio_pdrv: fix memory leak
  block: drop references taken by class_find_device()
  block: fix partial read() of /proc/{partitions,diskstats}
  PM: Remove WARN_ON from device_pm_add
  driver core: add init_name to struct device
  PM: don't skip device PM init when CONFIG_PM_SLEEP isn't set and CONFIG_PM is set
  driver model: anti-oopsing medicine
  dev_printk(): constify the `dev' argument
  drivers/base/driver.c: remove unused to_dev() macro
  Documentation: HOWTO-ja_JP-sync patch
  Japanese translation of Documentation/SubmitChecklist
  kobject: Replace ALL occurrences of '/' with '!' instead of only the first one.
2008-08-21 13:48:37 -07:00
Alan Stern
55151d7dab USB: Defer Set-Interface for suspended devices
This patch (as1128) fixes one of the problems related to the new PM
infrastructure.  We are not allowed to register new child devices
during the middle of a system sleep transition, but unbinding a USB
driver causes the core to automatically install altsetting 0 and
thereby create new endpoint pseudo-devices.

The patch fixes this problem (and the related problem that installing
altsetting 0 will fail if the device is suspended) by deferring the
Set-Interface call until some later time when it is legal and can
succeed.  Possible later times are: when a new driver is being probed
for the interface, and when the interface is being resumed.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-21 10:26:36 -07:00
Greg Kroah-Hartman
c906a48adc driver core: add init_name to struct device
This gives us a way to handle both the bus_id and init_name values being
used for a while during the transition period.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-21 10:15:37 -07:00
Jean Delvare
bf9ca69fc8 dev_printk(): constify the `dev' argument
Add const markings to dev_name and dev_driver_string to make it clear that
dev_printk doesn't modify dev.  This is a prerequisite to adding more
const markings to other functions make it clearer, which functions can
modify dev and which can't.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-21 10:15:36 -07:00
Miao Xie
3c4fbe5e01 nohz: fix wrong event handler after online an offlined cpu
On the tickless system(CONFIG_NO_HZ=y and CONFIG_HIGH_RES_TIMERS=n), after
I made an offlined cpu online, I found this cpu's event handler was
tick_handle_periodic, not tick_nohz_handler.

After debuging, I found this bug was caused by the wrong tick mode.  the
tick mode is not changed to NOHZ_MODE_INACTIVE when the cpu is offline.

This patch fixes this bug.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 09:54:06 +02:00
John Stultz
2d42244ae7 clocksource: introduce CLOCK_MONOTONIC_RAW
In talking with Josip Loncaric, and his work on clock synchronization (see
btime.sf.net), he mentioned that for really close synchronization, it is
useful to have access to "hardware time", that is a notion of time that is
not in any way adjusted by the clock slewing done to keep close time sync.

Part of the issue is if we are using the kernel's ntp adjusted
representation of time in order to measure how we should correct time, we
can run into what Paul McKenney aptly described as "Painting a road using
the lines we're painting as the guide".

I had been thinking of a similar problem, and was trying to come up with a
way to give users access to a purely hardware based time representation
that avoided users having to know the underlying frequency and mask values
needed to deal with the wide variety of possible underlying hardware
counters.

My solution is to introduce CLOCK_MONOTONIC_RAW.  This exposes a
nanosecond based time value, that increments starting at bootup and has no
frequency adjustments made to it what so ever.

The time is accessed from userspace via the posix_clock_gettime() syscall,
passing CLOCK_MONOTONIC_RAW as the clock_id.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 09:50:24 +02:00
John Stultz
1aa5dfb751 clocksource: keep track of original clocksource frequency
The clocksource frequency is represented by
clocksource->mult/2^(clocksource->shift).  Currently, when NTP makes
adjustments to the clock frequency, they are made directly to the mult
value.

This has the drawback that once changed, we cannot know what the orignal
mult value was, or how much adjustment has been applied.

This property causes problems in calculating proper ntp intervals when
switching back and forth between clocksources.

This patch separates the current mult value into a mult and mult_orig
pair.  The mult_orig value stays constant, while the ntp clocksource
adjustments are done only to the mult value.

This allows for correct ntp interval calculation and additionally lays the
groundwork for a new notion of time, what I'm calling the monotonic-raw
time, which is introduced in a following patch.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 09:50:23 +02:00
Ian Campbell
d847471d06 fbdefio: add set_page_dirty handler to deferred IO FB
Fixes kernel BUG at lib/radix-tree.c:473.

Previously the handler was incidentally provided by tmpfs but this was
removed with:

  commit 14fcc23fdc
  Author: Hugh Dickins <hugh@veritas.com>
  Date:   Mon Jul 28 15:46:19 2008 -0700

    tmpfs: fix kernel BUG in shmem_delete_inode

relying on this behaviour was incorrect in any case and the BUG also
appeared when the device node was on an ext3 filesystem.

v2: override a_ops at open() time rather than mmap() time to minimise
races per AKPM's concerns.

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Jaya Kumar <jayakumar.lkml@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kel Modderman <kel@otaku42.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: <stable@kernel.org> [14fcc23fd is in 2.6.25.14 and 2.6.26.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
Nick Piggin
479db0bf40 mm: dirty page tracking race fix
There is a race with dirty page accounting where a page may not properly
be accounted for.

clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.

page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty.  It uses page_check_address to
find the pte.  That function has a shortcut to avoid the ptl if the pte is
not present.  Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.

For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value.  There may also be other code in
core mm or in arch which do similar things.

The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy.  XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.

Fix this by having an option to always take ptl to check the pte in
page_check_address.

It's possible to retain this optimization for page_referenced and
try_to_unmap.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Carsten Otte <cotte@freenet.de>
Cc: Hugh Dickins <hugh@veritas.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
Ken Chen
2d70b68d42 fix setpriority(PRIO_PGRP) thread iterator breakage
When user calls sys_setpriority(PRIO_PGRP ...) on a NPTL style multi-LWP
process, only the task leader of the process is affected, all other
sibling LWP threads didn't receive the setting.  The problem was that the
iterator used in sys_setpriority() only iteartes over one task for each
process, ignoring all other sibling thread.

Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk
each thread of a process.  Convert 4 call sites in {set/get}priority and
ioprio_{set/get}.

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:32 -07:00
Marek Vašut
d9105c2b01 [ARM] 5184/1: Split ucb1400_ts into core and touchscreen
This patch splits ucb1400_ts into ucb1400_ts and ucb1400_core.
Since this chip supports more features than only touchscreen,
it was necessary to prepare it for feature addition. The
previous functionality is preserved by applying this patch.

[Build fixes for non-ARM by Stephen Rothwell and Takashi Iwai]

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-08-20 23:22:22 +01:00
Alexey Korolev
17c1d2be28 [MTD] [NAND] Fix missing kernel-doc
[Reported by Randy Dunlap]

Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-20 22:35:40 +01:00
David Woodhouse
e4464facd6 Reserve NFS fileid values for btrfs
Purely cosmetic for now, but we might as well get it merged ASAP.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 13:19:51 -07:00
Ingo Molnar
170465ee7f Merge branch 'linus' into x86/xen 2008-08-20 12:39:18 +02:00
Mingming Cao
1f7c14c62c percpu counter: clean up percpu_counter_sum_and_set()
percpu_counter_sum_and_set() and percpu_counter_sum() is the same except
the former updates the global counter after accounting.  Since we are
taking the fbc->lock to calculate the precise value of the counter in
percpu_counter_sum() anyway, it should simply set fbc->count too, as the
percpu_counter_sum_and_set() does.

This patch merges these two interfaces into one.
 
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09 12:50:59 -04:00
Linus Torvalds
ddd13dc606 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: add acpi_find_root_bridge_handle
  PCI: acpi_pcihp: run _OSC on a root bridge
  x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
  x86/PCI: allow scanning of 255 PCI busses
  x86, pci: detect end_bus_number according to acpi/e820 reserved, v2
  pci: debug extra pci bus resources
  pci: debug extra pci resources range
2008-08-19 13:55:47 -07:00
Linus Torvalds
4309e09242 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
  pkt_sched: Prevent livelock in TX queue running.
  Revert "pkt_sched: Add BH protection for qdisc_stab_lock."
  Revert "pkt_sched: Protect gen estimators under est_lock."
  pkt_sched: remove bogus block (cleanup)
  nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization
  netfilter: ctnetlink: sleepable allocation with spin lock bh
  netfilter: ctnetlink: fix sleep in read-side lock section
  netfilter: ctnetlink: fix double helper assignation for NAT'ed conntracks
  netfilter: ipt_addrtype: Fix matching of inverted destination address type
  dccp: Fix panic caused by too early termination of retransmission mechanism
  pkt_sched: Don't hold qdisc lock over qdisc_destroy().
  pkt_sched: Add lockdep annotation for qdisc locks
  pkt_sched: Never schedule non-root qdiscs.
  removed unused #include <version.h>
  rt2x00: Fix txdone_entry_desc_flags
  b43: Fix for another Bluetooth Coexistence SPROM Programming error for BCM4306
  mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE
  p54u: reset skb's data/tail pointer on requeue
  p54: move p54_vdcf_init to the right place.
  iwlwifi: fix printk newlines
  ...
2008-08-19 09:59:02 -07:00
Simon Horman
3f087668c4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-08-19 17:36:22 +10:00
Linus Torvalds
b689e83961 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ata: add missing ATA_* defines
  ata: add missing ATA_CMD_* defines
  ata: add missing ATA_ID_* defines (take 2)
  sgiioc4: fixup message on resource allocation failure
  ide-cd: use bcd2bin/bin2bcd
  cdrom: handle TOC
  gdrom: add dummy audio_ioctl handler
  viocd: add dummy audio ioctl handler
  cleanup powerpc/include/asm/ide.h
  drivers/ide/pci/: use __devexit_p()
2008-08-18 17:40:13 -07:00
Jiri Slaby
056c58e8eb PCI: add acpi_find_root_bridge_handle
Consolidate finding of a root bridge and getting its handle to the one
inline function. It's cut & pasted on multiple places. Use this new
inline in those.

Cc: kristen.c.accardi@intel.com
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-18 13:48:04 -07:00
Bartlomiej Zolnierkiewicz
b59116205c ata: add missing ATA_* defines
Add missing ATA_* defines to <linux/ata.h>.  Also add
ATAPI_{LFS,EOM,ILI,IO,CODE} defines while at it.

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Bartlomiej Zolnierkiewicz
476d9894dd ata: add missing ATA_CMD_* defines
Add missing ATA_CMD_* defines to <linux/ata.h>.  Also add
ATA_EXABYTE_ENABLE_NEST, SETFEATURES_AAM_* and ATA_SMART_*
defines while at it.

Partially based on earlier work by Chris Wedgwood.

Acked-by: Chris Wedgwood <cw@f00f.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Bartlomiej Zolnierkiewicz
37014c6407 ata: add missing ATA_ID_* defines (take 2)
Add missing ATA_ID_* defines and update {ata,atapi}_*()
inlines accordingly.  The currently unused defines are
needed for the forthcoming drivers/ide/ changes.

v2:
Add ATA_ID_SPG.

Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-18 21:40:05 +02:00
Paul E. McKenney
ded00a56e9 rcu: remove redundant ACCESS_ONCE definition from rcupreempt.c
Remove the redundant definition of ACCESS_ONCE() from rcupreempt.c in
favor of the one in compiler.h.  Also merge the comment header from
rcupreempt.c's definition into that in compiler.h.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:45:22 +02:00
Alexander Beregalov
5e186b57e7 security.h: fix build failure
security.h: fix build failure

include/linux/security.h: In function 'security_ptrace_traceme':
include/linux/security.h:1760: error: 'parent' undeclared (first use in this function)

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-17 22:47:30 +10:00
David Woodhouse
5e6b83ed8c Fix header export of videodev2.h, ivtv.h, ivtvfb.h
The exported copy of videodev2.h contains this line:

	#define #include <sys/time.h>

This is because for some reason it defines __user for itself -- despite
the fact that we remove all instances of __user when exporting headers.
_All_ pointers in userspace are user pointers. Fix it by removing the
unnecessary '#define __user' from the file.

The new headers ivtv.h and ivtvfb.h would have the same problem... if
whoever put them there had actually remembered to add them to the Kbuild
file while he was at it. Fix those too, and export them as was
presumably intended.

Note that includes of <linux/compiler.h> are also stripped by the header
export process, so those don't need to be conditional.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-16 16:46:57 -07:00
Hugh Dickins
605d9288b3 mm: VM_flags comment fixes
Try to comment away a little of the confusion between mm's vm_area_struct
vm_flags and vmalloc's vm_struct flags: based on an idea by Ulrich Drepper.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-16 16:45:56 -07:00
Rusty Russell
db543c1f97 net: skb_copy_datagram_from_iovec()
There's an skb_copy_datagram_iovec() to copy out of a paged skb, but
nothing the other way around (because we don't do that).

We want to allocate big skbs in tun.c, so let's add the function.
It's a carbon copy of skb_copy_datagram_iovec() with enough changes to
be annoying.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-15 19:52:30 -07:00
Mark McLoughlin
e3b9955697 tun: TUNGETIFF interface to query name and flags
Add a TUNGETIFF interface so that userspace can query a
tun/tap descriptor for its name and flags.

This is needed because it is common for one app to create
a tap interface, exec another app and pass it the file
descriptor for the interface. Without TUNGETIFF the spawned
app has no way of detecting wheter the interface has e.g.
IFF_VNET_HDR set.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-15 19:52:19 -07:00
Linus Torvalds
71ef2a46fc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  security: Fix setting of PF_SUPERPRIV by __capable()
2008-08-15 15:32:13 -07:00
Alan Cox
8c9a9dd0fa tty: remove resize window special case
This moves it to being a tty operation. That removes special cases and now
also means that resize can be picked up by um and other non vt consoles
which may have a resize operation.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 10:34:07 -07:00
Seth Heasley
89499759dc x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
This patch adds the Intel Ibex Peak (PCH) LPC and SMBus Controller DeviceIDs.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-15 09:15:47 -07:00
Steven Rostedt
dd0078f4f0 rcu: just rename call_rcu_bh instead of making it a macro
Seems that I found a box that has a config that passes call_rcu_bh as a
function pointer (see net/sctp/sm_make_chunk.c), so declaring the
call_rcu_bh has a macro function isn't good enough.

This patch makes it just another name of call_rcu for rcupreempt.

Signed-off-by: Steven Rostedt <srostedt@redhat.org>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:54:39 +02:00
Dave Chinner
be4de35263 completions: uninline try_wait_for_completion and completion_done
m68k fails to build with these functions inlined in completion.h.  Move
them out of line into sched.c and export them to avoid this problem.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:44 -07:00
Huang Ying
9bdeb7b5d3 kexec jump: __ftrace_enabled_save/restore
Add __ftrace_enabled_save/restore, used to disable ftrace for a while.
Now, this is used by kexec jump, which need a version without lock, for
general situation, a locked version should be used.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Huang Ying
ca195b7f6d kexec jump: remove duplication of kexec_restart_prepare()
Call kernel_restart_prepare() in kernel_kexec() instead of duplicating the
code.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Huang Ying
163f6876f5 kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control
page is used for not only code on some platform.  For example in kexec
jump, it is used for data and stack too.

[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Marcin Slusarz
ce289e8972 suspend: fix section mismatch warning - register_nosave_region
WARNING: vmlinux.o(.text+0xe684): Section mismatch in reference from the function register_nosave_region() to the function .init.text:__register_nosave_region()
  The function register_nosave_region() references
  the function __init __register_nosave_region().
  This is often because register_nosave_region lacks a __init
  annotation or the annotation of __register_nosave_region is wrong.

register_nosave_region calls __init function and is called only from
__init functions

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Richard Kennedy
3fb669dd6e reorder struct prop_local_single to remove padding on 64 bit builds
reorder structure to remove 8 bytes of padding on 64 bit builds

(also removes 8 bytes from task_struct)

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Cc: peterz@infradead.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:15:23 +02:00
Richard Kennedy
bee367ed06 sched: reorder struct sched_rt_entity to remove padding on 64 bit builds
remove 8 bytes of padding on 64 bit builds
(also removes 8 bytes from task_struct)

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:15:23 +02:00
Richard Kennedy
07dd20e032 sched: reorder signal_struct to remove 8 bytes on 64 bit builds
reorder structure to remove 8 bytes of padding on 64 bit builds

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:15:22 +02:00
Paul E. McKenney
34d7c2b38d rcu: remove list_for_each_rcu()
All of the in-tree uses of list_for_each_rcu() have been converted to
list_for_each_entry_rcu(), so list_for_each_rcu() can now be removed.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:03:06 +02:00
Paul E. McKenney
ff9cf2ce7a rcu: fixes to include/linux/rcupreempt.h
Hello!

Compared tip/core/rcu to my latest patchset, and found the following
issues:

o	the memory barrier in rcu_exit_nohz() somehow got out of place
	(it is correct in mainline as of 2.6.26-rc7).

o	There is a duplicate declaration of rcu_dyntick_sched.

The attached patch fixes these.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:02:56 +02:00
Julius Volz
c1bc667e84 IPVS: Add genetlink interface definitions to ip_vs.h
Add IPVS Generic Netlink interface definitions to include/linux/ip_vs.h.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-08-15 09:26:14 +10:00
David Howells
5cd9c58fbe security: Fix setting of PF_SUPERPRIV by __capable()
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags.  This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.

This patch further splits security_ptrace() in two:

 (1) security_ptrace_may_access().  This passes judgement on whether one
     process may access another only (PTRACE_MODE_ATTACH for ptrace() and
     PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
     current is the parent.

 (2) security_ptrace_traceme().  This passes judgement on PTRACE_TRACEME only,
     and takes only a pointer to the parent process.  current is the child.

     In Smack and commoncap, this uses has_capability() to determine whether
     the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
     This does not set PF_SUPERPRIV.

Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

 (1) The OOM killer calls __capable() thrice when weighing the killability of a
     process.  All of these now use has_capability().

 (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
     whether the parent was allowed to trace any process.  As mentioned above,
     these have been split.  For PTRACE_ATTACH and /proc, capable() is now
     used, and for PTRACE_TRACEME, has_capability() is used.

 (3) cap_safe_nice() only ever saw current, so now uses capable().

 (4) smack_setprocattr() rejected accesses to tasks other than current just
     after calling __capable(), so the order of these two tests have been
     switched and capable() is used instead.

 (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
     receive SIGIO on files they're manipulating.

 (6) In smack_task_wait(), we let a process wait for a privileged process,
     whether or not the process doing the waiting is privileged.

I've tested this with the LTP SELinux and syscalls testscripts.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 22:59:43 +10:00
Ingo Molnar
51ca3c6791 Merge branch 'linus' into x86/core
Conflicts:
	arch/x86/kernel/genapic_64.c
	include/asm-x86/kvm_host.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 14:58:01 +02:00
Linus Torvalds
b635acec48 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (47 commits)
  usb: musb: pass configuration specifics via pdata
  usb: musb: fix hanging when rmmod gadget driver
  USB: Add MUSB and TUSB support
  USB: serial: remove CONFIG_USB_DEBUG from sierra and option drivers
  USB: Add vendor/product id of ZTE MF628 to option
  USB: quirk PLL power down mode
  USB: omap_udc: fix compilation with debug enabled
  usb: cdc-acm: drain writes on close
  usb: cdc-acm: stop dropping tx buffers
  usb: cdc-acm: bugfix release()
  usb gadget: issue notifications from ACM function
  usb gadget: remove needless struct members
  USB: sh: r8a66597-hcd: fix disconnect regression
  USB: isp1301: fix compilation
  USB: fix compiler warning fix
  usb-storage: unusual_devs entry for Nokia 5300
  USB: cdc-acm.c: Fix compile warnings
  USB: BandRich BandLuxe C150/C250 HSPA Data Card Driver
  USB: ftdi_sio: add support for PHI Fisco data cable (FT232BM based, VID/PID 0403:e40b)
  usb: isp1760: don't be noisy about short packets.
  ...
2008-08-13 20:50:10 -07:00
Linus Torvalds
9921b256bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  CRED: Introduce credential access wrappers
2008-08-13 20:49:37 -07:00
Linus Torvalds
7a49efae71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  netns: Fix crash by making igmp per namespace
  bnx2x: Version update
  bnx2x: Checkpatch compliance
  bnx2x: Spelling mistakes
  bnx2x: Minor code improvements
  bnx2x: Driver info
  bnx2x: 1G LED does not turn off
  bnx2x: 8073 PHY changes
  bnx2x: Change GPIO for any port
  bnx2x: Pause settings
  bnx2x: Link order with external PHY
  bnx2x: No LRO without Rx checksum
  bnx2x: Wrong structure size
  bnx2x: WoL capability
  bnx2x: Clearing MAC addresses filters
  bnx2x: Delay in while loops
  bnx2x: PBA Table Page Alignment Workaround
  bnx2x: Self-test false positive
  bnx2x: Memory allocation
  bnx2x: HW attention lock
  ...
2008-08-13 20:48:46 -07:00
Felipe Balbi
ca6d1b1333 usb: musb: pass configuration specifics via pdata
Use platform_data to pass musb configuration-specific
details to musb driver.

This patch will prevent that other platforms selecting
HAVE_CLK and enabling musb won't break tree building.

The other parts of it will come when linux-omap merge
up more omap2/3 board-files.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-13 17:33:01 -07:00
Felipe Balbi
550a7375fe USB: Add MUSB and TUSB support
This patch adds support for MUSB and TUSB controllers
integrated into omap2430 and davinci. It also adds support
for external tusb6010 controller.

Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-13 17:33:00 -07:00
Alan Stern
0282b7f2a8 usb-serial: don't release unregistered minors
This patch (as1121) fixes a bug in the USB serial core.  When a device
is unregistered, the core will give back its minors -- even if the
device hasn't been assigned any!

The patch reserves the highest minor value (255) to mean that no minor
was assigned.  It also removes some dead code and does a small style
fixup.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-13 17:32:50 -07:00
Alan Stern
f4f4d58734 USB: add missing kerneldoc line for "needs_binding"
This patch (as1117) adds a kerneldoc line for the "needs_binding"
field in struct usb_interface.  It was accidentally omitted when the
field was added.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-13 17:32:49 -07:00
David Howells
9e2b2dc413 CRED: Introduce credential access wrappers
The patches that are intended to introduce copy-on-write credentials for 2.6.28
require abstraction of access to some fields of the task structure,
particularly for the case of one task accessing another's credentials where RCU
will have to be observed.

Introduced here are trivial no-op versions of the desired accessors for current
and other tasks so that other subsystems can start to be converted over more
easily.

Wrappers are introduced into a new header (linux/cred.h) for UID/GID,
EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed
user_struct.  These wrappers are macros because the ordering between header
files mitigates against making them inline functions.

linux/cred.h is #included from linux/sched.h.

Further, XFS is modified such that it no longer defines and uses parameterised
versions of current_fs[ug]id(), thus getting rid of the namespace collision
otherwise incurred.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 09:35:23 +10:00
Linus Torvalds
9ea319b616 Merge git://oss.sgi.com:8090/xfs/linux-2.6
* git://oss.sgi.com:8090/xfs/linux-2.6: (45 commits)
  [XFS] Fix use after free in xfs_log_done().
  [XFS] Make xfs_bmap_*_count_leaves void.
  [XFS] Use KM_NOFS for debug trace buffers
  [XFS] use KM_MAYFAIL in xfs_mountfs
  [XFS] refactor xfs_mount_free
  [XFS] don't call xfs_freesb from xfs_unmountfs
  [XFS] xfs_unmountfs should return void
  [XFS] cleanup xfs_mountfs
  [XFS] move root inode IRELE into xfs_unmountfs
  [XFS] stop using file_update_time
  [XFS] optimize xfs_ichgtime
  [XFS] update timestamp in xfs_ialloc manually
  [XFS] remove the sema_t from XFS.
  [XFS] replace dquot flush semaphore with a completion
  [XFS] replace inode flush semaphore with a completion
  [XFS] extend completions to provide XFS object flush requirements
  [XFS] replace the XFS buf iodone semaphore with a completion
  [XFS] clean up stale references to semaphores
  [XFS] use get_unaligned_* helpers
  [XFS] Fix compile failure in xfs_buf_trace()
  ...
2008-08-13 15:17:49 -07:00
Tom Tucker
24b8b44780 svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet
RDMA_READ completions are kept on a separate queue from the general
I/O request queue. Since a separate lock is used to protect the RDMA_READ
completion queue, a race exists between the dto_tasklet and the
svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA
bit and adds I/O to the read-completion queue. Concurrently, the
recvfrom thread checks the generic queue, finds it empty and resets
the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue
the transport for I/O and cause the transport to "stall".

The fix is to protect both lists with the same lock and set the XPT_DATA
bit with this lock held.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-08-13 16:57:31 -04:00
David Chinner
39d2f1ab2a [XFS] extend completions to provide XFS object flush requirements
XFS object flushing doesn't quite match existing completion semantics.  It
mixed exclusive access with completion.  That is, we need to mark an object as
being flushed before flushing it to disk, and then block any other attempt to
flush it until the completion occurs.  We do this but adding an extra count to
the completion before we start using them.  However, we still need to
determine if there is a completion in progress, and allow no-blocking attempts
fo completions to decrement the count.

To do this we introduce:

int try_wait_for_completion(struct completion *x)
	returns a failure status if done == 0, otherwise decrements done
	to zero and returns a "started" status. This is provided
	to allow counted completions to begin safely while holding
	object locks in inverted order.

int completion_done(struct completion *x)
	returns 1 if there is no waiter, 0 if there is a waiter
	(i.e. a completion in progress).

This replaces the use of semaphores for providing this exclusion
and completion mechanism.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31816a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:40:43 +10:00
Bernhard Walle
31bad9246b firmware/memmap: cleanup
Various cleanup the drivers/firmware/memmap (after review by AKPM):

    - fix kdoc to conform to the standard
    - move kdoc from header to implementation files
    - remove superfluous WARN_ON() after kmalloc()
    - WARN_ON(x); if (!x) -> if(!WARN_ON(x))
    - improve some comments

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:31 -07:00
Harvey Harrison
bc2aa80e18 byteorder: add include/linux/byteorder.h to define endian helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:30 -07:00
Harvey Harrison
40c9f22210 byteorder: add a new include/linux/swab.h to define byteswapping functions
Collect the implementations from include/linux/byteorder/swab.h, swabb.h
in swab.h

The functionality provided covers:
u16 swab16(u16 val) - return a byteswapped 16 bit value
u32 swab32(u32 val) - return a byteswapped 32 bit value
u64 swab64(u64 val) - return a byteswapped 64 bit value
u32 swahw32(u32 val) - return a wordswapped 32 bit value
u32 swahb32(u32 val) - return a high/low byteswapped 32 bit value

Similar to above, but return swapped value from a naturally-aligned pointer
u16 swab16p(u16 *p)
u32 swab32p(u32 *p)
u64 swab64p(u64 *p)
u32 swahw32p(u32 *p)
u32 swahb32p(u32 *p)

Similar to above, but swap the value in-place (in-situ)
void swab16s(u16 *p)
void swab32s(u32 *p)
void swab64s(u64 *p)
void swahw32s(u32 *p)
void swahb32s(u32 *p)

Arches can override any of these with an optimized version by defining an
inline in their asm/byteorder.h (example given for swab16()):

u16 __arch_swab16() {}
 #define __arch_swab16 __arch_swab16

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:30 -07:00
Alexey Dobriyan
50ac2d694f seq_file: add seq_cpumask(), seq_nodemask()
Short enough reads from /proc/irq/*/smp_affinity return -EINVAL for no
good reason.

This became noticed with NR_CPUS=4096 patches, when length of printed
representation of cpumask becase 1152, but cat(1) continued to read with
1024-byte chunks.  bitmap_scnprintf() in good faith fills buffer, returns
1023, check returns -EINVAL.

Fix it by switching to seq_file, so handler will just fill buffer and
doesn't care about offsets, length, filling EOF and all this crap.

For that add seq_bitmap(), and wrappers around it -- seq_cpumask() and
seq_nodemask().

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:30 -07:00
Uwe Kleine-König
070cb06593 move kernel-doc comment for might_sleep directly before its defining block
Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:29 -07:00
Jean Delvare
1054635532 matrox maven: convert to a new-style i2c driver
The legacy i2c model is going away soon, so switch to the new model.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:29 -07:00
Jan Beulich
74768ed833 page allocator: use no-panic variant of alloc_bootmem() in alloc_large_system_hash()
..  since a failed allocation is being (initially) handled gracefully, and
panic()-ed upon failure explicitly in the function if retries with smaller
sizes failed.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:27 -07:00
Linus Torvalds
1c89ac5501 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  fix spinlock recursion in hvc_console
  stop_machine: remove unused variable
  modules: extend initcall_debug functionality to the module loader
  export virtio_rng.h
  lguest: use get_user_pages_fast() instead of get_user_pages()
  mm: Make generic weak get_user_pages_fast and EXPORT_GPL it
  lguest: don't set MAC address for guest unless specified
2008-08-12 08:40:19 -07:00
Linus Torvalds
88fa08f67b Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
  agp: fix SIS 5591/5592 wrong PCI id
  intel/agp: rewrite GTT on resume
  agp: use dev_printk when possible
  amd64-agp: run fallback when no bridges found, not when driver registration fails
  intel_agp: official name for GM45 chipset
2008-08-12 08:28:32 -07:00
David Woodhouse
742c52533b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-arm/arch-omap/onenand.h
2008-08-12 11:28:00 +01:00
Adrian Hunter
bb0eb217c9 [MTD] Define and use MTD_FAIL_ADDR_UNKNOWN instead of 0xffffffff
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-12 11:02:15 +01:00
Arjan van de Ven
59f9415ffb modules: extend initcall_debug functionality to the module loader
The kernel has this really nice facility where if you put "initcall_debug"
on the kernel commandline, it'll print which function it's going to
execute just before calling an initcall, and then after the call completes
it will

1) print if it had an error code

2) checks for a few simple bugs (like leaving irqs off)
and

3) print how long the init call took in milliseconds.

While trying to optimize the boot speed of my laptop, I have been loving
number 3 to figure out what to optimize...  ...  and then I wished that
the same thing was done for module loading.

This patch makes the module loader use this exact same functionality; it's
a logical extension in my view (since modules are just sort of late
binding initcalls anyway) and so far I've found it quite useful in finding
where things are too slow in my boot.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-08-12 17:52:54 +10:00
Christian Borntraeger
4bceba417a export virtio_rng.h
Hello Rusty,

The entropy device was added after we exported all virtio headers. This
patch adds virtio_rng.h to the exportable userspace headers.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-08-12 17:52:54 +10:00
Rusty Russell
912985dce4 mm: Make generic weak get_user_pages_fast and EXPORT_GPL it
Out of line get_user_pages_fast fallback implementation, make it a weak
symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST.

Export the symbol to modules so lguest can use it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-08-12 17:52:53 +10:00
Gerrit Renker
987c402ac3 skbuff: Code readability NiT
Inserting a space between the `-' improved the C readability (some languages
allow hyphens within functions and variable names, which is confusing).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-11 18:17:17 -07:00
Keith Packard
a8c84df9f7 intel/agp: rewrite GTT on resume
On my Intel chipset (965GM), the GTT is entirely erased across
suspend/resume.  This patch simply re-plays the current mapping at resume
time to restore the table.=20

I noticed this once I started relying on persistent GTT mappings across VT
switch in our GEM work -- the old X server and DRM code carefully unbind
all memory from the GTT on VT switch, but GEM does not bother.

I placed the list management and rewrite code in the generic layer on the
assumption that it will be needed on other hardware, but I did not add the
rewrite call to anything other than the Intel resume function.

Keep a list of current GATT mappings.  At resume time, rewrite them into
the GATT.  This is needed on Intel (at least) as the entire GATT is
cleared across suspend/resume.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-08-12 10:13:38 +10:00
Linus Torvalds
1ea2950884 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched, cpu hotplug: fix set_cpus_allowed() use in hotplug callbacks
  sched: fix mysql+oltp regression
  sched_clock: delay using sched_clock()
  sched clock: couple local and remote clocks
  sched clock: simplify __update_sched_clock()
  sched: eliminate scd->prev_raw
  sched clock: clean up sched_clock_cpu()
  sched clock: revert various sched_clock() changes
  sched: move sched_clock before first use
  sched: test runtime rather than period in global_rt_runtime()
  sched: fix SCHED_HRTICK dependency
  sched: fix warning in hrtick_start_fair()
2008-08-11 16:46:31 -07:00
Linus Torvalds
9b4d0bab32 Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix debug_lock_alloc
  lockdep: increase MAX_LOCKDEP_KEYS
  generic-ipi: fix stack and rcu interaction bug in smp_call_function_mask()
  lockdep: fix overflow in the hlock shrinkage code
  lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]()
  lockdep: handle chains involving classes defined in modules
  mm: fix mm_take_all_locks() locking order
  lockdep: annotate mm_take_all_locks()
  lockdep: spin_lock_nest_lock()
  lockdep: lock protection locks
  lockdep: map_acquire
  lockdep: shrink held_lock structure
  lockdep: re-annotate scheduler runqueues
  lockdep: lock_set_subclass - reset a held lock's subclass
  lockdep: change scheduler annotation
  debug_locks: set oops_in_progress if we will log messages.
  lockdep: fix combinatorial explosion in lock subgraph traversal
2008-08-11 16:45:46 -07:00
Ingo Molnar
23a0ee908c Merge branch 'core/locking' into core/urgent 2008-08-12 00:11:49 +02:00
Linus Torvalds
10fec20ef5 Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd
* 'for-linus' of git://git.o-hand.com/linux-mfd:
  mfd: tc6393 cleanup and update
  mfd: have TMIO drivers and subdevices depend on ARM
  mfd: TMIO MMC driver
  mfd: driver for the TMIO NAND controller
  mfd: t7l66 MMC platform data
  mfd: tc6387 MMC platform data
  mfd: Fix 7l66 and 6387 according to the new mfd-core API
  mfd: Fix tc6393 according to the new tmio.h
  mfd: driver for the TC6387XB TMIO controller.
  mfd: driver for the T7L66XB TMIO SoC
  mfd: TMIO MMC structures and accessors.
2008-08-11 10:44:43 -07:00
Linus Torvalds
e2205a156f Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Remove include/linux/harrier_defs.h
  powerpc: Do not ignore arch/powerpc/include
  powerpc: Delete completed "ppc removal" task from feature removal file
  powerpc/mm: Fix attribute confusion with htab_bolt_mapping()
  powerpc/pci: Don't keep ISA memory hole resources in the tree
  powerpc: Zero fill the return values of rtas argument buffer
  powerpc/4xx: Update defconfig files for 2.6.27-rc1
  powerpc/44x: Incorrect NOR offset in Warp DTS
  powerpc/44x: Warp DTS changes for board updates
  powerpc/4xx: Cleanup Warp for i2c driver changes.
  powerpc/44x: Adjust warp-nand resource end address
2008-08-11 10:40:28 -07:00
Linus Torvalds
a7ef6a40f7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: Limit VPD length for Broadcom 5708S
  PCI PM: Export pci_pme_active to drivers
  PCI: remove duplicate symbol from pci_ids.h
  PCI: check the return value of device_create_bin_file() in pci_create_bus()
  PCI: fully restore MSI state at resume time
  DMA: make dma-coherent.c documentation kdoc-friendly
  PCI: make pci_register_driver() a macro
  PCI: add Broadcom 5708S to VPD length quirk
2008-08-11 10:38:36 -07:00
Ingo Molnar
e5f363e358 lockdep: increase MAX_LOCKDEP_KEYS
certain configs produce:

 [   70.076229] BUG: MAX_LOCKDEP_KEYS too low!
 [   70.080230] turning off the locking correctness validator.

tune them up.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 15:25:07 +02:00
Ingo Molnar
ced9cd40ac printk: robustify printk, fix
fix:

 include/linux/kernel.h: In function ‘printk_needs_cpu':
 include/linux/kernel.h:217: error: parameter name omitted

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 15:04:19 +02:00
Peter Zijlstra
b845b517b5 printk: robustify printk
Avoid deadlocks against rq->lock and xtime_lock by deferring the klogd
wakeup by polling from the timer tick.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 13:46:53 +02:00
Paul E. McKenney
67182ae1c4 rcu, debug: detect stalled grace periods
this is a diagnostic patch for Classic RCU.

The approach is to record a timestamp at the beginning
of the grace period (in rcu_start_batch()), then have
rcu_check_callbacks() complain if:

 1.	it is running on a CPU that has holding up grace periods for
 	a long time (say one second).  This will identify the culprit
 	assuming that the culprit has not disabled hardware irqs,
 	instruction execution, or some such.

 2.	it is running on a CPU that is not holding up grace periods,
 	but grace periods have been held up for an even longer time
 	(say two seconds).

It is enabled via the default-off CONFIG_DEBUG_RCU_STALL kernel parameter.

Rather than exponential backoff, it backs off to once per 30 seconds.
My feeling upon thinking on it was that if you have stalled RCU grace
periods for that long, a few extra printk() messages are probably the
least of your worries...

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: David Witbrodt <dawitbro@sbcglobal.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 13:35:18 +02:00
Ingo Molnar
c4c0c56a7a Merge branch 'linus' into core/rcu 2008-08-11 13:27:47 +02:00
Paul Mackerras
13fa00a878 powerpc: Remove include/linux/harrier_defs.h
It was only used by code in arch/ppc, and arch/ppc is gone, so remove
the unused harrier_defs.h as well.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-08-11 21:00:12 +10:00
Peter Zijlstra
b42e737e57 lockdep: fix overflow in the hlock shrinkage code
There is a overflow by 1 case in the new shrunken hlock code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 12:34:42 +02:00
Ingo Molnar
3295f0ef9f lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]()
the names were too generic:

 drivers/uio/uio.c:87: error: expected identifier or '(' before 'do'
 drivers/uio/uio.c:87: error: expected identifier or '(' before 'while'
 drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 10:30:30 +02:00
Peter Zijlstra
b7d39aff91 lockdep: spin_lock_nest_lock()
Expose the new lock protection lock.

This can be used to annotate places where we take multiple locks of the
same class and avoid deadlocks by always taking another (top-level) lock
first.

NOTE: we're still bound to the MAX_LOCK_DEPTH (48) limit.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 09:30:24 +02:00
Peter Zijlstra
7531e2f34d lockdep: lock protection locks
On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote:

> On Fri, 1 Aug 2008, David Miller wrote:
> >
> > Taking more than a few locks of the same class at once is bad
> > news and it's better to find an alternative method.
>
> It's not always wrong.
>
> If you can guarantee that anybody that takes more than one lock of a
> particular class will always take a single top-level lock _first_, then
> that's all good. You can obviously screw up and take the same lock _twice_
> (which will deadlock), but at least you cannot get into ABBA situations.
>
> So maybe the right thing to do is to just teach lockdep about "lock
> protection locks". That would have solved the multi-queue issues for
> networking too - all the actual network drivers would still have taken
> just their single queue lock, but the one case that needs to take all of
> them would have taken a separate top-level lock first.
>
> Never mind that the multi-queue locks were always taken in the same order:
> it's never wrong to just have some top-level serialization, and anybody
> who needs to take <n> locks might as well do <n+1>, because they sure as
> hell aren't going to be on _any_ fastpaths.
>
> So the simplest solution really sounds like just teaching lockdep about
> that one special case. It's not "nesting" exactly, although it's obviously
> related to it.

Do as Linus suggested. The lock protection lock is called nest_lock.

Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so anything
that spills that it still up shit creek.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 09:30:24 +02:00
Peter Zijlstra
4f3e7524b2 lockdep: map_acquire
Most the free-standing lock_acquire() usages look remarkably similar, sweep
them into a new helper.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 09:30:23 +02:00
Dave Jones
f82b217e35 lockdep: shrink held_lock structure
struct held_lock {
        u64                        prev_chain_key;       /*     0     8 */
        struct lock_class *        class;                /*     8     8 */
        long unsigned int          acquire_ip;           /*    16     8 */
        struct lockdep_map *       instance;             /*    24     8 */
        int                        irq_context;          /*    32     4 */
        int                        trylock;              /*    36     4 */
        int                        read;                 /*    40     4 */
        int                        check;                /*    44     4 */
        int                        hardirqs_off;         /*    48     4 */

        /* size: 56, cachelines: 1 */
        /* padding: 4 */
        /* last cacheline: 56 bytes */
};

struct held_lock {
        u64                        prev_chain_key;       /*     0     8 */
        long unsigned int          acquire_ip;           /*     8     8 */
        struct lockdep_map *       instance;             /*    16     8 */
        unsigned int               class_idx:11;         /*    24:21  4 */
        unsigned int               irq_context:2;        /*    24:19  4 */
        unsigned int               trylock:1;            /*    24:18  4 */
        unsigned int               read:2;               /*    24:16  4 */
        unsigned int               check:2;              /*    24:14  4 */
        unsigned int               hardirqs_off:1;       /*    24:13  4 */

        /* size: 32, cachelines: 1 */
        /* padding: 4 */
        /* bit_padding: 13 bits */
        /* last cacheline: 32 bytes */
};

[mingo@elte.hu: shrunk hlock->class too]
[peterz@infradead.org: fixup bit sizes]
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2008-08-11 09:30:23 +02:00
Peter Zijlstra
64aa348edc lockdep: lock_set_subclass - reset a held lock's subclass
this can be used to reset a held lock's subclass, for arbitrary-depth
iterated data structures such as trees or lists which have per-node
locks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 09:30:21 +02:00
Ingo Molnar
cf206bffbb Merge branch 'linus' into sched/clock 2008-08-11 08:59:21 +02:00
Peter Zijlstra
c1955a3d47 sched_clock: delay using sched_clock()
Some arch's can't handle sched_clock() being called too early - delay
this until sched_clock_init() has been called.

Reported-by: Bill Gatliff <bgat@billgatliff.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
CC: Russell King - ARM Linux <linux@arm.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 08:59:03 +02:00
Ian Molton
25d6cbd840 mfd: tc6393 cleanup and update
This patchset cleans up the TC6393XB support.

* Add provision for the MMC subdevice
* Disable / enable clocks on suspend / resume
* Remove fragments of badly merged code (eg. linux/fb include etc.)
* Use a device specific clock name to break dependancy on ARM/PXA2XX
* Drop unnecessary resource names
* Switch to tmio_io* accessors

Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-08-10 23:32:07 +02:00
Ian Molton
cbdfb42639 mfd: driver for the TC6387XB TMIO controller.
This patch adds support for the TC6387XB. Unlike other TMIO devices this one
has only one subdevice and no interrupt mux, however using the MFD framework
allows it to share the TMIO MMC driver.

Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-08-10 20:09:55 +02:00
Ian Molton
1f192015ca mfd: driver for the T7L66XB TMIO SoC
This patchset provides support for the core functinality of the T7L66XB
SoC from Toshiba. Supported in this patchset is the IRQ MUX, MMC controller
and NAND flash controller.

Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-08-10 20:09:50 +02:00
Ian Molton
d3a2f71853 mfd: TMIO MMC structures and accessors.
Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-08-10 20:09:43 +02:00
Linus Torvalds
4fbb71597a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  SLUB: dynamic per-cache MIN_PARTIAL
  mm: unexport ksize
2008-08-09 16:21:33 -07:00
Randy Dunlap
6724cce8fb list.h: fix fatal kernel-doc error
Fix fatal multi-line kernel-doc error in list.h:
function short description must be on one line.

Error(linux-2.6.27-rc2-git3//include/linux/list.h:318): duplicate section name 'Description'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-08 16:17:16 -07:00
Linus Torvalds
49b75b87ce Merge branch 'for-linus-merged' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus-merged' of master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5177/1: arm/mach-sa1100/Makefile: remove CONFIG_SA1100_USB
  [ARM] 5166/1: magician: add MAINTAINERS entry
  [ARM] fix pnx4008 build errors
  [ARM] Fix SMP booting with non-zero PHYS_OFFSET
  [ARM] 5185/1: Fix spi num_chipselect for lubbock
  [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach
  [ARM] Add support for arch/arm/mach-*/include and arch/arm/plat-*/include
  [ARM] Remove asm/hardware.h, use asm/arch/hardware.h instead
  [ARM] Eliminate useless includes of asm/mach-types.h
  [ARM] Fix circular include dependency with IRQ headers
  avr32: Use <mach/foo.h> instead of <asm/arch/foo.h>
  avr32: Introduce arch/avr32/mach-*/include/mach
  avr32: Move include/asm-avr32 to arch/avr32/include/asm
  [ARM] sa1100_wdt: use reset_status to remember watchdog reset status
  [ARM] pxa: introduce reset_status and clear_reset_status for driver's usage
  [ARM] pxa: introduce reset.h for reset specific header information
2008-08-08 11:38:42 -07:00
Russell King
097d9eb537 Merge Linus' latest into master
Conflicts:

	drivers/watchdog/at91rm9200_wdt.c
	drivers/watchdog/davinci_wdt.c
	drivers/watchdog/ep93xx_wdt.c
	drivers/watchdog/ixp2000_wdt.c
	drivers/watchdog/ixp4xx_wdt.c
	drivers/watchdog/ks8695_wdt.c
	drivers/watchdog/omap_wdt.c
	drivers/watchdog/pnx4008_wdt.c
	drivers/watchdog/sa1100_wdt.c
	drivers/watchdog/wdt285.c
2008-08-08 19:18:18 +01:00
Linus Torvalds
f2d7499be1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (99 commits)
  pkt_sched: Fix actions referencing
  bnx2x: fix logical op
  tcp: (whitespace only) fix confusing indentation
  pkt_sched: Fix qdisc config when link is down.
  [Bluetooth] Add full quirk implementation for btusb driver
  [Bluetooth] Removal of unnecessary ignore module parameter
  [Bluetooth] Add parameters to control BNEP header compression
  ath9k: Revamp wireless mode usage
  ath9k: More unused macros
  ath9k: Remove a few unused macros and fix indentation
  ath9k: Use mac80211's band macros and remove enum hal_freq_band
  ath9k: Remove redundant data structure ath9k_txq_info
  ath9k: Cleanup data structures related to HW capabilities
  ath9k: work around gcc ICEs
  ath9k: Add new Atheros IEEE 802.11n driver
  ath5k: remove Atheros 11n devices from supported list
  list.h: add list_cut_position()
  list.h: Add list_splice_tail() and list_splice_tail_init()
  p54: swap short slot time dcf values
  rt2x00: Block all unsupported modes
  ...
2008-08-08 11:15:23 -07:00
Russell King
2727f226a6 [ARM] fix pnx4008 build errors
include/linux/i2c-pnx.h was missed when moving the include files.
Fix it now; it doesn't really need to include mach/i2c.h at all.
Successfully build tested with pnx4008_defconfig, which had
failed in linux-next.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-08-08 15:13:27 +01:00
Linus Torvalds
aeee90dfa0 Merge branch 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace
* 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  tracehook: fix CLONE_PTRACE
2008-08-07 18:14:24 -07:00
Linus Torvalds
273b257839 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mad: Test ib_create_send_mad() return with IS_ERR(), not == NULL
  IB/mlx4: Allow 4K messages for UD QPs
  mlx4_core: Add ethernet fields to CQE struct
  IB/ipath: Fix printk format warnings
  RDMA/cxgb3: Fix deadlock initializing iw_cxgb3 device
  RDMA/cxgb3: Fix up MW access rights
  RDMA/cxgb3: Fix QP capabilities
  RDMA/cma: Remove padding arrays by using struct sockaddr_storage
  IB/ipath: Use unsigned long for irq flags
  IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr()
2008-08-07 18:14:07 -07:00
Roland McGrath
5861bbfcc1 tracehook: fix CLONE_PTRACE
In the change in commit 09a05394fe, I
overlooked two nits in the logic and this broke using CLONE_PTRACE
when PTRACE_O_TRACE* are not being used.

A parent that is itself traced at all but not using PTRACE_O_TRACE*,
using CLONE_PTRACE would have its new child fail to be traced.

A parent that is not itself traced at all that uses CLONE_PTRACE
(which should be a no-op in this case) would confuse the bookkeeping
and lead to a crash at exit time.

This restores the missing checks and fixes both failure modes.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
2008-08-07 17:18:47 -07:00
Rafael J. Wysocki
5a6c9b60b4 PCI PM: Export pci_pme_active to drivers
Export pci_pme_active() to drivers, so that they can clear the
PME_status bit and disable PME# for their devices without involving
ACPI.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-07 15:33:36 -07:00
akpm@linux-foundation.org
7bed523a95 PCI: remove duplicate symbol from pci_ids.h
pci.ids.h: remove a duplicated symbol

Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Grant Coady <gcoady.lk@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-07 09:49:57 -07:00
Andrew Morton
bba8116586 PCI: make pci_register_driver() a macro
alpha:

CC [M]  drivers/usb/gadget/u_ether.o
In file included from include/asm/dma-mapping.h:7,
                 from include/linux/dma-mapping.h:52,
                 from include/linux/dmaengine.h:29,
                 from include/linux/skbuff.h:29,
                 from include/linux/if_ether.h:114,
                 from include/linux/etherdevice.h:27,
                 from drivers/usb/gadget/u_ether.c:29:
include/linux/pci.h: In function 'pci_register_driver':
include/linux/pci.h:673: error: 'KBUILD_MODNAME' undeclared (first use in this function)
include/linux/pci.h:673: error: (Each undeclared identifier is reported only once
include/linux/pci.h:673: error: for each function it appears in.)

Sam says:

The problem is that u_ether.o is used by two modules so when we build it
KBUILD_MODNAME is not defined because kbuild does not know what value to
use.

And in pci.h we have the following inline:

static inline int __must_check pci_register_driver(struct pci_driver *driver)
{
        return __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME);
}

And alpha uses dma-mapping.h to nullify a number of functions that seem to
require something from pci.h.

Making it a macro fixes this particular problem.  However, the underlying issue
of a file using KBUILD_MODNAME and being shared between multiple modules is
*not* addressed.  I guess the answer there is "don't do that".

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-07 06:52:01 -07:00
Luis R. Rodriguez
00e8a4da8c list.h: add list_cut_position()
This adds list_cut_position() which lets you cut a list into
two lists given a pivot in the list.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-07 09:49:42 -04:00
Luis R. Rodriguez
7d283aee50 list.h: Add list_splice_tail() and list_splice_tail_init()
If you are using linked lists for queues list_splice() will not do what
you would expect even if you use the elements passed reversed. We need
to handle these differently. We add list_splice_tail() and
list_splice_tail_init().

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-07 09:49:42 -04:00
David Woodhouse
c314dfdc35 [MTD] [NOR] Rename and export new cfi_qry_*() functions
They need to be exported, so let's give them less generic-sounding names
while we're at it.

Original export patch, along with the suggestion about the nomenclature,
from Stephen Rothwell.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-07 11:55:07 +01:00
Laurent Pinchart
fe41424855 dm9000: Support MAC address setting through platform data.
The dm9000 driver reads the chip's MAC address from the attached EEPROM. When
no EEPROM is present, or when the MAC address is invalid, it falls back to
reading the address from the chip.

This patch lets platform code set the desired MAC address through platform
data.

Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-07 02:22:54 -04:00
Brandon Philips
b11f8d8cc3 ethtool: Expand ethtool_cmd.speed to 32 bits
Introduce the speed_hi field to ethtool_cmd, using the reserved space,
to expand the speed field to 2^32 Megabits/second.

Making this field expansion now gives us plenty of time to fix up the
user-space pieces that use SIOCETHTOOL before hardware faster than 64
Gb/s is available.

Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-07 02:22:08 -04:00
Yevgeny Petrilin
f780a9f119 mlx4_core: Add ethernet fields to CQE struct
Add ethernet-related fields to struct mlx4_cqe so that the mlx4_en
ethernet NIC driver can share the same definition.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-08-06 20:14:06 -07:00
Anders Grafström
e93cafe45f [MTD] [NOR] cfi_cmdset_0001: Timeouts for erase, write and unlock operations
Timeouts are currently given by the typical operation time times 8.
It works in the general well-behaved case but not when an erase block is
failing. For erase operations, it seems that a failing erase block will
keep the device state machine in erasing state until the vendor
specified maximum timeout period has passed. By this time the driver
would have long since timed out, left erasing state and attempted
further operations which all fail. This patch implements timeouts using
values from the CFI Query structure when available.
The patch also sets a longer timeout for locking operations. The current
value used for locking/unlocking given by 1000000/HZ microseconds is too
short for devices like J3 and J5 Strataflash which have a typical clear
lock-bits time of 0.5 seconds.

Signed-off-by: Anders Grafström <grfstrm@users.sourceforge.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-06 09:44:54 +01:00
Alexey Korolev
2e489e077a [MTD] [NOR] Add qry_mode_on()/qry_omde_off() to deal with odd chips
There are some CFI chips which require non standard procedures to get 
into QRY mode. The possible way to support them would be trying 
different modes till QRY will be read. This patch introduce two new 
functions qry_mode_on qry_mode_off. qry_mode_on tries different commands 
in order switch chip into QRY mode.

So if we have one more "odd" chip - we just could add several lines to 
qry_mode_on. Also using these functions remove unnecessary code 
duplicaton in porbe procedure.

Currently there are two "odd" cases
1. Some old intel chips which require 0xFF before 0x98
2. ST M29DW chip which requires 0x98 to be sent at 0x555 (according to
CFI should be 0x55)

This patch is partialy based on the patch from Uwe
(see "[PATCH 2/4] [RFC][MTD] cfi_probe: remove Intel chip workaround"
thread )

Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: Alexander Belyakov <abelyako@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-06 09:43:58 +01:00
Linus Torvalds
e63e03273b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (78 commits)
  AX.25: Fix sysctl registration if !CONFIG_AX25_DAMA_SLAVE
  pktgen: mac count
  pktgen: random flow 
  bridge: Eliminate unnecessary forward delay
  bridge: fix compile warning in net/bridge/br_netfilter.c
  ipv4: remove unused field in struct flowi (include/net/flow.h).
  tg3: Fix 'scheduling while atomic' errors
  net: Kill plain NET_XMIT_BYPASS.
  net_sched: Add qdisc __NET_XMIT_BYPASS flag
  net_sched: Add qdisc __NET_XMIT_STOLEN flag
  iwl3945: fix merge mistake for packet injection
  iwlwifi: grap nic access before accessing periphery registers
  iwlwifi: decrement rx skb counter in scan abort handler
  iwlwifi: fix unhandled interrupt when HW rfkill is on
  iwlwifi: implement iwl5000_calc_rssi
  iwlwifi: memory allocation optimization
  iwlwifi: HW bug fixes
  p54: Fix potential concurrent access to private data
  rt2x00: Disable link tuning in rt2500usb
  iwlwifi: Don't use buffer allocated on the stack for led names
  ...
2008-08-05 19:37:42 -07:00
Richard Hughes
bf1db69fbf pm_qos: spelling fixes
A documentation cleanup patch.  With a minor tweak to clarify units for
kbs.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: mark gross <mgross@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-05 14:33:50 -07:00
Mark Asselstine
f6ac436dcc Remove the deprecated cli() sti() functions
These functions have been deprecated for some time now but remained until
all legacy callers could be removed.  With a few commits in 2.6.26 this
has happened so now we can remove these deprecated functions.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-05 14:33:48 -07:00
Shadi Ammouri
60cadec9da spi: new orion_spi driver
This adds an SPI driver for the SPI controller found in various Marvell
Orion ARM SoCs.  It currently supports only one slave, which must use SPI
mode 0.

[dbrownell@users.sourceforge.net: cleanups, meet specs, pass "sparse"]
Signed-off-by: Shadi Ammouri <shadi@marvell.com>
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-05 14:33:46 -07:00
Bernhard Walle
c6e2bee26e kdump: report actual value of VMCOREINFO_OSRELEASE in VMCOREINFO
The current implementation reports the structure name as
VMCOREINFO_OSRELEASE in VMCOREINFO, e.g.

        VMCOREINFO_OSRELEASE=init_uts_ns.name.release

That doesn't make sense because it's always the same. Instead, use the
value, e.g.

        VMCOREINFO_OSRELEASE=2.6.26-rc3

That's also what the 'makedumpfile -g' does.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Ken'ichi Ohmichi" <oomichi@mxs.nes.nec.co.jp>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-05 14:33:46 -07:00
Adrian Bunk
c5bfc3757f ide: remove CONFIG_IDE_MAX_HWIFS
The benefits of a user settable CONFIG_IDE_MAX_HWIFS have become pretty 
tiny and are no longer considered worth the trouble of an own option.

Simply always #define MAX_HWIFS to 10.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-05 18:17:01 +02:00
Bartlomiej Zolnierkiewicz
39b986a6c7 ide: sanitize struct ide_port_ops documentation (take 2)
v2:
Add missing '@'-s.  (Noticed by Randy Dunlap)

Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-08-05 18:16:57 +02:00
David S. Miller
33e334950a Merge branch 'no-ath9k' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-08-05 01:28:35 -07:00
Pekka Enberg
5595cffc82 SLUB: dynamic per-cache MIN_PARTIAL
This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial
value that is calculated from object size. The bigger the object size, the more
pages we keep on the partial list.

I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example
script of the fio benchmarking tool. The script stresses the networking
subsystem which should also give a fairly good beating of kmalloc() et al.

To run the test yourself, first clone the fio repository:

  git clone git://git.kernel.dk/fio.git

and then run the following command n times on your machine:

  time ./fio examples/netio

The results on my 2-way 64-bit x86 machine are as follows:

  [ the minimum, maximum, and average are captured from 50 individual runs ]

                 real time (seconds)
                 min      max      avg      sd
  SLAB           22.76    23.38    22.98    0.17
  SLUB           22.80    25.78    23.46    0.72
  SLUB (dynamic) 22.74    23.54    23.00    0.20

                 sys time (seconds)
                 min      max      avg      sd
  SLAB           6.90     8.28     7.70     0.28
  SLUB           7.42     16.95    8.89     2.28
  SLUB (dynamic) 7.17     8.64     7.73     0.29

                 user time (seconds)
                 min      max      avg      sd
  SLAB           36.89    38.11    37.50    0.29
  SLUB           30.85    37.99    37.06    1.67
  SLUB (dynamic) 36.75    38.07    37.59    0.32

As you can see from the above numbers, this patch brings SLUB to the same level
as SLAB for this particular workload fixing a ~2% regression. I'd expect this
change to help similar workloads that allocate a lot of objects that are close
to the size of a page.

Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-08-05 09:28:47 +03:00
David S. Miller
cc6533e98a net: Kill plain NET_XMIT_BYPASS.
dst_input() was doing something completely absurd, looping
on skb->dst->input() if NET_XMIT_BYPASS was seen, but these
functions never return such an error.

And as a result plain ole' NET_XMIT_BYPASS has no more
references and can be completely killed off.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04 23:04:08 -07:00
Jarek Poplawski
378a2f090f net_sched: Add qdisc __NET_XMIT_STOLEN flag
Patrick McHardy <kaber@trash.net> noticed:
"The other problem that affects all qdiscs supporting actions is
TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS
even though the packet is not queued, corrupting upper qdiscs'
qlen counters."

and later explained:
"The reason why it translates it at all seems to be to not increase
the drops counter. Within a single qdisc this could be avoided by
other means easily, upper qdiscs would still increase the counter
when we return anything besides NET_XMIT_SUCCESS though.

This means we need a new NET_XMIT return value to indicate this to
the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN,
return that to upper qdiscs and translate it to NET_XMIT_SUCCESS
in dev_queue_xmit, similar to NET_XMIT_BYPASS."

David Miller <davem@davemloft.net> noticed:
"Maybe these NET_XMIT_* values being passed around should be a set of
bits. They could be composed of base meanings, combined with specific
attributes.

So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT"

The attributes get masked out by the top-level ->enqueue() caller,
such that the base meanings are the only thing that make their
way up into the stack. If it's only about communication within the
qdisc tree, let's simply code it that way."

This patch is trying to realize these ideas.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04 22:31:03 -07:00
Nick Piggin
ca5de404ff fs: rename buffer trylock
Like the page lock change, this also requires name change, so convert the
raw test_and_set bitop to a trylock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 21:56:09 -07:00
Nick Piggin
529ae9aaa0 mm: rename page trylock
Converting page lock to new locking bitops requires a change of page flag
operation naming, so we might as well convert it to something nicer
(!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked).

This also facilitates lockdeping of page lock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 21:31:34 -07:00
Linus Torvalds
2e1e9212ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (29 commits)
  sh: enable maple_keyb in dreamcast_defconfig.
  SH2(A) cache update
  nommu: Provide vmalloc_exec().
  add addrespace definition for sh2a.
  sh: Kill off ARCH_SUPPORTS_AOUT and remnants of a.out support.
  sh: define GENERIC_HARDIRQS_NO__DO_IRQ.
  sh: define GENERIC_LOCKBREAK.
  sh: Save NUMA node data in vmcore for crash dumps.
  sh: module_alloc() should be using vmalloc_exec().
  sh: Fix up __bug_table handling in module loader.
  sh: Add documentation and integrate into docbook build.
  sh: Fix up broken kerneldoc comments.
  maple: Kill useless private_data pointer.
  maple: Clean up maple_driver_register/unregister routines.
  input: Clean up maple keyboard driver
  maple: allow removal and reinsertion of keyboard driver module
  sh: /proc/asids depends on MMU.
  arch/sh/boards/mach-se/7343/irq.c: removed duplicated #include
  arch/sh/boards/board-ap325rxa.c: removed duplicated #include
  sh/boards/Makefile typo fix
  ...
2008-08-04 17:26:15 -07:00
Roland McGrath
115a326c1e tracehook: kerneldoc fix
My last change to tracehook.h made it confuse the kerneldoc parser.
Move the #define's before the comment so it's happy again.

Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 17:23:43 -07:00
Linus Torvalds
c635fd3d3d Merge git://git.infradead.org/users/dwmw2/random-2.6
* git://git.infradead.org/users/dwmw2/random-2.6:
  drivers/video/console/promcon.c: fix build error
  Fix IHEX firmware generation/loading
2008-08-04 17:03:56 -07:00
Linus Torvalds
82248a5e92 Merge git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6:
  Add DIP switch readout for HFC-4S IOB4ST
  Fix remaining big endian issue of hfcmulti
  mISDN cleanup user interface
  mISDN fix main ISDN Makefile
2008-08-04 17:00:37 -07:00
Linus Torvalds
1a3f7d98e5 Revert "UFS: add const to parser token table"
This reverts commit f9247273cb (and
fb2e405fc1 - "fix fs/nfs/nfsroot.c
compilation" - that fixed a missed conversion).

The changes cause problems for at least the sparc build.  Let's re-do
them when the exact issues are resolved.

Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 16:50:38 -07:00
Emmanuel Grumbach
98f7dfd86c mac80211: pass dtim_period to low level driver
This patch adds the dtim_period in ieee80211_bss_conf, this allows the low
level driver to know the dtim_period, and to plan power save accordingly.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-04 15:09:07 -04:00
Paul Mundt
617870632d maple: Kill useless private_data pointer.
We can simply wrap in to the dev_set/get_drvdata(), there's no reason
to track an extra level of private data on top of the struct device.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-08-04 10:58:24 +09:00
Paul Mundt
63870295de maple: Clean up maple_driver_register/unregister routines.
These were completely inconsistent. Clean these up to take a maple_driver
pointer directly for consistency.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-08-04 10:39:46 +09:00
Alexander Beregalov
cf368d2f9a drivers/video/console/promcon.c: fix build error
drivers/video/console/promcon.c:158: error: implicit declaration of
function 'con_protect_unimap'

Introduced by commit a29ccf6f82
("embedded: fix vc_translate operator precedence").

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-03 09:51:30 +01:00
Marc Zyngier
85ebd00334 Fix IHEX firmware generation/loading
Fix both the IHEX firmware generation (len field always null, and EOF
marker a byte too short) and loading (struct ihex_binrec needs to be
packed to reflect the on-disk structure).

Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-02 18:36:10 +01:00
Karsten Keil
ff4cc1de24 mISDN cleanup user interface
The channelmap should have the same size on 32 and 64 bit systems
and should not depend on endianess.
Thanks to David Woodhouse for spotting this.

Signed-off-by: Karsten Keil <kkeil@suse.de>
2008-08-02 16:28:50 +02:00
Tim Bird
4744b43431 embedded: fix vc_translate operator precedence
This fixes a bug in operator precedence in the newly introduced vc_translate
macro.  Without this fix, the translation of some characters on the
kernel console is garbled.

This patch was copied to the e-mail list previously for testing.  Now,
all reports confirm that it works, so this is an official post for
application.

Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-01 22:23:09 +01:00
Adrian Hunter
feb2f55db4 [MTD] [OneNAND] Add defines for HF and sync write
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-08-01 22:06:15 +01:00
Linus Torvalds
84ff7a0012 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: s390: Fix kvm on IBM System z10
  KVM: Advertise synchronized mmu support to userspace
  KVM: Synchronize guest physical memory map to host virtual memory map
  KVM: Allow browsing memslots with mmu_lock
  KVM: Allow reading aliases with mmu_lock
2008-08-01 12:48:16 -07:00
Linus Torvalds
3a4b7886ee Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  pata_it821x: Driver updates and reworking
  libata.h: replace __FUNCTION__ with __func__
  ata_piix: subsys 106b:00a3 is apple ich8m too
  libata-core: make sure that ata_force_tbl is freed in case of an error
  libata: update atapi disable handling
  pata_via: add VX800 flag; add function for fixing h/w bugs
  pata_ali: misplaced pci_dev_put()
2008-08-01 12:41:29 -07:00
Linus Torvalds
b8a327be3f Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull: (64 commits)
  [XFS] Remove vn_revalidate calls in xfs.
  [XFS] Now that xfs_setattr is only used for attributes set from ->setattr
  [XFS] xfs_setattr currently doesn't just handle the attributes set through
  [XFS] fix use after free with external logs or real-time devices
  [XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a
  [XFS] fix compilation without CONFIG_PROC_FS
  [XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/
  [XFS] fix mount option parsing in remount
  [XFS] Disable queue flag test in barrier check.
  [XFS] streamline init/exit path
  [XFS] Fix up problem when CONFIG_XFS_POSIX_ACL is not set and yet we still
  [XFS] Don't assert if trying to mount with blocksize > pagesize
  [XFS] Don't update mtime on rename source
  [XFS] Allow xfs_bmbt_split() to fallback to the lowspace allocator
  [XFS] Restore the lowspace extent allocator algorithm
  [XFS] use minleft when allocating in xfs_bmbt_split()
  [XFS] attrmulti cleanup
  [XFS] Check for invalid flags in xfs_attrlist_by_handle.
  [XFS] Fix CI lookup in leaf-form directories
  [XFS] Use the generic xattr methods.
  ...
2008-08-01 12:39:09 -07:00