linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-10-31 04:31:19 +00:00

Author	SHA1	Message	Date
Felipe Balbi	ca6d1b1333	usb: musb: pass configuration specifics via pdata Use platform_data to pass musb configuration-specific details to the musb driver. This patch ensures that other platforms selecting HAVE_CLK and enabling musb won't break the tree build. The remaining parts will come when linux-omap merges more omap2/3 board-files. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-08-13 17:33:01 -07:00
Felipe Balbi	550a7375fe	USB: Add MUSB and TUSB support This patch adds support for MUSB and TUSB controllers integrated into omap2430 and davinci. It also adds support for external tusb6010 controller. Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-08-13 17:33:00 -07:00
Alan Stern	0282b7f2a8	usb-serial: don't release unregistered minors This patch (as1121) fixes a bug in the USB serial core. When a device is unregistered, the core will give back its minors -- even if the device hasn't been assigned any! The patch reserves the highest minor value (255) to mean that no minor was assigned. It also removes some dead code and does a small style fixup. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-08-13 17:32:50 -07:00
Alan Stern	f4f4d58734	USB: add missing kerneldoc line for "needs_binding" This patch (as1117) adds a kerneldoc line for the "needs_binding" field in struct usb_interface. It was accidentally omitted when the field was added. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-08-13 17:32:49 -07:00
David Howells	9e2b2dc413	CRED: Introduce credential access wrappers The patches that are intended to introduce copy-on-write credentials for 2.6.28 require abstraction of access to some fields of the task structure, particularly for the case of one task accessing another's credentials where RCU will have to be observed. Introduced here are trivial no-op versions of the desired accessors for current and other tasks so that other subsystems can start to be converted over more easily. Wrappers are introduced into a new header (linux/cred.h) for UID/GID, EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed user_struct. These wrappers are macros because the ordering between header files militates against making them inline functions. linux/cred.h is #included from linux/sched.h. Further, XFS is modified such that it no longer defines and uses parameterised versions of current_fs[ug]id(), thus getting rid of the namespace collision otherwise incurred. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-08-14 09:35:23 +10:00
Linus Torvalds	9ea319b616	Merge git://oss.sgi.com:8090/xfs/linux-2.6 * git://oss.sgi.com:8090/xfs/linux-2.6: (45 commits) [XFS] Fix use after free in xfs_log_done(). [XFS] Make xfs_bmap__count_leaves void. [XFS] Use KM_NOFS for debug trace buffers [XFS] use KM_MAYFAIL in xfs_mountfs [XFS] refactor xfs_mount_free [XFS] don't call xfs_freesb from xfs_unmountfs [XFS] xfs_unmountfs should return void [XFS] cleanup xfs_mountfs [XFS] move root inode IRELE into xfs_unmountfs [XFS] stop using file_update_time [XFS] optimize xfs_ichgtime [XFS] update timestamp in xfs_ialloc manually [XFS] remove the sema_t from XFS. [XFS] replace dquot flush semaphore with a completion [XFS] replace inode flush semaphore with a completion [XFS] extend completions to provide XFS object flush requirements [XFS] replace the XFS buf iodone semaphore with a completion [XFS] clean up stale references to semaphores [XFS] use get_unaligned_ helpers [XFS] Fix compile failure in xfs_buf_trace() ...	2008-08-13 15:17:49 -07:00
Tom Tucker	24b8b44780	svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet RDMA_READ completions are kept on a separate queue from the general I/O request queue. Since a separate lock is used to protect the RDMA_READ completion queue, a race exists between the dto_tasklet and the svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA bit and adds I/O to the read-completion queue. Concurrently, the recvfrom thread checks the generic queue, finds it empty and resets the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue the transport for I/O and cause the transport to "stall". The fix is to protect both lists with the same lock and set the XPT_DATA bit with this lock held. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-08-13 16:57:31 -04:00
David Chinner	39d2f1ab2a	[XFS] extend completions to provide XFS object flush requirements XFS object flushing doesn't quite match existing completion semantics. It mixes exclusive access with completion. That is, we need to mark an object as being flushed before flushing it to disk, and then block any other attempt to flush it until the completion occurs. We do this by adding an extra count to the completion before we start using them. However, we still need to determine if there is a completion in progress, and allow non-blocking attempts for completions to decrement the count. To do this we introduce: int try_wait_for_completion(struct completion *x) returns a failure status if done == 0, otherwise decrements done to zero and returns a "started" status. This is provided to allow counted completions to begin safely while holding object locks in inverted order. int completion_done(struct completion *x) returns 1 if there is no waiter, 0 if there is a waiter (i.e. a completion in progress). This replaces the use of semaphores for providing this exclusion and completion mechanism. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31816a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:40:43 +10:00
Bernhard Walle	31bad9246b	firmware/memmap: cleanup Various cleanups of drivers/firmware/memmap (after review by AKPM): - fix kdoc to conform to the standard - move kdoc from header to implementation files - remove superfluous WARN_ON() after kmalloc() - WARN_ON(x); if (!x) -> if(!WARN_ON(x)) - improve some comments Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:31 -07:00
Harvey Harrison	bc2aa80e18	byteorder: add include/linux/byteorder.h to define endian helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Harvey Harrison	40c9f22210	byteorder: add a new include/linux/swab.h to define byteswapping functions Collect the implementations from include/linux/byteorder/swab.h, swabb.h in swab.h The functionality provided covers: u16 swab16(u16 val) - return a byteswapped 16 bit value u32 swab32(u32 val) - return a byteswapped 32 bit value u64 swab64(u64 val) - return a byteswapped 64 bit value u32 swahw32(u32 val) - return a wordswapped 32 bit value u32 swahb32(u32 val) - return a high/low byteswapped 32 bit value Similar to above, but return swapped value from a naturally-aligned pointer u16 swab16p(u16 *p) u32 swab32p(u32 *p) u64 swab64p(u64 *p) u32 swahw32p(u32 *p) u32 swahb32p(u32 *p) Similar to above, but swap the value in-place (in-situ) void swab16s(u16 *p) void swab32s(u32 *p) void swab64s(u64 *p) void swahw32s(u32 *p) void swahb32s(u32 *p) Arches can override any of these with an optimized version by defining an inline in their asm/byteorder.h (example given for swab16()): u16 __arch_swab16() {} #define __arch_swab16 __arch_swab16 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Alexey Dobriyan	50ac2d694f	seq_file: add seq_cpumask(), seq_nodemask() Short enough reads from /proc/irq/*/smp_affinity return -EINVAL for no good reason. This became noticed with NR_CPUS=4096 patches, when length of printed representation of cpumask became 1152, but cat(1) continued to read with 1024-byte chunks. bitmap_scnprintf() in good faith fills buffer, returns 1023, check returns -EINVAL. Fix it by switching to seq_file, so handler will just fill buffer and doesn't care about offsets, length, filling EOF and all this crap. For that add seq_bitmap(), and wrappers around it -- seq_cpumask() and seq_nodemask(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Paul Jackson <pj@sgi.com> Cc: Mike Travis <travis@sgi.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Uwe Kleine-König	070cb06593	move kernel-doc comment for might_sleep directly before its defining block Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:29 -07:00
Jean Delvare	1054635532	matrox maven: convert to a new-style i2c driver The legacy i2c model is going away soon, so switch to the new model. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:29 -07:00
Jan Beulich	74768ed833	page allocator: use no-panic variant of alloc_bootmem() in alloc_large_system_hash() .. since a failed allocation is being (initially) handled gracefully, and panic()-ed upon failure explicitly in the function if retries with smaller sizes failed. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:27 -07:00
Linus Torvalds	1c89ac5501	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: fix spinlock recursion in hvc_console stop_machine: remove unused variable modules: extend initcall_debug functionality to the module loader export virtio_rng.h lguest: use get_user_pages_fast() instead of get_user_pages() mm: Make generic weak get_user_pages_fast and EXPORT_GPL it lguest: don't set MAC address for guest unless specified	2008-08-12 08:40:19 -07:00
Linus Torvalds	88fa08f67b	Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp: fix SIS 5591/5592 wrong PCI id intel/agp: rewrite GTT on resume agp: use dev_printk when possible amd64-agp: run fallback when no bridges found, not when driver registration fails intel_agp: official name for GM45 chipset	2008-08-12 08:28:32 -07:00
Arjan van de Ven	59f9415ffb	modules: extend initcall_debug functionality to the module loader The kernel has this really nice facility where if you put "initcall_debug" on the kernel commandline, it'll print which function it's going to execute just before calling an initcall, and then after the call completes it will 1) print if it had an error code 2) checks for a few simple bugs (like leaving irqs off) and 3) print how long the init call took in milliseconds. While trying to optimize the boot speed of my laptop, I have been loving number 3 to figure out what to optimize... ... and then I wished that the same thing was done for module loading. This patch makes the module loader use this exact same functionality; it's a logical extension in my view (since modules are just sort of late binding initcalls anyway) and so far I've found it quite useful in finding where things are too slow in my boot. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-08-12 17:52:54 +10:00
Christian Borntraeger	4bceba417a	export virtio_rng.h Hello Rusty, The entropy device was added after we exported all virtio headers. This patch adds virtio_rng.h to the exportable userspace headers. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-08-12 17:52:54 +10:00
Rusty Russell	912985dce4	mm: Make generic weak get_user_pages_fast and EXPORT_GPL it Out of line get_user_pages_fast fallback implementation, make it a weak symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST. Export the symbol to modules so lguest can use it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-08-12 17:52:53 +10:00
Gerrit Renker	987c402ac3	skbuff: Code readability NiT Inserting a space between the `-' improved the C readability (some languages allow hyphens within functions and variable names, which is confusing). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-11 18:17:17 -07:00
Keith Packard	a8c84df9f7	intel/agp: rewrite GTT on resume On my Intel chipset (965GM), the GTT is entirely erased across suspend/resume. This patch simply re-plays the current mapping at resume time to restore the table. I noticed this once I started relying on persistent GTT mappings across VT switch in our GEM work -- the old X server and DRM code carefully unbind all memory from the GTT on VT switch, but GEM does not bother. I placed the list management and rewrite code in the generic layer on the assumption that it will be needed on other hardware, but I did not add the rewrite call to anything other than the Intel resume function. Keep a list of current GATT mappings. At resume time, rewrite them into the GATT. This is needed on Intel (at least) as the entire GATT is cleared across suspend/resume. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Keith Packard <keithp@keithp.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-08-12 10:13:38 +10:00
Linus Torvalds	1ea2950884	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched, cpu hotplug: fix set_cpus_allowed() use in hotplug callbacks sched: fix mysql+oltp regression sched_clock: delay using sched_clock() sched clock: couple local and remote clocks sched clock: simplify __update_sched_clock() sched: eliminate scd->prev_raw sched clock: clean up sched_clock_cpu() sched clock: revert various sched_clock() changes sched: move sched_clock before first use sched: test runtime rather than period in global_rt_runtime() sched: fix SCHED_HRTICK dependency sched: fix warning in hrtick_start_fair()	2008-08-11 16:46:31 -07:00
Linus Torvalds	9b4d0bab32	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: fix debug_lock_alloc lockdep: increase MAX_LOCKDEP_KEYS generic-ipi: fix stack and rcu interaction bug in smp_call_function_mask() lockdep: fix overflow in the hlock shrinkage code lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]() lockdep: handle chains involving classes defined in modules mm: fix mm_take_all_locks() locking order lockdep: annotate mm_take_all_locks() lockdep: spin_lock_nest_lock() lockdep: lock protection locks lockdep: map_acquire lockdep: shrink held_lock structure lockdep: re-annotate scheduler runqueues lockdep: lock_set_subclass - reset a held lock's subclass lockdep: change scheduler annotation debug_locks: set oops_in_progress if we will log messages. lockdep: fix combinatorial explosion in lock subgraph traversal	2008-08-11 16:45:46 -07:00
Ingo Molnar	23a0ee908c	Merge branch 'core/locking' into core/urgent	2008-08-12 00:11:49 +02:00
Linus Torvalds	10fec20ef5	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: tc6393 cleanup and update mfd: have TMIO drivers and subdevices depend on ARM mfd: TMIO MMC driver mfd: driver for the TMIO NAND controller mfd: t7l66 MMC platform data mfd: tc6387 MMC platform data mfd: Fix 7l66 and 6387 according to the new mfd-core API mfd: Fix tc6393 according to the new tmio.h mfd: driver for the TC6387XB TMIO controller. mfd: driver for the T7L66XB TMIO SoC mfd: TMIO MMC structures and accessors.	2008-08-11 10:44:43 -07:00
Linus Torvalds	e2205a156f	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: Remove include/linux/harrier_defs.h powerpc: Do not ignore arch/powerpc/include powerpc: Delete completed "ppc removal" task from feature removal file powerpc/mm: Fix attribute confusion with htab_bolt_mapping() powerpc/pci: Don't keep ISA memory hole resources in the tree powerpc: Zero fill the return values of rtas argument buffer powerpc/4xx: Update defconfig files for 2.6.27-rc1 powerpc/44x: Incorrect NOR offset in Warp DTS powerpc/44x: Warp DTS changes for board updates powerpc/4xx: Cleanup Warp for i2c driver changes. powerpc/44x: Adjust warp-nand resource end address	2008-08-11 10:40:28 -07:00
Linus Torvalds	a7ef6a40f7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: Limit VPD length for Broadcom 5708S PCI PM: Export pci_pme_active to drivers PCI: remove duplicate symbol from pci_ids.h PCI: check the return value of device_create_bin_file() in pci_create_bus() PCI: fully restore MSI state at resume time DMA: make dma-coherent.c documentation kdoc-friendly PCI: make pci_register_driver() a macro PCI: add Broadcom 5708S to VPD length quirk	2008-08-11 10:38:36 -07:00
Ingo Molnar	e5f363e358	lockdep: increase MAX_LOCKDEP_KEYS certain configs produce: [ 70.076229] BUG: MAX_LOCKDEP_KEYS too low! [ 70.080230] turning off the locking correctness validator. tune them up. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 15:25:07 +02:00
Ingo Molnar	ced9cd40ac	printk: robustify printk, fix fix: include/linux/kernel.h: In function ‘printk_needs_cpu’: include/linux/kernel.h:217: error: parameter name omitted Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 15:04:19 +02:00
Peter Zijlstra	b845b517b5	printk: robustify printk Avoid deadlocks against rq->lock and xtime_lock by deferring the klogd wakeup by polling from the timer tick. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 13:46:53 +02:00
Paul E. McKenney	67182ae1c4	rcu, debug: detect stalled grace periods this is a diagnostic patch for Classic RCU. The approach is to record a timestamp at the beginning of the grace period (in rcu_start_batch()), then have rcu_check_callbacks() complain if: 1. it is running on a CPU that has been holding up grace periods for a long time (say one second). This will identify the culprit assuming that the culprit has not disabled hardware irqs, instruction execution, or some such. 2. it is running on a CPU that is not holding up grace periods, but grace periods have been held up for an even longer time (say two seconds). It is enabled via the default-off CONFIG_DEBUG_RCU_STALL kernel parameter. Rather than exponential backoff, it backs off to once per 30 seconds. My feeling upon thinking on it was that if you have stalled RCU grace periods for that long, a few extra printk() messages are probably the least of your worries... Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: David Witbrodt <dawitbro@sbcglobal.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 13:35:18 +02:00
Ingo Molnar	c4c0c56a7a	Merge branch 'linus' into core/rcu	2008-08-11 13:27:47 +02:00
Paul Mackerras	13fa00a878	powerpc: Remove include/linux/harrier_defs.h It was only used by code in arch/ppc, and arch/ppc is gone, so remove the unused harrier_defs.h as well. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-08-11 21:00:12 +10:00
Peter Zijlstra	b42e737e57	lockdep: fix overflow in the hlock shrinkage code There is an overflow-by-1 case in the new shrunken hlock code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 12:34:42 +02:00
Ingo Molnar	3295f0ef9f	lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]() the names were too generic: drivers/uio/uio.c:87: error: expected identifier or '(' before 'do' drivers/uio/uio.c:87: error: expected identifier or '(' before 'while' drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 10:30:30 +02:00
Peter Zijlstra	b7d39aff91	lockdep: spin_lock_nest_lock() Expose the new lock protection lock. This can be used to annotate places where we take multiple locks of the same class and avoid deadlocks by always taking another (top-level) lock first. NOTE: we're still bound to the MAX_LOCK_DEPTH (48) limit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 09:30:24 +02:00
Peter Zijlstra	7531e2f34d	lockdep: lock protection locks On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote: > On Fri, 1 Aug 2008, David Miller wrote: > > > > Taking more than a few locks of the same class at once is bad > > news and it's better to find an alternative method. > > It's not always wrong. > > If you can guarantee that anybody that takes more than one lock of a > particular class will always take a single top-level lock _first_, then > that's all good. You can obviously screw up and take the same lock _twice_ > (which will deadlock), but at least you cannot get into ABBA situations. > > So maybe the right thing to do is to just teach lockdep about "lock > protection locks". That would have solved the multi-queue issues for > networking too - all the actual network drivers would still have taken > just their single queue lock, but the one case that needs to take all of > them would have taken a separate top-level lock first. > > Never mind that the multi-queue locks were always taken in the same order: > it's never wrong to just have some top-level serialization, and anybody > who needs to take <n> locks might as well do <n+1>, because they sure as > hell aren't going to be on _any_ fastpaths. > > So the simplest solution really sounds like just teaching lockdep about > that one special case. It's not "nesting" exactly, although it's obviously > related to it. Do as Linus suggested. The lock protection lock is called nest_lock. Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so anything that spills that is still up shit creek. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 09:30:24 +02:00
Peter Zijlstra	4f3e7524b2	lockdep: map_acquire Most of the free-standing lock_acquire() usages look remarkably similar, sweep them into a new helper. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 09:30:23 +02:00
Dave Jones	f82b217e35	lockdep: shrink held_lock structure struct held_lock { u64 prev_chain_key; /* 0 8 */ struct lock_class *class; /* 8 8 */ long unsigned int acquire_ip; /* 16 8 */ struct lockdep_map *instance; /* 24 8 */ int irq_context; /* 32 4 */ int trylock; /* 36 4 */ int read; /* 40 4 */ int check; /* 44 4 */ int hardirqs_off; /* 48 4 */ /* size: 56, cachelines: 1 */ /* padding: 4 */ /* last cacheline: 56 bytes */ }; struct held_lock { u64 prev_chain_key; /* 0 8 */ long unsigned int acquire_ip; /* 8 8 */ struct lockdep_map *instance; /* 16 8 */ unsigned int class_idx:11; /* 24:21 4 */ unsigned int irq_context:2; /* 24:19 4 */ unsigned int trylock:1; /* 24:18 4 */ unsigned int read:2; /* 24:16 4 */ unsigned int check:2; /* 24:14 4 */ unsigned int hardirqs_off:1; /* 24:13 4 */ /* size: 32, cachelines: 1 */ /* padding: 4 */ /* bit_padding: 13 bits */ /* last cacheline: 32 bytes */ }; [mingo@elte.hu: shrunk hlock->class too] [peterz@infradead.org: fixup bit sizes] Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2008-08-11 09:30:23 +02:00
Peter Zijlstra	64aa348edc	lockdep: lock_set_subclass - reset a held lock's subclass this can be used to reset a held lock's subclass, for arbitrary-depth iterated data structures such as trees or lists which have per-node locks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 09:30:21 +02:00
Ingo Molnar	cf206bffbb	Merge branch 'linus' into sched/clock	2008-08-11 08:59:21 +02:00
Peter Zijlstra	c1955a3d47	sched_clock: delay using sched_clock() Some arch's can't handle sched_clock() being called too early - delay this until sched_clock_init() has been called. Reported-by: Bill Gatliff <bgat@billgatliff.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Nishanth Aravamudan <nacc@us.ibm.com> CC: Russell King - ARM Linux <linux@arm.linux.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 08:59:03 +02:00
Ian Molton	25d6cbd840	mfd: tc6393 cleanup and update This patchset cleans up the TC6393XB support. * Add provision for the MMC subdevice * Disable / enable clocks on suspend / resume * Remove fragments of badly merged code (eg. linux/fb include etc.) * Use a device specific clock name to break dependency on ARM/PXA2XX * Drop unnecessary resource names * Switch to tmio_io* accessors Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-08-10 23:32:07 +02:00
Ian Molton	cbdfb42639	mfd: driver for the TC6387XB TMIO controller. This patch adds support for the TC6387XB. Unlike other TMIO devices this one has only one subdevice and no interrupt mux, however using the MFD framework allows it to share the TMIO MMC driver. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-08-10 20:09:55 +02:00
Ian Molton	1f192015ca	mfd: driver for the T7L66XB TMIO SoC This patchset provides support for the core functionality of the T7L66XB SoC from Toshiba. Supported in this patchset is the IRQ MUX, MMC controller and NAND flash controller. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-08-10 20:09:50 +02:00
Ian Molton	d3a2f71853	mfd: TMIO MMC structures and accessors. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-08-10 20:09:43 +02:00
Linus Torvalds	4fbb71597a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: SLUB: dynamic per-cache MIN_PARTIAL mm: unexport ksize	2008-08-09 16:21:33 -07:00
Randy Dunlap	6724cce8fb	list.h: fix fatal kernel-doc error Fix fatal multi-line kernel-doc error in list.h: function short description must be on one line. Error(linux-2.6.27-rc2-git3//include/linux/list.h:318): duplicate section name 'Description' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-08 16:17:16 -07:00
Linus Torvalds	49b75b87ce	Merge branch 'for-linus-merged' of master.kernel.org:/home/rmk/linux-2.6-arm * 'for-linus-merged' of master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 5177/1: arm/mach-sa1100/Makefile: remove CONFIG_SA1100_USB [ARM] 5166/1: magician: add MAINTAINERS entry [ARM] fix pnx4008 build errors [ARM] Fix SMP booting with non-zero PHYS_OFFSET [ARM] 5185/1: Fix spi num_chipselect for lubbock [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach [ARM] Add support for arch/arm/mach-*/include and arch/arm/plat-*/include [ARM] Remove asm/hardware.h, use asm/arch/hardware.h instead [ARM] Eliminate useless includes of asm/mach-types.h [ARM] Fix circular include dependency with IRQ headers avr32: Use <mach/foo.h> instead of <asm/arch/foo.h> avr32: Introduce arch/avr32/mach-*/include/mach avr32: Move include/asm-avr32 to arch/avr32/include/asm [ARM] sa1100_wdt: use reset_status to remember watchdog reset status [ARM] pxa: introduce reset_status and clear_reset_status for driver's usage [ARM] pxa: introduce reset.h for reset specific header information	2008-08-08 11:38:42 -07:00
Russell King	097d9eb537	Merge Linus' latest into master Conflicts: drivers/watchdog/at91rm9200_wdt.c drivers/watchdog/davinci_wdt.c drivers/watchdog/ep93xx_wdt.c drivers/watchdog/ixp2000_wdt.c drivers/watchdog/ixp4xx_wdt.c drivers/watchdog/ks8695_wdt.c drivers/watchdog/omap_wdt.c drivers/watchdog/pnx4008_wdt.c drivers/watchdog/sa1100_wdt.c drivers/watchdog/wdt285.c	2008-08-08 19:18:18 +01:00
Linus Torvalds	f2d7499be1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (99 commits) pkt_sched: Fix actions referencing bnx2x: fix logical op tcp: (whitespace only) fix confusing indentation pkt_sched: Fix qdisc config when link is down. [Bluetooth] Add full quirk implementation for btusb driver [Bluetooth] Removal of unnecessary ignore module parameter [Bluetooth] Add parameters to control BNEP header compression ath9k: Revamp wireless mode usage ath9k: More unused macros ath9k: Remove a few unused macros and fix indentation ath9k: Use mac80211's band macros and remove enum hal_freq_band ath9k: Remove redundant data structure ath9k_txq_info ath9k: Cleanup data structures related to HW capabilities ath9k: work around gcc ICEs ath9k: Add new Atheros IEEE 802.11n driver ath5k: remove Atheros 11n devices from supported list list.h: add list_cut_position() list.h: Add list_splice_tail() and list_splice_tail_init() p54: swap short slot time dcf values rt2x00: Block all unsupported modes ...	2008-08-08 11:15:23 -07:00
Russell King	2727f226a6	[ARM] fix pnx4008 build errors include/linux/i2c-pnx.h was missed when moving the include files. Fix it now; it doesn't really need to include mach/i2c.h at all. Successfully build tested with pnx4008_defconfig, which had failed in linux-next. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-08-08 15:13:27 +01:00
Linus Torvalds	aeee90dfa0	Merge branch 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace * 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace: tracehook: fix CLONE_PTRACE	2008-08-07 18:14:24 -07:00
Linus Torvalds	273b257839	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mad: Test ib_create_send_mad() return with IS_ERR(), not == NULL IB/mlx4: Allow 4K messages for UD QPs mlx4_core: Add ethernet fields to CQE struct IB/ipath: Fix printk format warnings RDMA/cxgb3: Fix deadlock initializing iw_cxgb3 device RDMA/cxgb3: Fix up MW access rights RDMA/cxgb3: Fix QP capabilities RDMA/cma: Remove padding arrays by using struct sockaddr_storage IB/ipath: Use unsigned long for irq flags IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr()	2008-08-07 18:14:07 -07:00
Roland McGrath	5861bbfcc1	tracehook: fix CLONE_PTRACE In the change in commit `09a05394fe`, I overlooked two nits in the logic and this broke using CLONE_PTRACE when PTRACE_O_TRACE* are not being used. A parent that is itself traced at all but not using PTRACE_O_TRACE*, using CLONE_PTRACE would have its new child fail to be traced. A parent that is not itself traced at all that uses CLONE_PTRACE (which should be a no-op in this case) would confuse the bookkeeping and lead to a crash at exit time. This restores the missing checks and fixes both failure modes. Reported-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Roland McGrath <roland@redhat.com>	2008-08-07 17:18:47 -07:00
Rafael J. Wysocki	5a6c9b60b4	PCI PM: Export pci_pme_active to drivers Export pci_pme_active() to drivers, so that they can clear the PME_status bit and disable PME# for their devices without involving ACPI. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-08-07 15:33:36 -07:00
akpm@linux-foundation.org	7bed523a95	PCI: remove duplicate symbol from pci_ids.h pci.ids.h: remove a duplicated symbol Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Grant Coady <gcoady.lk@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-08-07 09:49:57 -07:00
Andrew Morton	bba8116586	PCI: make pci_register_driver() a macro alpha: CC [M] drivers/usb/gadget/u_ether.o In file included from include/asm/dma-mapping.h:7, from include/linux/dma-mapping.h:52, from include/linux/dmaengine.h:29, from include/linux/skbuff.h:29, from include/linux/if_ether.h:114, from include/linux/etherdevice.h:27, from drivers/usb/gadget/u_ether.c:29: include/linux/pci.h: In function 'pci_register_driver': include/linux/pci.h:673: error: 'KBUILD_MODNAME' undeclared (first use in this function) include/linux/pci.h:673: error: (Each undeclared identifier is reported only once include/linux/pci.h:673: error: for each function it appears in.) Sam says: The problem is that u_ether.o is used by two modules so when we build it KBUILD_MODNAME is not defined because kbuild does not know what value to use. And in pci.h we have the following inline: static inline int __must_check pci_register_driver(struct pci_driver *driver) { return __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME); } And alpha uses dma-mapping.h to nullify a number of functions that seem to require something from pci.h. Making it a macro fixes this particular problem. However, the underlying issue of a file using KBUILD_MODNAME and being shared between multiple modules is *not* addressed. I guess the answer there is "don't do that". Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-08-07 06:52:01 -07:00
Luis R. Rodriguez	00e8a4da8c	list.h: add list_cut_position() This adds list_cut_position() which lets you cut a list into two lists given a pivot in the list. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-08-07 09:49:42 -04:00
Luis R. Rodriguez	7d283aee50	list.h: Add list_splice_tail() and list_splice_tail_init() If you are using linked lists for queues list_splice() will not do what you would expect even if you use the elements passed reversed. We need to handle these differently. We add list_splice_tail() and list_splice_tail_init(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-08-07 09:49:42 -04:00
Laurent Pinchart	fe41424855	dm9000: Support MAC address setting through platform data. The dm9000 driver reads the chip's MAC address from the attached EEPROM. When no EEPROM is present, or when the MAC address is invalid, it falls back to reading the address from the chip. This patch lets platform code set the desired MAC address through platform data. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-08-07 02:22:54 -04:00
Brandon Philips	b11f8d8cc3	ethtool: Expand ethtool_cmd.speed to 32 bits Introduce the speed_hi field to ethtool_cmd, using the reserved space, to expand the speed field to 2^32 Megabits/second. Making this field expansion now gives us plenty of time to fix up the user-space pieces that use SIOCETHTOOL before hardware faster than 64 Gb/s is available. Signed-off-by: Brandon Philips <bphilips@suse.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-08-07 02:22:08 -04:00
Yevgeny Petrilin	f780a9f119	mlx4_core: Add ethernet fields to CQE struct Add ethernet-related fields to struct mlx4_cqe so that the mlx4_en ethernet NIC driver can share the same definition. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-06 20:14:06 -07:00
Linus Torvalds	e63e03273b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (78 commits) AX.25: Fix sysctl registration if !CONFIG_AX25_DAMA_SLAVE pktgen: mac count pktgen: random flow bridge: Eliminate unnecessary forward delay bridge: fix compile warning in net/bridge/br_netfilter.c ipv4: remove unused field in struct flowi (include/net/flow.h). tg3: Fix 'scheduling while atomic' errors net: Kill plain NET_XMIT_BYPASS. net_sched: Add qdisc __NET_XMIT_BYPASS flag net_sched: Add qdisc __NET_XMIT_STOLEN flag iwl3945: fix merge mistake for packet injection iwlwifi: grap nic access before accessing periphery registers iwlwifi: decrement rx skb counter in scan abort handler iwlwifi: fix unhandled interrupt when HW rfkill is on iwlwifi: implement iwl5000_calc_rssi iwlwifi: memory allocation optimization iwlwifi: HW bug fixes p54: Fix potential concurrent access to private data rt2x00: Disable link tuning in rt2500usb iwlwifi: Don't use buffer allocated on the stack for led names ...	2008-08-05 19:37:42 -07:00
Richard Hughes	bf1db69fbf	pm_qos: spelling fixes A documentation cleanup patch. With a minor tweak to clarify units for kbs. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: mark gross <mgross@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:50 -07:00
Mark Asselstine	f6ac436dcc	Remove the deprecated cli() sti() functions These functions have been deprecated for some time now but remained until all legacy callers could be removed. With a few commits in 2.6.26 this has happened so now we can remove these deprecated functions. Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:48 -07:00
Shadi Ammouri	60cadec9da	spi: new orion_spi driver This adds an SPI driver for the SPI controller found in various Marvell Orion ARM SoCs. It currently supports only one slave, which must use SPI mode 0. [dbrownell@users.sourceforge.net: cleanups, meet specs, pass "sparse"] Signed-off-by: Shadi Ammouri <shadi@marvell.com> Signed-off-by: Saeed Bishara <saeed@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:46 -07:00
Bernhard Walle	c6e2bee26e	kdump: report actual value of VMCOREINFO_OSRELEASE in VMCOREINFO The current implementation reports the structure name as VMCOREINFO_OSRELEASE in VMCOREINFO, e.g. VMCOREINFO_OSRELEASE=init_uts_ns.name.release That doesn't make sense because it's always the same. Instead, use the value, e.g. VMCOREINFO_OSRELEASE=2.6.26-rc3 That's also what the 'makedumpfile -g' does. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: "Ken'ichi Ohmichi" <oomichi@mxs.nes.nec.co.jp> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:46 -07:00
Adrian Bunk	c5bfc3757f	ide: remove CONFIG_IDE_MAX_HWIFS The benefits of a user settable CONFIG_IDE_MAX_HWIFS have become pretty tiny and are no longer considered worth the trouble of an own option. Simply always #define MAX_HWIFS to 10. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-08-05 18:17:01 +02:00
Bartlomiej Zolnierkiewicz	39b986a6c7	ide: sanitize struct ide_port_ops documentation (take 2) v2: Add missing '@'-s. (Noticed by Randy Dunlap) Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-08-05 18:16:57 +02:00
David S. Miller	33e334950a	Merge branch 'no-ath9k' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2008-08-05 01:28:35 -07:00
Pekka Enberg	5595cffc82	SLUB: dynamic per-cache MIN_PARTIAL This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial value that is calculated from object size. The bigger the object size, the more pages we keep on the partial list. I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example script of the fio benchmarking tool. The script stresses the networking subsystem which should also give a fairly good beating of kmalloc() et al. To run the test yourself, first clone the fio repository: git clone git://git.kernel.dk/fio.git and then run the following command n times on your machine: time ./fio examples/netio The results on my 2-way 64-bit x86 machine are as follows: [ the minimum, maximum, and average are captured from 50 individual runs ] real time (seconds) min max avg sd SLAB 22.76 23.38 22.98 0.17 SLUB 22.80 25.78 23.46 0.72 SLUB (dynamic) 22.74 23.54 23.00 0.20 sys time (seconds) min max avg sd SLAB 6.90 8.28 7.70 0.28 SLUB 7.42 16.95 8.89 2.28 SLUB (dynamic) 7.17 8.64 7.73 0.29 user time (seconds) min max avg sd SLAB 36.89 38.11 37.50 0.29 SLUB 30.85 37.99 37.06 1.67 SLUB (dynamic) 36.75 38.07 37.59 0.32 As you can see from the above numbers, this patch brings SLUB to the same level as SLAB for this particular workload fixing a ~2% regression. I'd expect this change to help similar workloads that allocate a lot of objects that are close to the size of a page. Cc: Matthew Wilcox <matthew@wil.cx> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-08-05 09:28:47 +03:00
David S. Miller	cc6533e98a	net: Kill plain NET_XMIT_BYPASS. dst_input() was doing something completely absurd, looping on skb->dst->input() if NET_XMIT_BYPASS was seen, but these functions never return such an error. And as a result plain ole' NET_XMIT_BYPASS has no more references and can be completely killed off. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-04 23:04:08 -07:00
Jarek Poplawski	378a2f090f	net_sched: Add qdisc __NET_XMIT_STOLEN flag Patrick McHardy <kaber@trash.net> noticed: "The other problem that affects all qdiscs supporting actions is TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS even though the packet is not queued, corrupting upper qdiscs' qlen counters." and later explained: "The reason why it translates it at all seems to be to not increase the drops counter. Within a single qdisc this could be avoided by other means easily, upper qdiscs would still increase the counter when we return anything besides NET_XMIT_SUCCESS though. This means we need a new NET_XMIT return value to indicate this to the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN, return that to upper qdiscs and translate it to NET_XMIT_SUCCESS in dev_queue_xmit, similar to NET_XMIT_BYPASS." David Miller <davem@davemloft.net> noticed: "Maybe these NET_XMIT_* values being passed around should be a set of bits. They could be composed of base meanings, combined with specific attributes. So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT" The attributes get masked out by the top-level ->enqueue() caller, such that the base meanings are the only thing that make their way up into the stack. If it's only about communication within the qdisc tree, let's simply code it that way." This patch is trying to realize these ideas. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-04 22:31:03 -07:00
Nick Piggin	ca5de404ff	fs: rename buffer trylock Like the page lock change, this also requires name change, so convert the raw test_and_set bitop to a trylock. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 21:56:09 -07:00
Nick Piggin	529ae9aaa0	mm: rename page trylock Converting page lock to new locking bitops requires a change of page flag operation naming, so we might as well convert it to something nicer (!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked). This also facilitates lockdeping of page lock. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 21:31:34 -07:00
Linus Torvalds	2e1e9212ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (29 commits) sh: enable maple_keyb in dreamcast_defconfig. SH2(A) cache update nommu: Provide vmalloc_exec(). add addrespace definition for sh2a. sh: Kill off ARCH_SUPPORTS_AOUT and remnants of a.out support. sh: define GENERIC_HARDIRQS_NO__DO_IRQ. sh: define GENERIC_LOCKBREAK. sh: Save NUMA node data in vmcore for crash dumps. sh: module_alloc() should be using vmalloc_exec(). sh: Fix up __bug_table handling in module loader. sh: Add documentation and integrate into docbook build. sh: Fix up broken kerneldoc comments. maple: Kill useless private_data pointer. maple: Clean up maple_driver_register/unregister routines. input: Clean up maple keyboard driver maple: allow removal and reinsertion of keyboard driver module sh: /proc/asids depends on MMU. arch/sh/boards/mach-se/7343/irq.c: removed duplicated #include arch/sh/boards/board-ap325rxa.c: removed duplicated #include sh/boards/Makefile typo fix ...	2008-08-04 17:26:15 -07:00
Roland McGrath	115a326c1e	tracehook: kerneldoc fix My last change to tracehook.h made it confuse the kerneldoc parser. Move the #define's before the comment so it's happy again. Signed-off-by: Roland McGrath <roland@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 17:23:43 -07:00
Linus Torvalds	c635fd3d3d	Merge git://git.infradead.org/users/dwmw2/random-2.6 * git://git.infradead.org/users/dwmw2/random-2.6: drivers/video/console/promcon.c: fix build error Fix IHEX firmware generation/loading	2008-08-04 17:03:56 -07:00
Linus Torvalds	82248a5e92	Merge git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: Add DIP switch readout for HFC-4S IOB4ST Fix remaining big endian issue of hfcmulti mISDN cleanup user interface mISDN fix main ISDN Makefile	2008-08-04 17:00:37 -07:00
Linus Torvalds	1a3f7d98e5	Revert "UFS: add const to parser token table" This reverts commit `f9247273cb` (and `fb2e405fc1` - "fix fs/nfs/nfsroot.c compilation" - that fixed a missed conversion). The changes cause problems for at least the sparc build. Let's re-do them when the exact issues are resolved. Requested-by: Andrew Morton <akpm@linux-foundation.org> Requested-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 16:50:38 -07:00
Emmanuel Grumbach	98f7dfd86c	mac80211: pass dtim_period to low level driver This patch adds the dtim_period in ieee80211_bss_conf, this allows the low level driver to know the dtim_period, and to plan power save accordingly. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-08-04 15:09:07 -04:00
Paul Mundt	617870632d	maple: Kill useless private_data pointer. We can simply wrap in to the dev_set/get_drvdata(), there's no reason to track an extra level of private data on top of the struct device. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-08-04 10:58:24 +09:00
Paul Mundt	63870295de	maple: Clean up maple_driver_register/unregister routines. These were completely inconsistent. Clean these up to take a maple_driver pointer directly for consistency. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-08-04 10:39:46 +09:00
Alexander Beregalov	cf368d2f9a	drivers/video/console/promcon.c: fix build error drivers/video/console/promcon.c:158: error: implicit declaration of function 'con_protect_unimap' Introduced by commit `a29ccf6f82` ("embedded: fix vc_translate operator precedence"). Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Cc: Tim Bird <tim.bird@am.sony.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-08-03 09:51:30 +01:00
Marc Zyngier	85ebd00334	Fix IHEX firmware generation/loading Fix both the IHEX firmware generation (len field always null, and EOF marker a byte too short) and loading (struct ihex_binrec needs to be packed to reflect the on-disk structure). Signed-off-by: Marc Zyngier <maz@misterjones.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-08-02 18:36:10 +01:00
Karsten Keil	ff4cc1de24	mISDN cleanup user interface The channelmap should have the same size on 32 and 64 bit systems and should not depend on endianness. Thanks to David Woodhouse for spotting this. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-08-02 16:28:50 +02:00
Tim Bird	4744b43431	embedded: fix vc_translate operator precedence This fixes a bug in operator precedence in the newly introduced vc_translate macro. Without this fix, the translation of some characters on the kernel console is garbled. This patch was copied to the e-mail list previously for testing. Now, all reports confirm that it works, so this is an official post for application. Signed-off-by: Tim Bird <tim.bird@am.sony.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-08-01 22:23:09 +01:00
Linus Torvalds	84ff7a0012	Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: s390: Fix kvm on IBM System z10 KVM: Advertise synchronized mmu support to userspace KVM: Synchronize guest physical memory map to host virtual memory map KVM: Allow browsing memslots with mmu_lock KVM: Allow reading aliases with mmu_lock	2008-08-01 12:48:16 -07:00
Linus Torvalds	3a4b7886ee	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: pata_it821x: Driver updates and reworking libata.h: replace __FUNCTION__ with __func__ ata_piix: subsys 106b:00a3 is apple ich8m too libata-core: make sure that ata_force_tbl is freed in case of an error libata: update atapi disable handling pata_via: add VX800 flag; add function for fixing h/w bugs pata_ali: misplaced pci_dev_put()	2008-08-01 12:41:29 -07:00
Linus Torvalds	b8a327be3f	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull: (64 commits) [XFS] Remove vn_revalidate calls in xfs. [XFS] Now that xfs_setattr is only used for attributes set from ->setattr [XFS] xfs_setattr currently doesn't just handle the attributes set through [XFS] fix use after free with external logs or real-time devices [XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a [XFS] fix compilation without CONFIG_PROC_FS [XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/ [XFS] fix mount option parsing in remount [XFS] Disable queue flag test in barrier check. [XFS] streamline init/exit path [XFS] Fix up problem when CONFIG_XFS_POSIX_ACL is not set and yet we still [XFS] Don't assert if trying to mount with blocksize > pagesize [XFS] Don't update mtime on rename source [XFS] Allow xfs_bmbt_split() to fallback to the lowspace allocator [XFS] Restore the lowspace extent allocator algorithm [XFS] use minleft when allocating in xfs_bmbt_split() [XFS] attrmulti cleanup [XFS] Check for invalid flags in xfs_attrlist_by_handle. [XFS] Fix CI lookup in leaf-form directories [XFS] Use the generic xattr methods. ...	2008-08-01 12:39:09 -07:00
Roland McGrath	5c7edcd7ee	tracehook: fix exit_signal=0 case My commit `2b2a1ff64a` introduced a regression (sorry about that) for the odd case of exit_signal=0 (e.g. clone_flags=0). This is not a normal use, but it's used by a case in the glibc test suite. Dying with exit_signal=0 sends no signal, but it's supposed to wake up a parent's blocked wait*() calls (unlike the delayed_group_leader case). This fixes tracehook_notify_death() and its caller to distinguish a "signal 0" wakeup from the delayed_group_leader case (with no wakeup). Signed-off-by: Roland McGrath <roland@redhat.com> Tested-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-01 12:01:11 -07:00
Linus Torvalds	1e24b15b26	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: raid10: wake up frozen array md: do not count blocked devices as spares md: do not progress the resync process if the stripe was blocked md: delay notification of 'active_idle' to the recovery thread md: fix merge error md: move async_tx_issue_pending_all outside spin_lock_irq	2008-08-01 11:56:07 -07:00
Linus Torvalds	63a16f9016	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: [PATCH] ocfs2: Release mutex in error handling code [PATCH] ocfs2: Fix oops when racing files truncates with writes into an mmap region [PATCH 2/2] ocfs2: Fix race between mount and recovery [PATCH 1/2] ocfs2: Add counter in struct ocfs2_dinode to track journal replays [PATCH] configfs: Convenience macros for attribute definition. [PATCH] configfs: Pin configfs subsystems separately from new config_items. [PATCH] configfs: Fix open directory making rmdir() fail [PATCH] configfs: Lock new directory inodes before removing on cleanup after failure [PATCH] configfs: Prevent userspace from creating new entries under attaching directories [PATCH] configfs: Fix failing symlink() making rmdir() fail [PATCH] configfs: Fix symlink() to a removing item [PATCH] configfs: Include linux/err.h in linux/configfs.h	2008-08-01 11:54:05 -07:00
Linus Torvalds	b17b3d479c	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: md: the bitmap code needs to use blk_plug_device_unlocked() block: add a blk_plug_device_unlocked() that grabs the queue lock	2008-08-01 11:46:00 -07:00
Linus Torvalds	9a5467fd60	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits) tcp: MD5: Fix IPv6 signatures skbuff: add missing kernel-doc for do_not_encrypt net/ipv4/route.c: fix build error tcp: MD5: Fix MD5 signatures on certain ACK packets ipv6: Fix ip6_xmit to send fragments if ipfragok is true ipvs: Move userspace definitions to include/linux/ip_vs.h netdev: Fix lockdep warnings in multiqueue configurations. netfilter: xt_hashlimit: fix race between htable_destroy and htable_gc netfilter: ipt_recent: fix race between recent_mt_destroy and proc manipulations netfilter: nf_conntrack_tcp: decrease timeouts while data in unacknowledged irda: replace __FUNCTION__ with __func__ nsc-ircc: default to dongle type 9 on IBM hardware bluetooth: add quirks for a few hci_usb devices hysdn: remove the packed attribute from PofTimStamp_tag isdn: use the common ascii hex helpers tg3: adapt tg3 to use reworked PCI PM code atm: fix direct casts of pointers to u32 in the InterPhase driver atm: fix const assignment/discard warnings in the ATM networking driver net: use the common ascii hex helpers random32: seeding improvement ...	2008-08-01 11:35:16 -07:00
Jens Axboe	6c5e0c4d51	block: add a blk_plug_device_unlocked() that grabs the queue lock blk_plug_device() must be called with the queue lock held, so callers often just grab and release the lock for that purpose. Add a helper that does just that. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-01 20:31:32 +02:00
Linus Torvalds	623fa579e6	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [MTD] [NAND] drivers/mtd/nand/nandsim.c: fix printk warnings [MTD] [NAND] Blackfin NFC Driver: Cleanup the error exit path of bf5xx_nand_probe function [MTD] [NAND] Blackfin NFC Driver: use standard dev_err() rather than printk() [MTD] [NAND] Blackfin NFC Driver: enable Blackfin nand HWECC support by default [MTD] [NAND] Blackfin NFC Driver: add proper devinit/devexit markings to probe/remove functions [MTD] [NAND] Blackfin NFC Driver: add support for the ECC layout the Blackfin bootrom uses [MTD] [NAND] Blackfin NFC Driver: fix bug - hw ecc calc by making sure we extract 11 bits from each register instead of 10 [MTD] [NAND] Blackfin NFC Driver: fix bug - do not clobber the status from the first 256 bytes if operating on 512 pages [MTD] [NAND] diskonchip.c fix sparse endian warnings [MTD] [NAND] drivers/mtd/nand/nandsim.c needs div64.h [JFFS2] Fix allocation of summary buffer Fix rename of at91_nand -> atmel_nand [MTD] [NOR] drivers/mtd/chips/jedec_probe.c: fix Am29DL800BB device ID [MTD] MTD_DEBUG always does compile-time typechecks [MTD] DataFlash: bugfix, binary page sizes now handled [MTD] [NAND] fsl_elbc_nand.c: fix printk warning [MTD] [NAND] nandsim: support random page read command [MTD] [NAND] fix subpage read for small page NAND	2008-08-01 11:29:54 -07:00
Linus Torvalds	d65f5c5803	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] pass struct path * to do_add_mount() [PATCH] switch mtd and dm-table to lookup_bdev() [patch 3/4] vfs: remove unused nameidata argument of may_create() [PATCH] devpts: switch to IDA [PATCH 2/2] proc: switch inode number allocation to IDA [PATCH 1/2] proc: fix inode number bogorithmetic [PATCH] fix bdev leak in block_dev.c do_open() [PATCH] fix races and leaks in vfs_quota_on() users [PATCH] clean dup2() up a bit [PATCH] merge locate_fd() and get_unused_fd() [PATCH] ipv4_static_sysctl_init() should be under CONFIG_SYSCTL Re: BUG at security/selinux/avc.c:883 (was: Re: linux-next: Tree	2008-08-01 11:26:51 -07:00
Linus Torvalds	561b35b341	Merge branch 'reg-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'reg-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator: TI bq24022 Li-Ion Charger driver regulator: maintainers - add maintainers for regulator framework. regulator: documentation - ABI regulator: documentation - machine regulator: documentation - regulator driver regulator: documentation - consumer interface regulator: documentation - overview regulator: core kbuild files regulator: regulator test harness regulator: add support for fixed regulators. regulator: regulator framework core regulator: fixed regulator interface regulator: machine driver interface regulator: regulator driver interface regulator: consumer device interface	2008-08-01 10:56:40 -07:00
Linus Torvalds	b14f7fb5aa	Merge git://git.infradead.org/battery-2.6 * git://git.infradead.org/battery-2.6: power_supply: Sharp SL-6000 (tosa) batteries support power_supply: fix up CHARGE_COUNTER output to be more precise power_supply: add CHARGE_COUNTER property and olpc_battery support for it power_supply: bump EC version check that we refuse to run with in olpc_battery power_supply: cleanup of the OLPC battery driver power_supply: add eeprom dump file to olpc_battery's sysfs power_supply: Support serial number in olpc_battery	2008-08-01 10:55:07 -07:00
Linus Torvalds	00e9028a95	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (28 commits) mm/hugetlb.c must #include <asm/io.h> video: Fix up hp6xx driver build regressions. sh: defconfig updates. sh: Kill off stray mach-rsk7203 reference. serial: sh-sci: Fix up SH7760/SH7780/SH7785 early printk regression. sh: Move out individual boards without mach groups. sh: Make sure AT_SYSINFO_EHDR is exposed to userspace in asm/auxvec.h. sh: Allow SH-3 and SH-5 to use common headers. sh: Provide common CPU headers, prune the SH-2 and SH-2A directories. sh/maple: clean maple bus code sh: More header path fixups for mach dir refactoring. sh: Move out the solution engine headers to arch/sh/include/mach-se/ sh: I2C fix for AP325RXA and Migo-R sh: Shuffle the board directories in to mach groups. sh: dma-sh: Fix up dreamcast dma.h mach path. sh: Switch KBUILD_DEFCONFIG to shx3_defconfig. sh: Add ARCH_DEFCONFIG entries for sh and sh64. sh: Fix compile error of Solution Engine sh: Proper __put_user_asm() size mismatch fix. sh: Stub in a dummy ENTRY_OFFSET for uImage offset calculation. ...	2008-08-01 10:53:43 -07:00
Linus Torvalds	57b1494d2b	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic, x86: fix add iommu_num_pages helper function x86: remove stray <6> in BogoMIPS printk x86: move dma32_reserve_bootmem() after reserve_crashkernel()	2008-08-01 10:28:17 -07:00
Al Viro	8d66bf5481	[PATCH] pass struct path * to do_add_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:32 -04:00
Al Viro	77e69dac3c	[PATCH] fix races and leaks in vfs_quota_on() users * new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the pathname resolution. * callers of vfs_quota_on() that do their own pathname resolution and checks based on it are switched to vfs_quota_on_path(); that way we avoid the races. * reiserfs leaked dentry/vfsmount references on several failure exits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:25 -04:00
Al Viro	1027abe882	[PATCH] merge locate_fd() and get_unused_fd() New primitive: alloc_fd(start, flags). get_unused_fd() and get_unused_fd_flags() become wrappers on top of it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:23 -04:00
Martin Schwidefsky	a4b526b3ba	[S390] Optimize storage key operations for anon pages For anonymous pages without a swap cache backing the check in page_remove_rmap for the physical dirty bit is unnecessary. The instructions that are used to check and reset the dirty bit are expensive. Removing the check noticeably speeds up process exit. In addition the clearing of the dirty bit in __SetPageUptodate is pointless as well. With these two changes there is no storage key operation for an anonymous page anymore if it does not hit the swap space. The micro benchmark which repeatedly executes an empty shell script gets about 5% faster. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-08-01 16:39:30 +02:00
Randy Dunlap	4a7b61d235	skbuff: add missing kernel-doc for do_not_encrypt Add missing kernel-doc notation to sk_buff: Warning(linux-2.6.27-rc1-git2//include/linux/skbuff.h:345): No description found for parameter 'do_not_encrypt' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-31 20:52:08 -07:00
Julius Volz	bc4768eb08	ipvs: Move userspace definitions to include/linux/ip_vs.h Current versions of ipvsadm include "/usr/src/linux/include/net/ip_vs.h" directly. This file also contains kernel-only definitions. Normally, public definitions should live in include/linux, so this patch moves the definitions shared with userspace to a new file, "include/linux/ip_vs.h". This also removes the unused NFC_IPVS_PROPERTY bitmask, which was once used to point into skb->nfcache. To make old ipvsadms still compile with this, the old header file includes the new one. Thanks to Dave Miller and Horms for noting/adding the missing Kbuild entry for the new header file. Signed-off-by: Julius Volz <juliusv@google.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-31 20:45:24 -07:00
David S. Miller	c3f26a269c	netdev: Fix lockdep warnings in multiqueue configurations. When support for multiple TX queues was added, the netif_tx_lock() routines were converted to iterate over all TX queues and grab each queue's spinlock. This causes heartburn for lockdep and it's not a healthy thing to do with lots of TX queues anyways. So modify this to use a top-level lock and a "frozen" state for the individual TX queues. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-31 16:58:50 -07:00
Joel Becker	ecb3d28c7e	[PATCH] configfs: Convenience macros for attribute definition. Sysfs has the _ATTR() and _ATTR_RO() macros to make defining extended form attributes easier. configfs should have something similar. - _CONFIGFS_ATTR() and _CONFIGFS_ATTR_RO() are the counterparts to the sysfs macros. - CONFIGFS_ATTR_STRUCT() creates the extended form attribute structure. - CONFIGFS_ATTR_OPS() defines the show_attribute()/store_attribute() operations that call the show()/store() operations of the extended form configfs_attributes. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Joel Becker	dacdd0e047	[PATCH] configfs: Include linux/err.h in linux/configfs.h We now use PTR_ERR() in the ->make_item() and ->make_group() operations. Folks including configfs.h need err.h. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:12 -07:00
David Brownell	64a76f667d	hpet: /dev/hpet - fixes and cleanup Minor /dev/hpet updates and bugfixes: * Remove dead code, mostly remnants of an incomplete/unusable kernel interface ... noted when addressing "sparse" warnings: + hpet_unregister() and a routine it calls + hpet_task and all references, including hpet_task_lock + hpet_data.hd_flags (and HPET_DATA_PLATFORM) * Correct and improve boot message: + displays counter (shared between comparators) bit width, not timer bit widths (which are often mixed) + relabel "timers" as "comparators"; this is less confusing, they are not independent like normal timers are (sigh) + display MHz not Hz; it's never less than 10 MHz. * Tighten and correct the userspace interface code + don't accidentally program comparators in 64-bit mode using 32-bit values ... always force comparators into 32-bit mode + provide the correct bit definition flagging comparators with periodic capability ... the ABI is unchanged * Update Documentation/hpet.txt + be more correct and current + expand description a bit + don't mention that now-gone kernel interface Plus, add a FIXME comment for something that could cause big trouble on systems with more capable HPETs than at least Intel seems to ship. It seems that few folk use this userspace interface; it's not very usable given the general lack of HPET IRQ routing. I'm told that the only real point of it any more is to mmap for fast timestamps; IMO that's handled better through the gettimeofday() vsyscall. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:45:41 +02:00
Ingo Molnar	85e9ca333d	Merge branch 'linus' into timers/hpet	2008-07-31 18:43:41 +02:00
David Miller	419ca3f135	lockdep: fix combinatorial explosion in lock subgraph traversal When we traverse the graph, either forwards or backwards, we are interested in whether a certain property exists somewhere in a node reachable in the graph. Therefore it is never necessary to traverse through a node more than once to get a correct answer to the given query. Take advantage of this property using a global ID counter so that we need not clear all the markers in all the lock_class entries before doing a traversal. A new ID is chosen when we start to traverse, and we continue through a lock_class only if its ID hasn't been marked with the new value yet. This short-circuiting is essential especially for high CPU count systems. The scheduler has a runqueue per cpu, and needs to take two runqueue locks at a time, which leads to long chains of backwards and forwards subgraphs from these runqueue lock nodes. Without the short-circuit implemented here, a graph traversal on a runqueue lock can take up to (1 << (N - 1)) checks on a system with N cpus. For anything more than 16 cpus or so, lockdep will eventually bring the machine to a complete standstill. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:38:28 +02:00
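
The generation-counter idea described in the lockdep commit above can be sketched in plain C. This is only an illustration under stated assumptions, not the lockdep code: `demo_node`, `demo_gen_id`, `demo_visits`, `demo_walk()` and `demo_reachable()` are hypothetical names, the graph is a fixed-fanout toy structure, and counter wraparound is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_EDGES 4

/* Hypothetical graph node standing in for a lock class. */
struct demo_node {
	unsigned int gen_id;		/* ID of the last traversal that visited us */
	bool has_property;		/* the property the query is looking for */
	size_t nr_edges;
	struct demo_node *edges[DEMO_MAX_EDGES];
};

static unsigned int demo_gen_id;	/* global traversal ID counter */
static unsigned long demo_visits;	/* nodes expanded, for illustration only */

/* Depth-first walk that never expands a node twice in one traversal. */
static bool demo_walk(struct demo_node *n)
{
	size_t i;

	if (n->gen_id == demo_gen_id)
		return false;		/* already seen in this traversal */
	n->gen_id = demo_gen_id;	/* mark without clearing old marks */
	demo_visits++;

	if (n->has_property)
		return true;
	for (i = 0; i < n->nr_edges; i++)
		if (demo_walk(n->edges[i]))
			return true;
	return false;
}

/* Start a query: picking a fresh ID invalidates every old mark at once. */
static bool demo_reachable(struct demo_node *start)
{
	demo_gen_id++;
	return demo_walk(start);
}
```

Because a new ID is chosen per query, no pass over all nodes is needed to reset the marks, and a layered graph in which every node points to every node of the next layer is walked in time linear in the node count rather than exponential in its depth.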
Ingo Molnar	e4e4e534fa	sched clock: revert various sched_clock() changes Found an interactivity problem on a quad core test-system - simple CPU loops would occasionally delay the system in an unacceptable way. After much debugging with Peter Zijlstra it turned out that the problem is caused by the string of sched_clock() changes - they caused the CPU clock to jump backwards a bit - which confuses the scheduler arithmetics. (which is unsigned for performance reasons) So revert: # c300ba2: sched_clock: and multiplier for TSC to gtod drift # c0c8773: sched_clock: only update deltas with local reads. # af52a90: sched_clock: stop maximum check on NO HZ # f7cce27: sched_clock: widen the max and min time This solves the interactivity problems. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de>	2008-07-31 17:20:29 +02:00
Ingo Molnar	4b336b0625	Merge branch 'x86/urgent' into x86/xen	2008-07-31 12:41:34 +02:00
Ingo Molnar	5fbf24659b	Merge branch 'linus' into x86/xen	2008-07-31 12:38:04 +02:00
Patrick McHardy	ae375044d3	netfilter: nf_conntrack_tcp: decrease timeouts while data is unacknowledged In order to time out dead connections quicker, keep track of outstanding data and cap the timeout. Suggested by Herbert Xu. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-31 00:38:01 -07:00
Alan Cox	963e4975c6	pata_it821x: Driver updates and reworking - Add support for the RDC 1010 variant - Rework the core library to have a read_id method. This allows the hacky bits of it821x to go and prepares us for pata_hd - Switch from WARN to BUG in ata_id_string as it will reboot if you get it wrong so WARN won't be seen - Allow the issue of command 0xFC on the 821x. This is needed to query rebuild status. - Tidy up printk formatting - Do more ident rewriting on RAID volumes to handle firmware provided ident data which is rather wonky - Report the firmware revision and device layout in RAID mode - Don't try and disable raid on the 8211 or RDC - they don't have the relevant bits Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-31 02:04:50 -04:00
Alexander Beregalov	1f938d060a	libata.h: replace __FUNCTION__ with __func__ Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-31 01:48:16 -04:00
Linus Torvalds	660fc1f4d8	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3) powerpc: Don't use the wrong thread_struct for ptrace get/set VSX regs powerpc: Fix ptrace buffer size for VSX powerpc: Correctly hookup PTRACE_GET/SETVSRREGS for 32 bit processes ide/powermac: Fix use of uninitialized pointer on media-bay powerpc: Allow non-hcall return values for lparcfg writes ipmi/powerpc: Use linux/of_{device,platform}.h instead of asm powerpc/fsl: proliferate simple-bus compatibility to soc nodes Documentation: remove old sbc8260 board specific information cpm2: Rework baud rate generators configuration to support external clocks. powerpc: rtc_cmos_setup: assign interrupts only if there is i8259 PIC cpm_uart: Add generic clock API support to set baudrates cpm_uart: Modem control lines support powerpc: implement GPIO LIB API on CPM1 Freescale SoC. cpm2: Implement GPIO LIB API on CPM2 Freescale SoC. powerpc: Fix 8xx build failure powerpc: clean up the Book-E HW watchpoint support	2008-07-30 10:43:56 -07:00
Stephen Rothwell	3dd730f2b4	cpumask: statement expressions confuse some versions of gcc when you take the address of the result. Noticed on a sparc64 compile using a version 3.4.5 cross compiler. kernel/time/tick-common.c: In function `tick_check_new_device': kernel/time/tick-common.c:210: error: invalid lvalue in unary `&' ... Just make it a regular expression. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 10:35:49 -07:00
Ingo Molnar	15dd859cac	Merge commit 'v2.6.27-rc1' into x86/core Conflicts: include/asm-x86/dma-mapping.h include/asm-x86/namei.h include/asm-x86/uaccess.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:33:48 +02:00
Linus Torvalds	a4319d9fa0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) net: Make "networking" one-click deselectable. ipv6: Fix useless proc net sockstat6 removal tcp: MD5: Use MIB counter instead of warning for MD5 mismatch. pkt_sched: Fix OOPS on ingress qdisc add. niu: Fix error checking in niu_ethflow_to_class. IPv6: datagram_send_ctl() should exit immediately when an error occured mac80211: fix mesh beaconing PS3: gelic: use unsigned long for irqflags mac80211: fix cfg80211 hooks for master interface nl80211: fix dump callbacks mac80211: partially fix skb->cb use rtl8187: Improve wireless statistics for RTL8187B rtl8187: Fix for TX sequence number problem mac80211: append CONFIG_ to MAC80211_VERBOSE_PS_DEBUG in net/mac80211/tx.c. mac80211: fix sparse integer as NULL pointer warning drivers/net/wireless/iwlwifi/iwl-led.c: printk fix mac80211: return correct error return from ieee80211_wep_init mac80211: tx, use dev_kfree_skb_any for beacon_get rt2x00: Clear queue entry flags during initialization rt2x00: Force full register config after start() ...	2008-07-30 10:13:37 -07:00
Jack Steiner	c627f9cc04	mm: add zap_vma_ptes(): a library function to unmap driver ptes zap_vma_ptes() is intended to be used by drivers to unmap ptes assigned to the driver private vmas. This interface is similar to zap_page_range() but is less general & less likely to be abused. Needed by the GRU driver. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:47 -07:00
Joerg Roedel	204b885e73	introduce lower_32_bits() macro The file kernel.h contains the upper_32_bits macro. This patch adds the other part, the lower_32_bits macro. Its first use will be in the driver for AMD IOMMU. Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:46 -07:00
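
A userspace sketch of the two macros mentioned in the commit above, for readers who want to see how they pair up. The macro bodies mirror the kernel definitions but use `uint32_t` instead of the kernel's `u32`; `demo_reg_pair` and `demo_split_addr()` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Double 16-bit shift avoids an undefined 32-bit shift when n is 32 bits wide. */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))
#define lower_32_bits(n) ((uint32_t)(n))

/* Hypothetical pair of 32-bit device registers holding one 64-bit address. */
struct demo_reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/* Split a 64-bit address into the two halves a 32-bit register pair expects. */
static struct demo_reg_pair demo_split_addr(uint64_t addr)
{
	struct demo_reg_pair r = { lower_32_bits(addr), upper_32_bits(addr) };

	return r;
}
```

A driver programming a 64-bit base address into two 32-bit registers is the typical use; the commit names the AMD IOMMU driver as the first user.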
Jerome Arbez-Gindre	2c203003f6	connector: add a BlackBoard user to connector Add a BlackBoard user to connector. BlackBoard is part of the TSP GPL sampling framework (http://savannah.nongnu.org/p/tsp) [akpm@linux-foundation.org: add comment] Signed-off-by: Jerome Arbez-Gindre <jeromearbezgindre@gmail.com> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:45 -07:00
Vegard Nossum	3f1712bac5	print_ip_sym(): use %pS Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:45 -07:00
Yinghai Lu	1d1958f050	mm: remove find_max_pfn_with_active_regions It has no user now Also print out info about adding/removing active regions. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:44 -07:00
Thomas Renninger	a1531acd43	cpufreq acpi: only call _PPC after cpufreq ACPI init funcs got called already Ingo Molnar provided a fix to not call _PPC at processor driver initialization time in "[PATCH] ACPI: fix cpufreq regression" (git commit `e4233dec74`) But it can still happen that _PPC is called at processor driver initialization time. This patch should make sure that this is not possible anymore. Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:43 -07:00
Magnus Damm	1a4e564b7d	resource: add resource_size() Avoid one-off errors by introducing a resource_size() function. Signed-off-by: Magnus Damm <damm@igel.co.jp> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:43 -07:00
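
The off-by-one that resource_size() guards against comes from the resource's `end` address being inclusive. A minimal sketch, assuming a simplified stand-in structure: `demo_resource` and `demo_resource_size()` are hypothetical names, not the kernel's `struct resource` API.

```c
#include <stdint.h>

typedef uint64_t demo_resource_size_t;

/* Simplified stand-in for a resource: [start, end] with an inclusive end. */
struct demo_resource {
	demo_resource_size_t start;
	demo_resource_size_t end;
};

/* Size of an inclusive range; forgetting the "+ 1" is the classic mistake. */
static inline demo_resource_size_t
demo_resource_size(const struct demo_resource *res)
{
	return res->end - res->start + 1;
}
```

For a region spanning 0x1000 through 0x1fff the helper yields 0x1000 bytes, whereas the open-coded `end - start` would come up one byte short.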
David Brownell	95b1bc2053	[MTD] MTD_DEBUG always does compile-time typechecks The current style for debug messages is to ensure they're always parsed by the compiler and then subjected to dead code removal. That way builds won't break only when debug options get enabled, which is common when they are stripped out early by CPP. This patch makes CONFIG_MTD_DEBUG adopt that convention. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-30 14:21:05 +01:00
Alexey Korolev	96d8b647cf	[MTD] [NAND] fix subpage read for small page NAND Current implementation of subpage read feature for NAND has issues with small page devices. Small page NAND do not support RNDOUT command. So subpage feature is not applicable for them. This patch disables support of subpage for small page NAND. The code is verified on nandsim(SP NAND simulation) and on LP NAND devices. Thanks a lot to Artem for finding this issue. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-30 11:59:24 +01:00
David S. Miller	785957d3e8	tcp: MD5: Use MIB counter instead of warning for MD5 mismatch. From a report by Matti Aarnio, and preliminary patch by Adam Langley. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-30 03:27:25 -07:00
Philipp Zabel	0eb5d5ab3e	regulator: TI bq24022 Li-Ion Charger driver This adds a regulator driver for the TI bq24022 Single-Chip Li-Ion Charger with its nCE and ISET2 pins connected to GPIOs. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>	2008-07-30 10:10:23 +01:00
Mark Brown	48d335ba31	regulator: fixed regulator interface This patch adds support for fixed regulators. This class of regulator is not software controllable but can coexist on machines with software controllable regulators. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>	2008-07-30 10:10:21 +01:00
Liam Girdwood	4c1184e85c	regulator: machine driver interface This interface is for machine specific code and allows the creation of voltage/current domains (with constraints) for each regulator. It can provide regulator constraints that will prevent device damage through overvoltage or over current caused by buggy client drivers. It also allows the creation of a regulator tree whereby some regulators are supplied by others (similar to a clock tree). Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com> Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2008-07-30 10:10:21 +01:00
Liam Girdwood	571a354b15	regulator: regulator driver interface This allows regulator drivers to register their regulators and provide operations to the core. It also has a notifier call chain for propagating regulator events to clients. Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2008-07-30 10:10:20 +01:00
Liam Girdwood	e2ce4eaa76	regulator: consumer device interface Add support to allow consumer device drivers to control their regulator power supply. This uses a similar API to the kernel clock interface in that consumer drivers can get and put a regulator (like they can with clocks atm) and get/set voltage, current limit, mode, enable and disable. This should allow consumers complete control over their supply voltage and current limit. This also compiles out if not in use so drivers can be reused in systems with no regulator based power control. Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2008-07-30 10:10:20 +01:00
Nick Piggin	ce0ad7f095	powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3) Implement lockless get_user_pages_fast for 64-bit powerpc. Page table existence is guaranteed with RCU, and speculative page references are used to take a reference to the pages without having a prior existence guarantee on them. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-30 15:26:54 +10:00
David S. Miller	e93dc4891d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2008-07-29 21:51:00 -07:00
Aristeu Rozanski	5a599a1518	Input: add keycodes for remote controls/phone keypads The new keys are separate from normal numeric keys and standard numeric keypads. The userspace should not attempt to apply modifiers like shift and NumLock to these so they work properly regardless of the language mapping used. Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-30 00:41:40 -04:00
Dmitry Torokhov	03bac96fae	Input: expand keycode space Expand the number of potential key codes from 512 to 768 since people are coming up with more and more keys. Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-30 00:41:39 -04:00
Dmitry Torokhov	8c4b3c2932	Input: gameport - mark gameport_register_driver() __must_check Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-30 00:41:37 -04:00
Dmitry Torokhov	6902c0bead	Input: gameport - make gameport_register_driver() return errors Perform actual driver registration right in gameport_register_driver() instead of offloading it to kgameportd and return proper error code to callers if driver registration fails. Note that driver <-> port matching is still done by kgameportd. Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-30 00:41:15 -04:00
Anton Vorontsov	9fec6060d9	Merge branch 'master' of /home/cbou/linux-2.6 Conflicts: drivers/power/Kconfig drivers/power/Makefile	2008-07-30 02:05:23 +04:00
Johannes Berg	d0f0980414	mac80211: partially fix skb->cb use This patch fixes mac80211 to not use the skb->cb over the queue step from virtual interfaces to the master. The patch also, for now, disables aggregation because that would still require requeuing, will fix that in a separate patch. There are two other places (software requeue and powersaving stations) where requeue can happen, but that is not currently used by any drivers/not possible to use respectively. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-29 16:55:08 -04:00
Henrique de Moraes Holschuh	f1b23361a0	rfkill: document the rfkill struct locking (v2) Reorder fields in struct rfkill and add comments to make it clear which fields are protected by rfkill->mutex. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-29 16:36:33 -04:00
Adrian McMenamin	1795cf48b3	sh/maple: clean maple bus code This patch cleans up the handling of the maple bus queue to remove the risk of races when adding packets. It also removes references to the redundant connect and disconnect functions. Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-07-29 22:10:56 +09:00
FUJITA Tomonori	8978b74253	generic, x86: fix add iommu_num_pages helper function This IOMMU helper function doesn't work for some architectures: http://marc.info/?l=linux-kernel&m=121699304403202&w=2 It also breaks POWER and SPARC builds: http://marc.info/?l=linux-kernel&m=121730388001890&w=2 Currently, only x86 IOMMUs use this so let's move it to x86 for now. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-29 12:12:48 +02:00
Avi Kivity	ed84862433	KVM: Advertise synchronized mmu support to userspace Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:34:02 +03:00
Andrea Arcangeli	e930bffe95	KVM: Synchronize guest physical memory map to host virtual memory map Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:33:53 +03:00
Linus Torvalds	5dfb66ba8c	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: accept pure device as a parent, not only platform_device mfd: add platform_data to mfd_cell mfd: Coding style fixes mfd: Use to_platform_device instead of container_of	2008-07-28 18:15:41 -07:00
Linus Torvalds	1d9b9f6a53	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits) x86/PCI: use dev_printk when possible PCI: add D3 power state avoidance quirk PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot PCI: add an option to allow ASPM enabled forcibly PCI: disable ASPM on pre-1.1 PCIe devices PCI: disable ASPM per ACPI FADT setting PCI MSI: Don't disable MSIs if the mask bit isn't supported PCI: handle 64-bit resources better on 32-bit machines PCI: rewrite PCI BAR reading code PCI: document pci_target_state PCI hotplug: fix typo in pcie hotplug output x86 gart: replace to_pages macro with iommu_num_pages x86, AMD IOMMU: replace to_pages macro with iommu_num_pages iommu: add iommu_num_pages helper function dma-coherent: add documentation to new interfaces Cris: convert to using generic dma-coherent mem allocator Sh: use generic per-device coherent dma allocator ARM: support generic per-device coherent dma mem Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE x86: use generic per-device dma coherent allocator ...	2008-07-28 18:14:24 -07:00
Dmitry Baryshkov	424f525a12	mfd: accept pure device as a parent, not only platform_device Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-29 01:30:26 +02:00
Hisashi Hifumi	8ab22b9abb	vfs: pagecache usage optimization for pagesize!=blocksize When we read some part of a file through pagecache, if there is a pagecache of corresponding index but this page is not uptodate, read IO is issued and this page will be uptodate. I think this is good for pagesize == blocksize environment but there is room for improvement on pagesize != blocksize environment. Because in this case a page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate. So I suggest that when all buffers which correspond to a part of a file that we want to read are uptodate, use this pagecache and copy data from this pagecache to user buffer even if a page is not uptodate. This can reduce read IO and improve system throughput. I wrote a benchmark program and got result number with this program. This benchmark do: 1: mount and open a test file. 2: create a 512MB file. 3: close a file and umount. 4: mount and again open a test file. 5: pwrite randomly 300000 times on a test file. offset is aligned by IO size(1024bytes). 6: measure time of preading randomly 100000 times on a test file. The result was: 2.6.26 330 sec 2.6.26-patched 226 sec Arch:i386 Filesystem:ext3 Blocksize:1024 bytes Memory: 1GB On ext3/4, a file is written through buffer/block. So random read/write mixed workloads or random read after random write workloads are optimized with this patch under pagesize != blocksize environment. This test result showed this. 
The benchmark program is as follows: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #include <time.h> #include <stdlib.h> #include <string.h> #include <sys/mount.h> #define LEN 1024 #define LOOP 1024*512 /* 512MB */ main(void) { unsigned long i, offset, filesize; int fd; char buf[LEN]; time_t t1, t2; if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } memset(buf, 0, LEN); fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC); if (fd < 0) { perror("cannot open file\n"); exit(1); } for (i = 0; i < LOOP; i++) write(fd, buf, LEN); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } fd = open("/root/test1/testfile", O_RDWR); if (fd < 0) { perror("cannot open file\n"); exit(1); } filesize = LEN * LOOP; for (i = 0; i < 300000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pwrite(fd, buf, LEN, offset); } printf("start test\n"); time(&t1); for (i = 0; i < 100000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pread(fd, buf, LEN, offset); } time(&t2); printf("%ld sec\n", t2-t1); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } } Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@ucw.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Andrea Arcangeli	cddb8a5c14	mmu-notifiers: core With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages. There are secondary MMUs (with secondary sptes and secondary tlbs) too. sptes in the kvm case are shadow pagetables, but when I say spte in mmu-notifier context, I mean "secondary pte". In GRU case there's no actual secondary pte and there's only a secondary tlb because the GRU secondary MMU has no knowledge about sptes and every secondary tlb miss event in the MMU always generates a page fault that has to be resolved by the CPU (this is not the case of KVM where the a secondary tlb miss will walk sptes in hardware and it will refill the secondary tlb transparently to software if the corresponding spte is present). The same way zap_page_range has to invalidate the pte before freeing the page, the spte (and secondary tlb) must also be invalidated before any page is freed and reused. Currently we take a page_count pin on every page mapped by sptes, but that means the pages can't be swapped whenever they're mapped by any spte because they're part of the guest working set. Furthermore a spte unmap event can immediately lead to a page to be freed when the pin is released (so requiring the same complex and relatively slow tlb_gather smp safe logic we have in zap_page_range and that can be avoided completely if the spte unmap event doesn't require an unpin of the page previously mapped in the secondary MMU). The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know when the VM is swapping or freeing or doing anything on the primary MMU so that the secondary MMU code can drop sptes before the pages are freed, avoiding all page pinning and allowing 100% reliable swapping of guest physical address space. Furthermore it avoids the code that teardown the mappings of the secondary MMU, to implement a logic like tlb_gather in zap_page_range that would require many IPI to flush other cpu tlbs, for each fixed number of spte unmapped. 
To make an example: if what happens on the primary MMU is a protection downgrade (from writeable to wrprotect) the secondary MMU mappings will be invalidated, and the next secondary-mmu-page-fault will call get_user_pages and trigger a do_wp_page through get_user_pages if it called get_user_pages with write=1, and it'll re-establishing an updated spte or secondary-tlb-mapping on the copied page. Or it will setup a readonly spte or readonly tlb mapping if it's a guest-read, if it calls get_user_pages with write=0. This is just an example. This allows to map any page pointed by any pte (and in turn visible in the primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an full MMU with both sptes and secondary-tlb like the shadow-pagetable layer with kvm), or a remote DMA in software like XPMEM (hence needing of schedule in XPMEM code to send the invalidate to the remote node, while no need to schedule in kvm/gru as it's an immediate event like invalidating primary-mmu pte). At least for KVM without this patch it's impossible to swap guests reliably. And having this feature and removing the page pin allows several other optimizations that simplify life considerably. Dependencies: 1) mm_take_all_locks() to register the mmu notifier when the whole VM isn't doing anything with "mm". This allows mmu notifier users to keep track if the VM is in the middle of the invalidate_range_begin/end critical section with an atomic counter incraese in range_begin and decreased in range_end. No secondary MMU page fault is allowed to map any spte or secondary tlb reference, while the VM is in the middle of range_begin/end as any page returned by get_user_pages in that critical section could later immediately be freed without any further ->invalidate_page notification (invalidate_range_begin/end works on ranges and ->invalidate_page isn't called immediately before freeing the page). 
To stop all page freeing and pagetable overwrites the mmap_sem must be taken in write mode and all other anon_vma/i_mmap locks must be taken too. 2) It'd be a waste to add branches in the VM if nobody could possibly run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only be enabled if CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of mmu notifiers, but this already allows to compile a KVM external module against a kernel with mmu notifiers enabled and from the next pull from kvm.git we'll start using them. And GRU/XPMEM will also be able to continue the development by enabling KVM=m in their config, until they submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n). This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n. The mmu_notifier_register call can fail because mm_take_all_locks may be interrupted by a signal and return -EINTR. Because mmu_notifier_register is used at driver startup, a failure can be gracefully handled. Here is an example of the change applied to kvm to register the mmu notifiers. Usually when a driver starts up other allocations are required anyway and -ENOMEM failure paths exist already. struct kvm *kvm_arch_create_vm(void) { struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL); + int err; if (!kvm) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); + kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops; + err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm); + if (err) { + kfree(kvm); + return ERR_PTR(err); + } + return kvm; } mmu_notifier_unregister returns void and it's reliable. The patch also adds a few needed but missing includes that would prevent kernel to compile after these changes on non-x86 archs (x86 didn't need them by luck). 
[akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix mm/filemap_xip.c build] [akpm@linux-foundation.org: fix mm/mmu_notifier.c build] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
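Note on the commit above: the message describes notifier users tracking the invalidate_range_begin/end critical section with an atomic counter, and refusing to establish secondary mappings while that counter is non-zero. The following is only a minimal userspace sketch of that bookkeeping, assuming C11 atomics. Every name in it (`sketch_secondary_mmu`, `sketch_range_begin`, `sketch_range_end`, `sketch_secondary_fault`) is hypothetical and is not taken from the kernel or from KVM, and it leaves out the locking a real driver would need between the fault path and the notifier callbacks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's secondary-MMU state. */
struct sketch_secondary_mmu {
	atomic_int invalidate_in_progress; /* >0 while inside range_begin/end */
	bool mapping_present;              /* stands in for an spte / secondary tlb entry */
};

static void sketch_mmu_init(struct sketch_secondary_mmu *mmu)
{
	atomic_init(&mmu->invalidate_in_progress, 0);
	mmu->mapping_present = false;
}

/* Called before the primary MMU starts tearing down a range. */
static void sketch_range_begin(struct sketch_secondary_mmu *mmu)
{
	atomic_fetch_add(&mmu->invalidate_in_progress, 1);
	mmu->mapping_present = false; /* drop the secondary mapping */
}

/* Called once the primary MMU has finished with the range. */
static void sketch_range_end(struct sketch_secondary_mmu *mmu)
{
	int prev = atomic_fetch_sub(&mmu->invalidate_in_progress, 1);

	assert(prev > 0); /* begin/end calls must be balanced */
}

/*
 * Secondary-MMU fault path: refuse to map anything while an
 * invalidation is in flight, so the caller has to retry later.
 */
static bool sketch_secondary_fault(struct sketch_secondary_mmu *mmu)
{
	if (atomic_load(&mmu->invalidate_in_progress) != 0)
		return false;
	mmu->mapping_present = true;
	return true;
}
```

In this sketch a fault that arrives between `sketch_range_begin()` and `sketch_range_end()` returns false and installs nothing, which mirrors the rule stated in the message that no spte or secondary tlb reference may be mapped inside the critical section.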
Andrea Arcangeli	7906d00cd1	mmu-notifiers: add mm_take_all_locks() operation mm_take_all_locks holds off reclaim from an entire mm_struct. This allows mmu notifiers to register into the mm at any time with the guarantee that no mmu operation is in progress on the mm. This operation locks against the VM for all pte/vma/mm related operations that could ever happen on a certain mm. This includes vmtruncate, try_to_unmap, and all page faults. The caller must take the mmap_sem in write mode before calling mm_take_all_locks(). The caller isn't allowed to release the mmap_sem until mm_drop_all_locks() returns. mmap_sem in write mode is required in order to block all operations that could modify pagetables and free pages without need of altering the vma layout (for example populate_range() with nonlinear vmas). It's also needed in write mode to avoid new anon_vmas to be associated with existing vmas. A single task can't take more than one mm_take_all_locks() in a row or it would deadlock. mm_take_all_locks() and mm_drop_all_locks are expensive operations that may have to take thousand of locks. mm_take_all_locks() can fail if it's interrupted by signals. When mmu_notifier_register returns, we must be sure that the driver is notified if some task is in the middle of a vmtruncate for the 'mm' where the mmu notifier was registered (mmu_notifier_invalidate_range_start/end is run around the vmtruncation but mmu_notifier_register can run after mmu_notifier_invalidate_range_start and before mmu_notifier_invalidate_range_end). Same problem for rmap paths. And we've to remove page pinning to avoid replicating the tlb_gather logic inside KVM (and GRU doesn't work well with page pinning regardless of needing tlb_gather), so without mm_take_all_locks when vmtruncate frees the page, kvm would have no way to notice that it mapped into sptes a page that is going into the freelist without a chance of any further mmu_notifier notification. 
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Andrea Arcangeli	6beeac76f5	mmu-notifiers: add list_del_init_rcu() Introduce list_del_init_rcu() and document it. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:20 -07:00
Mike Rapoport	56edb58be1	mfd: add platform_data to mfd_cell Adding platform_data to mfd_cell allows passing of platform data directly to the platform_device created for each cell and thus reuse of existing drivers. On the other side it can be used as a hook to mfd_cell itself removing the need in mfd_get_cell method. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-29 01:23:32 +02:00
Alan Cox	979b1791e5	PCI: add D3 power state avoidance quirk Libata has some hacks to deal with certain controllers going silly in D3 state. The right way to handle this is to keep a PCI device flag for such devices. That can then be generalised for non-ATA devices with power problems. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 15:12:11 -07:00
Shaohua Li	149e16372a	PCI: disable ASPM on pre-1.1 PCIe devices Disable ASPM on pre-1.1 PCIe devices, as many of them don't implement it correctly. Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 14:56:57 -07:00
Shaohua Li	5fde244d39	PCI: disable ASPM per ACPI FADT setting The ACPI FADT table includes an ASPM control bit. If the bit is set, do not enable ASPM since it may indicate that the platform doesn't actually support the feature. Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 14:56:09 -07:00
Ingo Molnar	9e3ee1c39c	Merge branch 'linus' into cpus4096 Conflicts: kernel/stop_machine.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 23:32:00 +02:00
Jesse Barnes	29111f579f	Merge branch 'x86/iommu' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus	2008-07-28 14:31:10 -07:00
Linus Torvalds	e56b3bc794	cpu masks: optimize and clean up cpumask_of_cpu() Clean up and optimize cpumask_of_cpu(), by sharing all the zero words. Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns creating a huge array of constant bitmasks, realize that the zero words can be shared. In other words, on a 64-bit architecture, we only ever need 64 of these arrays - with a different bit set in one single word (with enough zero words around it so that we can create any bitmask by just offsetting in that big array). And then we just put enough zeroes around it that we can point every single cpumask to be one of those things. So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each, with one bit set in each array - 2MB memory total), we have exactly 64 arrays instead, each 8k bits in size (64kB total). And then we just point cpumask(n) to the right position (which we can calculate dynamically). Once we have the right arrays, getting "cpumask(n)" ends up being: static inline const cpumask_t *get_cpu_mask(unsigned int cpu) { const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG]; p -= cpu / BITS_PER_LONG; return (const cpumask_t *)p; } This brings other advantages and simplifications as well: - we are not wasting memory that is just filled with a single bit in various different places - we don't need all those games to re-create the arrays in some dense format, because they're already going to be dense enough. if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory is a non-issue (especially since by doing this "overlapping" trick we probably get better cache behaviour anyway). [ mingo@elte.hu: Converted Linus's mails into a commit. See: http://lkml.org/lkml/2008/7/27/156 http://lkml.org/lkml/2008/7/28/320 Also applied a family filter - which also has the side-effect of leaving out the bits where Linus calls me an idio... 
Oh, never mind ;-) ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 22:20:41 +02:00
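Note on the commit above: the "overlapping" trick can be tried outside the kernel. The sketch below is a userspace approximation, not the kernel implementation. It assumes a small hypothetical `SKETCH_NR_CPUS`, fills the table at run time where the message describes constant arrays, and stores the rows in one flat array so that stepping backwards from a row start stays inside a single object. All `sketch_*` names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_CPUS       256 /* hypothetical CPU count for the sketch */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NLONGS \
	((SKETCH_NR_CPUS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/*
 * BITS_PER_LONG + 1 rows of NLONGS words each, stored flat.
 * Row 0 is all zeroes; row 1 + b has only bit b of its first word set.
 */
static unsigned long sketch_bitmap[(SKETCH_BITS_PER_LONG + 1) * SKETCH_NLONGS];

static void sketch_init(void)
{
	for (size_t bit = 0; bit < SKETCH_BITS_PER_LONG; bit++)
		sketch_bitmap[(1 + bit) * SKETCH_NLONGS] = 1UL << bit;
}

/*
 * Return a pointer to NLONGS words in which only bit `cpu` is set.
 * Start at the row for (cpu % BITS_PER_LONG), then step back by
 * (cpu / BITS_PER_LONG) words so the set word lands at that index;
 * the words before it are the zero tail of the previous row.
 */
static const unsigned long *sketch_get_cpu_mask(unsigned int cpu)
{
	const unsigned long *p;

	assert(cpu < SKETCH_NR_CPUS);
	p = &sketch_bitmap[(1 + cpu % SKETCH_BITS_PER_LONG) * SKETCH_NLONGS];
	p -= cpu / SKETCH_BITS_PER_LONG;
	return p;
}

/* Test a single bit in a mask of NLONGS words. */
static int sketch_test_bit(const unsigned long *mask, unsigned int bit)
{
	return (int)((mask[bit / SKETCH_BITS_PER_LONG] >>
		      (bit % SKETCH_BITS_PER_LONG)) & 1UL);
}
```

After `sketch_init()`, the mask returned by `sketch_get_cpu_mask(n)` has bit n set and every other bit clear, even though the table holds only BITS_PER_LONG + 1 rows rather than one row per CPU.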
Ingo Molnar	414f746d23	Merge branch 'linus' into cpus4096	2008-07-28 21:14:43 +02:00
Linus Torvalds	f934fb19ef	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: add driver for Atmel integrated touchscreen controller Input: ads7846 - optimize order of calculating Rt in ads7846_rx() Input: ads7846 - fix sparse endian warnings Input: uinput - remove duplicate include Input: serio - offload resume to kseriod Input: serio - mark serio_register_driver() __must_check	2008-07-28 09:59:26 -07:00
Ben Dooks	7f71ac9374	mfd: Coding style fixes Fix some coding style issues in the mfd core driver. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-28 18:29:09 +02:00
Linus Torvalds	d9089c296b	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Disable 64K hugetlb support when doing 64K SPU mappings powerpc/powermac: Fixup default serial port device for pmac_zilog powerpc/powermac: Use sane default baudrate for SCC debugging powerpc/mm: Implement _PAGE_SPECIAL & pte_special() for 64-bit powerpc: Show processor cache information in sysfs powerpc: Make core id information available to userspace powerpc: Make core sibling information available to userspace powerpc/vio: More fallout from dma_mapping_error API change ibmveth: Fix multiple errors with dma_mapping_error conversion powerpc/pseries: Fix CMO sysdev attribute API change fallout powerpc: Enable tracehook for the architecture powerpc: Add TIF_NOTIFY_RESUME support for tracehook powerpc: Add asm/syscall.h with the tracehook entry points powerpc: Make syscall tracing use tracehook.h helpers powerpc: Call tracehook_signal_handler() when setting up signal frames powerpc: Update cpu_sibling_maps dynamically powerpc: register_cpu_online should be __cpuinit powerpc: kill useless SMT code in prom_hold_cpus powerpc: Fix 8xx build failure powerpc: Fix vio build warnings ...	2008-07-28 09:05:35 -07:00
Linus Torvalds	b10a8b7238	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (72 commits) sh: SuperH Mobile CEU and camera platform data for AP325RXA sh: Update smc911x platform data for AP325RXA sh: SuperH Mobile LCDC platform data for AP325RXA sh: Add SuperH Mobile CEU platform data for Migo-R sh: Add SuperH Mobile LCDC platform data for Migo-R sh: Move asid_cache() out of ifdef to fix SH-3/4 nommu build. sh: Workaround for __put_user_asm() bug with gcc 4.x on big-endian. sh: Wire up new syscalls. sh: fix uImage Entry Point sh_keysc: remove request_mem_region() and release_mem_region() sh: Don't miss pending signals returning to user mode after signal processing sh: Use clk_always_enable() on sh7366 sh: Use clk_always_enable() on sh7343 / SE77343 sh: Use clk_always_enable() on sh7722 / Migo-R / SE7722 sh: Use clk_always_enable() on sh7723 / ap325rxa sh: Introduce clk_always_enable() function sh: Show all clocks and their state in /proc/clocks sh: Merge sh7343 and sh7722 clock code sh: Add SuperH Mobile MSTPCR bits to clock framework sh: Use arch_flags to simplify sh7722 siu clock code ...	2008-07-28 08:41:13 -07:00
Linus Torvalds	37eaf8c746	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: stop_machine: fix up ftrace.c stop_machine: Wean existing callers off stop_machine_run() stop_machine(): stop_machine_run() changed to use cpu mask Hotplug CPU: don't check cpu_online after take_cpu_down Simplify stop_machine stop_machine: add ALL_CPUS option module: fix build warning with !CONFIG_KALLSYMS	2008-07-28 08:37:46 -07:00
Jeremy Fitzhardinge	d974ae379a	generic, memparse(): constify argument memparse()'s first argument can be const, so it should be. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 15:05:23 +02:00
Adrian McMenamin	306cfd630a	maple: tidy maple_driver code by removing redundant connect/disconnect The connect and disconnect functions are unnecessary - everything they do can be accomplished in the initial probe - so remove them. Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-07-28 18:10:30 +09:00
Barry Naujok	9403540c06	dcache: Add case-insensitive support d_ci_add() routine This adds a dcache entry to the dcache for lookup, but changes the name that is associated with the entry rather than using the one passed in to the lookup routine. First, it sees if the case-exact match already exists in the dcache and uses it if one exists. Otherwise, it allocates a new node with the new name and splices it into the dcache. Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov. Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:39 +10:00
Benjamin Herrenschmidt	d65d830ca0	Merge commit 'gcl/gcl-next'	2008-07-28 16:30:40 +10:00
Rusty Russell	eeec4fad96	stop_machine(): stop_machine_run() changed to use cpu mask Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all cpus), pass a cpumask_t. Allow NULL for the common case (where we don't care which CPU the function is run on): temporary cpumask_t's are usually considered bad for stack space. This deprecates stop_machine_run, to be removed soon when all the callers are dead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-28 12:16:30 +10:00
Rusty Russell	ffdb5976c4	Simplify stop_machine stop_machine creates a kthread which creates kernel threads. We can create those threads directly and simplify things a little. Some care must be taken with CPU hotunplug, which has special needs, but that code seems more robust than it was in the past. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2008-07-28 12:16:29 +10:00
Jason Baron	5c2aed6225	stop_machine: add ALL_CPUS option Allow stop_machine_run() to call a function on all cpus. Calling stop_machine_run() with 'ALL_CPUS' invokes this new behavior. stop_machine_run() proceeds as normal until the calling cpu has invoked 'fn'. Then, we tell all the other cpus to call 'fn'. Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> CC: Adrian Bunk <bunk@stusta.de> CC: Andi Kleen <andi@firstfloor.org> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Christoph Hellwig <hch@infradead.org> CC: mingo@elte.hu CC: akpm@osdl.org	2008-07-28 12:16:28 +10:00
Mauro Carvalho Chehab	c2f90e9536	Merge ../linux-2.6	2008-07-27 22:23:18 -03:00
Andrea Righi	940389b8af	task IO accounting: move all IO statistics in struct task_io_accounting Simplify the code of include/linux/task_io_accounting.h. It is also more reasonable to have all the task i/o-related statistics in a single struct (task_io_accounting). Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 16:12:28 -07:00
Mauro Carvalho Chehab	eb703027ac	Merge ../linux-2.6	2008-07-27 18:11:53 -03:00
Linus Torvalds	837b41b5de	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: state userland requirements in Kconfig help firewire: avoid memleak after phy config transmit failure firewire: fw-ohci: TSB43AB22/A dualbuffer workaround firewire: queue the right number of data firewire: warn on unfinished transactions during card removal firewire: small fw_fill_request cleanup firewire: fully initialize fw_transaction before marking it pending firewire: fix race of bus reset with request transmission	2008-07-27 10:24:06 -07:00
Andrea Righi	5995477ab7	task IO accounting: improve code readability Put all i/o statistics in struct proc_io_accounting and use inline functions to initialize and increment statistics, removing a lot of single variable assignments. This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and CONFIG_TASK_IO_ACCOUNTING=y). text data bss dec hex filename 11651 0 0 11651 2d83 kernel/exit.o.before 11619 0 0 11619 2d63 kernel/exit.o.after 10886 132 136 11154 2b92 kernel/fork.o.before 10758 132 136 11026 2b12 kernel/fork.o.after 3082029 807968 4818600 8708597 84e1f5 vmlinux.o.before 3081869 807968 4818600 8708437 84e155 vmlinux.o.after Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 09:58:20 -07:00
Mauro Carvalho Chehab	50cb993ea6	Merge ../linux-2.6	2008-07-27 12:25:57 -03:00
Mauro Carvalho Chehab	9fa0f6db3a	V4L/DVB (8522): videodev2: Fix merge conflict Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 12:24:02 -03:00
Hans Verkuil	c1d7f4f164	V4L/DVB (8524): videodev: copy the VID_TYPE defines to videodev.h The VID_TYPE defines are V4L1 specific, so copy them back to videodev.h. In videodev2.h ensure that they are not used in the kernel (you need to include videodev.h instead) and mark them as deprecated. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:07:12 -03:00
Jean-Francois Moine	1250ac6d4a	V4L/DVB (8518): gspca: Remove the remaining frame decoding functions from the subdrivers. SPCA505 and SPCA508 added in the pixel formats. Decode functions and associated resources removed in spca505, 506 and 508. The decode routines are now found in the V4L library. Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:06:42 -03:00
Mauro Carvalho Chehab	9993e51c0c	V4L/DVB (8502): videodev2.h: CodingStyle cleanups Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:06:20 -03:00
Linus Torvalds	8be1a6d6c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files mlx4_core: Add VLAN tag field to WQE control segment struct RDMA/nes: CM connection setup/teardown rework IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG IPoIB/cm: Connected mode is no longer EXPERIMENTAL RDMA/ucm: BKL is not needed for ib_ucm_open() RDMA/ucma: BKL is not needed for ucma_open()	2008-07-26 20:40:36 -07:00
Linus Torvalds	9ee08c2df4	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (57 commits) [MTD] [NAND] subpage read feature as a way to increase performance. CPUFREQ: S3C24XX NAND driver frequency scaling support. [MTD][NAND] au1550nd: remove unused variable [MTD] jedec_probe: Fix SST 16-bit chip detection [MTD][MTDPART] Fix a division by zero bug [MTD][MTDPART] Cleanup and document the erase region handling [MTD][MTDPART] Handle most checkpatch findings [MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition [MTD] physmap: resume already suspended chips on failure to suspend [MTD] physmap: Fix suspend/resume/shutdown bugs. [MTD] [NOR] Fix -ETIMEO errors in CFI driver [MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y [JFFS2] Use .unlocked_ioctl [MTD] Fix const assignment in the MTD command line partitioning driver [MTD] [NOR] gen_probe: No debug message when debugging is disabled [MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers [MTD] [MAPS] Remove the bast-flash driver. [MTD] [NAND] fsl_elbc_nand: ecclayout cleanups [MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT [MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips ...	2008-07-26 20:30:56 -07:00
Linus Torvalds	eaf0ba5ef6	Merge branch 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace * 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace: tracehook: comment fixes	2008-07-26 20:29:39 -07:00
Linus Torvalds	bdee6ac7d1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: atmel-mci: debugfs support mmc: Add per-card debugfs support mmc: Export internal host state through debugfs imxmmc: fix crash when no platform data is provided imxmmc: fix platform resources imxmmc: remove DEBUG definition mmc_spi: put signals to low power off fix	2008-07-26 20:27:31 -07:00
Linus Torvalds	4836e30078	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits) [PATCH] fix RLIM_NOFILE handling [PATCH] get rid of corner case in dup3() entirely [PATCH] remove remaining namei_{32,64}.h crap [PATCH] get rid of indirect users of namei.h [PATCH] get rid of __user_path_lookup_open [PATCH] f_count may wrap around [PATCH] dup3 fix [PATCH] don't pass nameidata to __ncp_lookup_validate() [PATCH] don't pass nameidata to gfs2_lookupi() [PATCH] new (local) helper: user_path_parent() [PATCH] sanitize __user_walk_fd() et.al. [PATCH] preparation to __user_walk_fd cleanup [PATCH] kill nameidata passing to permission(), rename to inode_permission() [PATCH] take noexec checks to very few callers that care Re: [PATCH 3/6] vfs: open_exec cleanup [patch 4/4] vfs: immutable inode checking cleanup [patch 3/4] fat: dont call notify_change [patch 2/4] vfs: utimes cleanup [patch 1/4] vfs: utimes: move owner check into inode_change_ok() [PATCH] vfs: use kstrdup() and check failing allocation ...	2008-07-26 20:23:44 -07:00
Linus Torvalds	5c7c204aec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: Add layer1 over IP support Add mISDN HFC multiport driver Add mISDN HFC PCI driver Add mISDN DSP Add mISDN core files Define AF_ISDN and PF_ISDN Add mISDN driver	2008-07-26 20:19:41 -07:00
Linus Torvalds	2284284281	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix ip_rt_frag_needed rt_is_expired netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences netfilter: fix double-free and use-after free netfilter: arptables in netns for real netfilter: ip{,6}tables_security: fix future section mismatch selinux: use nf_register_hooks() netfilter: ebtables: use nf_register_hooks() Revert "pkt_sched: sch_sfq: dump a real number of flows" qeth: use dev->ml_priv instead of dev->priv syncookies: Make sure ECN is disabled net: drop unused BUG_TRAP() net: convert BUG_TRAP to generic WARN_ON drivers/net: convert BUG_TRAP to generic WARN_ON	2008-07-26 20:17:56 -07:00
Andrea Righi	510a35d4a4	hugetlb: remove unused variable warning Remove the following warning when CONFIG_HUGETLB_PAGE is not set: ipc/shm.c: In function `shm_get_stat': ipc/shm.c:565: warning: unused variable `h' [akpm@linux-foundation.org: use tabs, not spaces] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 20:16:47 -07:00
Al Viro	3f8206d496	[PATCH] get rid of indirect users of namei.h fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all. Several places in the tree needed direct include. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:42 -04:00
Al Viro	964bd18362	[PATCH] get rid of __user_path_lookup_open Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:41 -04:00
Al Viro	516e0cc564	[PATCH] f_count may wrap around make it atomic_long_t; while we are at it, get rid of useless checks in affs, hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:40 -04:00
Al Viro	2d8f30380a	[PATCH] sanitize __user_walk_fd() et.al. * do not pass nameidata; struct path is all the callers want. * switch to new helpers: user_path_at(dfd, pathname, flags, &path) user_path(pathname, &path) user_lpath(pathname, &path) user_path_dir(pathname, &path) (fail if not a directory) The last 3 are trivial macro wrappers for the first one. * remove nameidata in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:34 -04:00
Al Viro	f419a2e3b6	[PATCH] kill nameidata passing to permission(), rename to inode_permission() Incidentally, the name that gives hundreds of false positives on grep is not a good idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:31 -04:00
Miklos Szeredi	9767d74957	[patch 1/4] vfs: utimes: move owner check into inode_change_ok() Add a new ia_valid flag: ATTR_TIMES_SET, to handle the UTIME_OMIT/UTIME_NOW and UTIME_NOW/UTIME_OMIT cases. In these cases neither ATTR_MTIME_SET nor ATTR_ATIME_SET is in the flags, yet the POSIX draft specifies that permission checking is performed the same way as if one or both of the times was explicitly set to a timestamp. See the patch "vfs: utimensat(): fix error checking for {UTIME_NOW,UTIME_OMIT} case" by Michael Kerrisk for the patch introducing this behavior. This is a cleanup, as well as allowing filesystems (NFS/fuse/...) to perform their own permission checking instead of the default. CC: Ulrich Drepper <drepper@redhat.com> CC: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:25 -04:00
Li Zefan	88b387824f	[PATCH] vfs: use kstrdup() and check failing allocation - use kstrdup() instead of kmalloc() + memcpy() - return NULL if allocating ->mnt_devname failed - mnt_devname should be const Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:24 -04:00
Al Viro	b77b0646ef	[PATCH] pass MAY_OPEN to vfs_permission() explicitly ... and get rid of the last "let's deduce mask from nameidata->flags" bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:22 -04:00
Al Viro	a110343f0d	[PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess * MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS * MAY_ACCESS on fuse should affect only the last step of pathname resolution * fchdir() and chroot() should pass MAY_ACCESS, for the same reason why chdir() needs that. * now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be removed; it has no business being in nameidata. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:21 -04:00
Al Viro	7f2da1e7d0	[PATCH] kill altroot long overdue... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:20 -04:00
Al Viro	8bb79224b8	[PATCH] permission checks for chdir need special treatment only on the last step ... so we ought to pass MAY_CHDIR to vfs_permission() instead of having it triggered on every step of preceding pathname resolution. LOOKUP_CHDIR is killed by that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:19 -04:00
Miklos Szeredi	db2e747b14	[patch 5/5] vfs: remove mode parameter from vfs_symlink() Remove the unused mode parameter from vfs_symlink and callers. Thanks to Tetsuo Handa for noticing. CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:18 -04:00
Miklos Szeredi	2f1936b877	[patch 3/5] vfs: change remove_suid() to file_remove_suid() All calls to remove_suid() are made with a file pointer, because (similarly to file_update_time) it is called when the file is written. Clean up callers by passing in a file instead of a dentry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:16 -04:00
Al Viro	e6305c43ed	[PATCH] sanitize ->permission() prototype * kill nameidata * argument; map the 3 bits in ->flags anybody cares about to new MAY_... ones and pass with the mask. * kill redundant gfs2_iop_permission() * sanitize ecryptfs_permission() * fix remaining places where ->permission() instances might barf on new MAY_... found in mask. The obvious next target in that direction is permission(9) folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:14 -04:00
Al Viro	9043476f72	[PATCH] sanitize proc_sysctl * keep references to ctl_table_head and ctl_table in /proc/sys inodes * grab the former during operations, use the latter for access to entry if that succeeds * have ->d_compare() check if table should be seen for one who does lookup; that allows us to avoid flipping inodes - if we have the same name resolve to different things, we'll just keep several dentries and ->d_compare() will reject the wrong ones. * have ->lookup() and ->readdir() scan the table of our inode first, then walk all ctl_table_header and scan ->attached_by for those that are attached to our directory. * implement ->getattr(). * get rid of insane amounts of tree-walking * get rid of the need to know dentry in ->permission() and of the contortions induced by that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:12 -04:00
Al Viro	ae7edecc9b	[PATCH] sysctl: keep track of tree relationships In a sense, that's the heart of the series. It's based on the following property of the trees we are actually asked to add: they can be split into stem that is already covered by registered trees and crown that is entirely new. IOW, if a/b and a/c/d are introduced by our tree, then a/c is also introduced by it. That allows to associate tree and table entry with each node in the union; while directory nodes might be covered by many trees, only one will cover the node by its crown. And that will allow much saner logics for /proc/sys in the next patches. This patch introduces the data structures needed to keep track of that. When adding a sysctl table, we find a "parent" one. Which is to say, find the deepest node on its stem that already is present in one of the tables from our table set or its ancestor sets. That table will be our parent and that node in it - attachment point. Add our table to list anchored in parent, have it refer the parent and contents of attachment point. Also remember where its crown lives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:11 -04:00
Al Viro	f7e6ced406	[PATCH] allow delayed freeing of ctl_table_header Refcount the sucker; instead of freeing it by the end of unregistration just drop the refcount and free only when it hits zero. Make sure that we _always_ make ->unregistering non-NULL in start_unregistering(). That allows anybody to get a reference to such puppy, preventing its freeing and reuse. It does not block unregistration. Anybody who holds such a reference can * try to grab a "use" reference (ctl_head_grab()); that will succeed if and only if it hadn't entered unregistration yet. If it succeeds, we can use it in all normal ways until we release the "use" reference (with ctl_head_finish()). Note that this relies on having ->unregistering become non-NULL in all cases when one starts to unregister the sucker. * keep pointers to ctl_table entries; they can be freed if the entire thing is unregistered. However, if ctl_head_grab() succeeds, we know that unregistration had not happened (and will not happen until ctl_head_finish()) and such pointers can be used safely. IOW, now we can have inodes under /proc/sys keep references to ctl_table entries, protecting them with references to ctl_table_header and grabbing the latter for the duration of operations that require access to ctl_table. That won't cause deadlocks, since unregistration will not be stopped by mere keeping a reference to ctl_table_header. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:09 -04:00
Al Viro	734550921e	[PATCH] beginning of sysctl cleanup - ctl_table_set New object: set of sysctls [currently - root and per-net-ns]. Contains: pointer to parent set, list of tables and "should I see this set?" method (->is_seen(set)). Current lists of tables are subsumed by that; net-ns contains such a beast. ->lookup() for ctl_table_root returns pointer to ctl_table_set instead of that to ->list of that ctl_table_set. [folded compile fixes by rdd for configs without sysctl] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:08 -04:00
Denys Vlasenko	d2d9648ec6	[PATCH] reuse xxx_fifo_fops for xxx_pipe_fops Merge fifo and pipe file_operations. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:06 -04:00
Pekka Enberg	93bc4e89c2	netfilter: fix double-free and use-after free As suggested by Patrick McHardy, introduce a __krealloc() that doesn't free the original buffer to fix a double-free and use-after-free bug introduced by me in netfilter that uses RCU. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Tested-by: Dieter Ries <clip2@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-26 17:49:33 -07:00
Karsten Keil	af69fb3a8f	Add mISDN HFC multiport driver Enable support for cards with Cologne Chip AG's HFC multiport chip. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 02:00:43 +02:00
Karsten Keil	960366cf8d	Add mISDN DSP Enable support for digital audio processing capability. This module may be used for special applications that require cross connecting of bchannels, conferencing, dtmf decoding, echo cancelation, tone generation, and Blowfish encryption and decryption. It may use hardware features if available. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:56:38 +02:00
Karsten Keil	1b2b03f8e5	Add mISDN core files Add mISDN core files Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:54:58 +02:00
Karsten Keil	04578dd330	Define AF_ISDN and PF_ISDN Define the address and protocol family value for mISDN. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:47:00 +02:00
Haavard Skinnemoen	f4b7f927b5	mmc: Add per-card debugfs support For each card successfully added to the bus, create a subdirectory under the host's debugfs root with information about the card. At the moment, only a single file is added to the card directory for all cards: "state". It reflects the "state" field in struct mmc_card, indicating whether the card is present, readonly, etc. For MMC and SD cards (not SDIO), another file is added: "status". Reading this file will ask the card about its current status and return it. This can be useful if the card just refuses to respond to any commands, which might indicate that the card state is not what the MMC core thinks it is (due to a missing stop command, for example.) Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-27 01:26:17 +02:00
Haavard Skinnemoen	6edd8ee60a	mmc: Export internal host state through debugfs When CONFIG_DEBUG_FS is set, create a few files under /sys/kernel/debug containing information about an mmc host's internal state. Currently, just a single file is created, "ios", which contains information about the current operating parameters for the bus (clock speed, bus width, etc.) Host drivers can add additional files and directories under the host's root directory by passing the debugfs_root field in struct mmc_host as the 'parent' parameter to debugfs_create_*. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-27 01:26:16 +02:00
Roland McGrath	a9906a1919	tracehook: comment fixes This fixes some typos and errors in <linux/tracehook.h> comments. No code changes. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:41:26 -07:00
Linus Torvalds	fb3b806144	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels x86, RDC321x: remove gpio.h complications x86, RDC321x: add to mach-default crashdump: fix undefined reference to `elfcorehdr_addr' flag parameters: fix compile error of sys_epoll_create1	2008-07-26 13:25:05 -07:00
Adrian Bunk	9580d85f9c	drivers/char/rtc.c: make 2 functions static The following functions can now become static: - rtc_interrupt() - rtc_get_rtc_time() Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Bernhard Walle <bwalle@suse.de> Acked-by: Paul Gortmaker <p_gortmaker@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	7c363b8c65	mm/swapfile.c: make code static This patch makes the following needlessly global code static: - swap_lock - nr_swapfiles - struct swap_list Signed-off-by: Adrian Bunk <bunk@kernel.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	15f59adae0	make mm/memory.c:print_bad_pte() static This patch makes the needlessly global print_bad_pte() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	9d8fddfb17	mm/allocpercpu.c: make 4 functions static This patch makes the following needlessly global functions static: - percpu_depopulate() - __percpu_depopulate_mask() - percpu_populate() - __percpu_populate_mask() Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Roland McGrath	bbc698636e	task_current_syscall This adds the new function task_current_syscall() on machines where the asm/syscall.h interface is supported (CONFIG_HAVE_ARCH_TRACEHOOK). It's exported for modules to use in the future. This function safely samples the state of a blocked thread to collect what system call it is blocked in, and the six system call argument registers. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:10 -07:00
Roland McGrath	85ba2d862e	tracehook: wait_task_inactive This extends wait_task_inactive() with a new argument so it can be used in a "soft" mode where it will check for the task changing state unexpectedly and back off. There is no change to existing callers. This lays the groundwork to allow robust, noninvasive tracing that can try to sample a blocked thread but back off safely if it wakes up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	828c365cc8	tracehook: asm/syscall.h This adds asm-generic/syscall.h, which documents what a real asm-ARCH/syscall.h file should define. This is not used yet, but will provide all the machine-dependent details of examining a user system call about to begin, in progress, or just ended. Each arch should add an asm-ARCH/syscall.h that defines all the entry points documented in asm-generic/syscall.h, as short inlines if possible. This lets us write new tracing code that understands user system call registers, without any new arch-specific work. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	64b1208d5b	tracehook: TIF_NOTIFY_RESUME This adds tracehook.h inlines to enable a new arch feature in support of user debugging/tracing. This is not used yet, but it lays the groundwork for a debugger to be able to wrangle a task that's possibly running, without interrupting its syscalls in progress. Each arch should define TIF_NOTIFY_RESUME, and in their entry.S code treat it much like TIF_SIGPENDING. That is, it causes you to take the slow path when returning to user mode, where you get the full user-mode state accessible as for signal handling or ptrace. The arch code should check TIF_NOTIFY_RESUME after handling TIF_SIGPENDING. When it's set, clear it and then call tracehook_notify_resume(). In future, tracing code will call set_notify_resume() when it wants to get a callback in tracehook_notify_resume(). Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	b787f7ba67	tracehook: force signal_pending() This defines a new hook tracehook_force_sigpending() that lets tracing code decide to force TIF_SIGPENDING on in recalc_sigpending(). This is not used yet, so it compiles away to nothing for now. It lays the groundwork for new tracing code that can interrupt a task synthetically without actually sending a signal. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	2b2a1ff64a	tracehook: death This moves the ptrace logic in task death (exit_notify) into tracehook.h inlines. Some code is rearranged slightly to make things nicer. There is no change, only cleanup. There is one hook called with the tasklist_lock write-locked, as ptrace needs. There is also a new hook called after exit_state changes and without locks. This is a better place for tracing work to be in the future, since it doesn't delay the whole system with locking. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	fa00b80b3c	tracehook: job control This defines the tracehook_notify_jctl() hook to formalize the ptrace effects on the job control notifications. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	7bcf6a2ca5	tracehook: get_signal_to_deliver This defines the tracehook_get_signal() hook to allow tracing code to slip in before normal signal dequeuing. This lays the groundwork for new tracing features that can inject synthetic signals outside the normal queue or control the disposition of delivered signals. The calling convention lets tracehook_get_signal() decide both exactly what will happen and what signal number to report in the handler/exit. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	283d7559e7	tracehook: syscall This adds standard tracehook.h inlines for arch code to call when TIF_SYSCALL_TRACE has been set. This replaces having each arch implement the ptrace guts for its syscall tracing support. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	445a91d2fe	tracehook: tracehook_consider_fatal_signal This defines tracehook_consider_fatal_signal() as a fine-grained hook for deciding to skip the special cases for a fatal signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	35de254dc6	tracehook: tracehook_consider_ignored_signal This defines tracehook_consider_ignored_signal() as a fine-grained hook for deciding to prevent the normal short-circuit of sending an ignored signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	c45aea2761	tracehook: tracehook_signal_handler This defines tracehook_signal_handler() as a hook for the arch signal handling code to call. It gives ptrace the opportunity to stop for a pseudo-single-step trap immediately after signal handler setup is done. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	fa8e26ccd4	tracehook: tracehook_expect_breakpoints This adds tracehook_expect_breakpoints() as a formal hook for the nommu code to use for its, "Is text-poking likely?" check at mmap time. This names the actual semantics the code means to test, and documents it. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	0d094efeb1	tracehook: tracehook_tracer_task This adds the tracehook_tracer_task() hook to consolidate all forms of "Who is using ptrace on me?" logic. This is used for "TracerPid:" in /proc and for permission checks. We also clean up the selinux code that called an identical accessor. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	dae33574dc	tracehook: release_task This moves the ptrace-related logic from release_task into tracehook.h and ptrace.h inlines. It provides clean hooks both before and after locking tasklist_lock, for future tracing logic to do more cleanup without the lock. This also changes release_task() itself in the rare "zap_leader" case to set the leader to EXIT_DEAD before iterating. This maintains the invariant that release_task() only ever handles a task in EXIT_DEAD. This is a common-sense invariant that is already always true except in this one arcane case of zombie leader whose parent ignores SIGCHLD. This change is harmless and only costs one store in this one rare case. It keeps the expected state more consistently sane, which is nicer when debugging weirdness in release_task(). It also lets some future code in the tracehook entry points rely on this invariant for bookkeeping. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	daded34be9	tracehook: vfork-done This moves the PTRACE_EVENT_VFORK_DONE tracing into a tracehook.h inline, tracehook_report_vfork_done(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	09a05394fe	tracehook: clone This moves all the ptrace initialization and tracing logic for task creation into tracehook.h and ptrace.h inlines. It reorganizes the code slightly, but should not change any behavior. There are four tracehook entry points, at each important stage of task creation. This keeps the interface from the core fork.c code fairly clean, while supporting the complex setup required for ptrace or something like it. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	30199f5a46	tracehook: exit This moves the PTRACE_EVENT_EXIT tracing into a tracehook.h inline, tracehook_report_exit(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	6341c393fc	tracehook: exec This moves all the ptrace hooks related to exec into tracehook.h inlines. This also lifts the calls for tracing out of the binfmt load_binary hooks into search_binary_handler() after it calls into the binfmt module. This change has no effect, since all the binfmt modules' load_binary functions did the call at the end on success, and now search_binary_handler() does it immediately after return if successful. We consolidate the repeated code, and binfmt modules no longer need to import ptrace_notify(). Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	88ac2921a7	tracehook: add linux/tracehook.h This patch series introduces the "tracehook" interface layer of inlines in <linux/tracehook.h>. There are more details in the log entry for patch 01/23 and in the header file comments inside that patch. Most of these changes move code around with little or no change, and they should not break anything or change any behavior. This sets a new standard for uniform arch support to enable clean arch-independent implementations of new debugging and tracing stuff, denoted by CONFIG_HAVE_ARCH_TRACEHOOK. Patch 20/23 adds that symbol to arch/Kconfig, with comments listing everything an arch has to do before setting "select HAVE_ARCH_TRACEHOOK". These are elaborated a bit at: http://sourceware.org/systemtap/wiki/utrace/arch/HowTo The new inlines that arch code must define or call have detailed kerneldoc comments in the generic header files that say what is required. No arch is obligated to do any work, and no arch's build should be broken by these changes. There are several steps that each arch should take so it can set HAVE_ARCH_TRACEHOOK. Most of these are simple. Providing this support will let new things people add for doing debugging and tracing of user-level threads "just work" for your arch in the future. For an arch that does not provide HAVE_ARCH_TRACEHOOK, some new options for such features will not be available for config. I have done some arch work and will submit this to the arch maintainers after the generic tracehook series settles in. For now, that work is available in my GIT repositories, and in patch and mbox-of-patches form at http://people.redhat.com/roland/utrace/2.6-current/ This paves the way for my "utrace" work, to be submitted later. But it is not innately tied to that. I hope that the tracehook series can go in soon regardless of what eventually does or doesn't go on top of it. 
For anyone implementing any kind of new tracing/debugging plan, or just understanding all the context of the existing ptrace implementation, having tracehook.h makes things much easier to find and understand. This patch: This adds the new kernel-internal header file <linux/tracehook.h>. This is not yet used at all. The comments in the header introduce what the following series of patches is about. The aim is to formalize and consolidate all the places that the core kernel code and the arch code now ties into the ptrace implementation. These patches mostly don't cause any functional change. They just move the details of ptrace logic out of core code into tracehook.h inlines, where they are mostly compiled away to the same as before. All that changes is that everything is thoroughly documented and any future reworking of ptrace, or addition of something new, would not have to touch core code all over, just change the tracehook.h inlines. The new linux/ptrace.h inlines are used by the following patches in the new tracehook_*() inlines. Using these helpers for the ptrace event stops makes it simple to change or disable the old ptrace implementation of these stops conditionally later. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Alexey Dobriyan	51cc50685a	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexers. Nobody uses this "feature", nor does anybody use the passed kmem cache in a non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:07 -07:00
Nick Piggin	19fd623127	mm: spinlock tree_lock mapping->tree_lock has no read lockers. convert the lock from an rwlock to a spinlock. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	e286781d5f	mm: speculative page references If we can be sure that elevating the page_count on a pagecache page will pin it, we can speculatively run this operation, and subsequently check to see if we hit the right page rather than relying on holding a lock or otherwise pinning a reference to the page. This can be done if get_page/put_page behaves consistently throughout the whole tree (ie. if we "get" the page after it has been used for something else, we must be able to free it with a put_page). Actually, there is a period where the count behaves differently: when the page is free or if it is a constituent page of a compound page. We need an atomic_inc_not_zero operation to ensure we don't try to grab the page in either case. This patch introduces the core locking protocol to the pagecache (ie. adds page_cache_get_speculative, and tweaks some update-side code to make it work). Thanks to Hugh for pointing out an improvement to the algorithm setting page_count to zero when we have control of all references, in order to hold off speculative getters. [kamezawa.hiroyu@jp.fujitsu.com: fix migration_entry_wait()] [hugh@veritas.com: fix add_to_page_cache] [akpm@linux-foundation.org: repair a comment] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	47feff2c8e	radix-tree: add gang_lookup_slot, gang_lookup_slot_tag Introduce gang_lookup_slot() and gang_lookup_slot_tag() functions, which are used by lockless pagecache. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	21cc199baa	mm: introduce get_user_pages_fast Introduce a new get_user_pages_fast mm API, which is basically a get_user_pages with a less general API (but still tends to be suited to the common case): - task and mm are always current and current->mm - force is always 0 - pages is always non-NULL - don't pass back vmas This restricted API can be implemented in a much more scalable way on many architectures when the ptes are present, by walking the page tables locklessly (no mmap_sem or page table locks). When the ptes are not populated, get_user_pages_fast() could be slower. This is implemented locklessly on x86, and used in some key direct IO call sites, in later patches, which provides nearly 10% performance improvement on a threaded database workload. Lots of other code could use this too, depending on use cases (eg. grep drivers/). And it might inspire some new and clever ways to use it. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:05 -07:00
Huang Weiyi	080ccd4573	include/linux/aio.h: removed duplicated include Removed duplicated include <linux/uio.h> in include/linux/aio.h Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	20d8b67c06	relay: add buffer-only channels; useful for early logging Allows one to create and use a channel with no associated files. Files can be initialized later. This is useful in scenarios such as logging in early code, before VFS is up. Therefore, such channels can be created and used as soon as kmem_cache_init() completed. This is needed by kmemtrace to do tracing in early kernel code. [kosaki.motohiro@jp.fujitsu.com: build fix] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	7babe8db99	Full conversion to early_initcall() interface, remove old interface A previous patch added the early_initcall(), to allow a cleaner hooking of pre-SMP initcalls. Now we remove the older interface, converting all existing users to the new one. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: build fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	c2147a5092	Better interface for hooking early initcalls Added early initcall (pre-SMP) support, using an identical interface to that of regular initcalls. Functions called from do_pre_smp_initcalls() could be converted to use this cleaner interface. This is required by CPU hotplug, because early users have to register notifiers before going SMP. One such CPU hotplug user is the relay interface with buffer-only channels, which needs to register such a notifier, to be usable in early code. This in turn is used by kmemtrace. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Huang Ying	89081d17f7	kexec jump: save/restore device state This patch implements devices state save/restore before after kexec. This patch together with features in kexec_jump patch can be used for following: - A simple hibernation implementation without ACPI support. You can kexec a hibernating kernel, save the memory image of original system and shutdown the system. When resuming, you restore the memory image of original system via ordinary kexec load then jump back. - Kernel/system debug through making system snapshot. You can make system snapshot, jump back, do some thing and make another system snapshot. - Cooperative multi-kernel/system. With kexec jump, you can switch between several kernels/systems quickly without boot process except the first time. This appears like swap a whole kernel/system out/in. - A general method to call program in physical mode (paging turning off). This can be used to invoke BIOS code under Linux. The following user-space tools can be used with kexec jump: - kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 - makedumpfile with patches are used as memory image saving tool, it can exclude free pages from original kernel memory image file. 
The patches and the precompiled makedumpfile can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10 - An initramfs image can be used as the root file system of kexeced kernel. An initramfs image built with "BuildRoot" can be downloaded from the following URL: initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz All user space tools above are included in the initramfs image. Usage example of simple hibernation: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_RELOCATABLE=y CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_PM=y CONFIG_HIBERNATION=y CONFIG_KEXEC_JUMP=y 2. Build an initramfs image contains kexec-tool and makedumpfile, or download the pre-built initramfs image, called rootfs.gz in following text. 3. Prepare a partition to save memory image of original kernel, called hibernating partition in following text. 4. Boot kernel compiled in step 1 (kernel A). 5. In the kernel A, load kernel compiled in step 1 (kernel B) with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000 --mem-max=0xffffff --initrd=rootfs.gz 6. Boot the kernel B with following shell command line: /sbin/kexec -e 7. The kernel B will boot as normal kexec. In kernel B the memory image of kernel A can be saved into hibernating partition as follow: jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='` echo $jump_back_entry > kexec_jump_back_entry cp /proc/vmcore dump.elf Then you can shutdown the machine as normal. 8. Boot kernel compiled in step 1 (kernel C). 
Use the rootfs.gz as root file system. 9. In kernel C, load the memory image of kernel A as follow: /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf 10. Jump back to the kernel A as follow: /sbin/kexec -e Then, kernel A is resumed. Implementation point: To support jumping between two kernels, before jumping to (executing) the new kernel and jumping back to the original kernel, the devices are put into quiescent state, and the state of devices and CPU is saved. After jumping back from kexeced kernel and jumping to the new kernel, the state of devices and CPU are restored accordingly. The devices/CPU state save/restore code of software suspend is called to implement corresponding function. Known issues: - Because the segment number supported by sys_kexec_load is limited, hibernation image with many segments may not be load. This is planned to be eliminated by adding a new flag to sys_kexec_load to make a image can be loaded with multiple sys_kexec_load invoking. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Huang Ying	3ab8352137	kexec jump This patch provides an enhancement to kexec/kdump. It implements the following features: - Backup/restore memory used by the original kernel before/after kexec. - Save/restore CPU state before/after kexec. The features of this patch can be used as a general method to call program in physical mode (paging turning off). This can be used to call BIOS code under Linux. kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 Usage example of calling some physical mode code and return: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_KEXEC=y CONFIG_PM=y CONFIG_KEXEC_JUMP=y 2. Build patched kexec-tool or download the pre-built one. 3. Build some physical mode executable named such as "phy_mode" 4. Boot kernel compiled in step 1. 5. Load physical mode executable with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context --args-none phy_mode 6. Call physical mode executable with following shell command line: /sbin/kexec -e Implementation point: To support jumping without reserving memory. One shadow backup page (source page) is allocated for each page used by kexeced code image (destination page). When do kexec_load, the image of kexeced code is loaded into source pages, and before executing, the destination pages and the source pages are swapped, so the contents of destination pages are backupped. Before jumping to the kexeced code image and after jumping back to the original kernel, the destination pages and the source pages are swapped too. 
C ABI (calling convention) is used as communication protocol between kernel and called code. A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to indicate that the loaded kernel image is used for jumping back. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Alex Dubov	17017d8d2c	memstick: add "start" and "stop" methods to memstick device In some cases it may be desirable to ensure that associated driver is not going to access the media in some period of time. "start" and "stop" methods are provided therefore to allow it. Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Alex Dubov	b77899985b	memstick: allow "set_param" method to return an error code Some controllers (Jmicron, for instance) can report temporal failure condition during power-on. It is desirable to account for this using a return value of "set_param" device method. The return value can also be handy to distinguish between supported and unsupported device parameters in run time. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Adrian Bunk	929dfb24fb	parport/share.c: proper externs This patch adds proper externs for parport_default_timeslice and parport_default_spintime in include/linux/parport.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. 
[akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Andrew Morton	16d69265b9	uninline arch_pick_mmap_layout() Fix this, on avr32: include/linux/utsname.h:35, from init/main.c:20: include/linux/sched.h: In function 'arch_pick_mmap_layout': include/linux/sched.h:2149: error: implicit declaration of function 'PAGE_ALIGN' Reported-by: Adrian Bunk <bunk@kernel.org> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:01 -07:00
Mauro Carvalho Chehab	fdd2a7e2da	V4L/DVB (8500a): videotext.h: whitespace cleanup Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-26 13:25:25 -03:00
Mike Travis	b8d317d10c	cpumask: make cpumask_of_cpu_map generic If an arch doesn't define cpumask_of_cpu_map, create a generic statically-initialized one for them. This allows removal of the buggy cpumask_of_cpu() macro (&cpumask_of_cpu() gives address of out-of-scope var). An arch with NR_CPUS of 4096 probably wants to allocate this itself based on the actual number of CPUs, since otherwise they're using 2MB of rodata (1024 cpus means 128k). That's what CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the moment). In future as we support more CPUs, we'll need to resort to a get_cpu_map()/put_cpu_map() allocation scheme. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:40:32 +02:00
Ingo Molnar	10d3285d0b	Merge branch 'x86/urgent' into x86/core Conflicts: include/asm-x86/gpio.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:30:19 +02:00
Ingo Molnar	6dec3a10a7	Merge branch 'x86/x2apic' into x86/core Conflicts: include/asm-x86/i8259.h include/asm-x86/msidef.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:29:23 +02:00
Yinghai Lu	0af36739af	usb: move ehci reg def prepare x86: usb debug port early console move ehci struct def to linux/usb/ehci_def.h from host/ehci.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: "Arjan van de Ven" <arjan@infradead.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Greg KH" <greg@kroah.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:17:01 +02:00
Joerg Roedel	3bc9f79ee1	iommu: add iommu_num_pages helper function Calculating the number of pages from given address and length numbers is a task required in multiple IOMMU implementations. So implement this as a generic function into the IOMMU helper code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: iommu@lists.linux-foundation.org Cc: bhavna.sarathy@amd.com Cc: robert.richter@amd.com Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:43:05 +02:00
Robert Richter	ee648bc77f	OProfile: add IBS code macros Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:48:04 +02:00
Robert Richter	021f8b75e7	x86: add PCI IDs for AMD Barcelona PCI devices Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:47:59 +02:00
Ingo Molnar	36ac26171a	crashdump: fix undefined reference to `elfcorehdr_addr' fix build bug introduced by `95b68dec0d` "calgary iommu: use the first kernels TCE tables in kdump": arch/x86/kernel/built-in.o: In function `calgary_iommu_init': (.init.text+0x8399): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `calgary_iommu_init': (.init.text+0x856c): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `detect_calgary': (.init.text+0x8c68): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `detect_calgary': (.init.text+0x8d0c): undefined reference to `elfcorehdr_addr' make elfcorehdr_addr a generally available symbol. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:26:23 +02:00
Ilpo Järvinen	ec34c702ca	net: drop unused BUG_TRAP() Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 21:45:49 -07:00
Grant Likely	284b018973	spi: Add OF binding support for SPI busses This patch adds support for populating an SPI bus based on data in the OF device tree. This is useful for powerpc platforms which use the device tree instead of discrete code for describing platform layout. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2008-07-25 22:34:40 -04:00
Grant Likely	dc87c98e8f	spi: split up spi_new_device() to allow two stage registration. spi_new_device() allocates and registers an spi device all in one swoop. If the driver needs to add extra data to the spi_device before it is registered, then this causes problems. This is needed for OF device tree support so that the SPI device tree helper can add a pointer to the device node after the device is allocated, but before the device is registered. OF aware SPI devices can then retrieve data out of the device node to populate a platform data structure. This patch splits the allocation and registration portions of code out of spi_new_device() and creates two new functions; spi_alloc_device() and spi_register_device(). spi_new_device() is modified to use the new functions for allocation and registration. None of the existing users of spi_new_device() should be affected by this change. Drivers using the new API can forego the use of spi_board_info structure to describe the device layout and populate data into the spi_device structure directly. This change is in preparation for adding an OF device tree parser to generate spi_devices based on data in the device tree. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David Brownell <dbrownell@users.sourceforge.net>	2008-07-25 22:34:29 -04:00
Grant Likely	3f07af494d	of: adapt of_find_i2c_driver() to be usable by SPI also SPI has a similar problem as I2C in that it needs to determine an appropriate modalias value for each device node. This patch adapts the of_i2c of_find_i2c_driver() function to be usable by of_spi also. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2008-07-25 22:25:13 -04:00
Harvey Harrison	b4615e69b6	sys_paccept definition missing __user annotation Introduced by commit `aaca0bdca5` ("flag parameters: paccept"): net/socket.c:1515:17: error: symbol 'sys_paccept' redeclared with different type (originally declared at include/linux/syscalls.h:413) - incompatible argument 4 (different address spaces) Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 17:28:49 -07:00
Linus Torvalds	ff5d48a6d1	Merge git://git.infradead.org/embedded-2.6 * git://git.infradead.org/embedded-2.6: Make console charset translation optional	2008-07-25 12:02:08 -07:00
Linus Torvalds	762b8291be	Merge git://git.infradead.org/~dwmw2/random-2.6 * git://git.infradead.org/~dwmw2/random-2.6: remove dummy asm/kvm.h files firmware: create firmware binaries during 'make modules'.	2008-07-25 12:01:37 -07:00
Johannes Weiner	c6af5e9f8a	bootmem: Move node allocation macros back to !HAVE_ARCH_BOOTMEM_NODE These got unintentionally moved, put them back as x86 provides its own versions. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 11:36:44 -07:00
Adrian Bunk	7dcf2a9fce	remove dummy asm/kvm.h files This patch removes the dummy asm/kvm.h files on architectures not (yet) supporting KVM and uses the same conditional headers installation as already used for a.out.h . Also removed are superfluous install rules in the s390 and x86 Kbuild files (they are already in Kbuild.asm). Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-25 14:35:50 -04:00
Linus Torvalds	5047887caf	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc: Wireup new syscalls Move update_mmu_cache() declaration from tlbflush.h to pgtable.h powerpc/pseries: Remove kmalloc call in handling writes to lparcfg powerpc/pseries: Update arch vector to indicate support for CMO ibmvfc: Add support for collaborative memory overcommit ibmvscsi: driver enablement for CMO ibmveth: enable driver for CMO ibmveth: Automatically enable larger rx buffer pools for larger mtu powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O powerpc/pseries: vio bus support for CMO powerpc/pseries: iommu enablement for CMO powerpc/pseries: Add CMO paging statistics powerpc/pseries: Add collaborative memory manager powerpc/pseries: Utilities to set firmware page state powerpc/pseries: Enable CMO feature during platform setup powerpc/pseries: Split retrieval of processor entitlement data into a helper routine powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg powerpc: Fix compile error with binutils 2.15 ... Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually.	2008-07-25 11:08:17 -07:00
Linus Torvalds	996abf053e	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6: (22 commits) UBI: always start the background thread UBI: fix gcc warning UBI: remove pre-sqnum images support UBI: fix kernel-doc errors and warnings UBI: fix checkpatch.pl errors and warnings UBI: bugfix - do not torture PEB needlessly UBI: rework scrubbing messages UBI: implement multiple volumes rename UBI: fix and re-work debugging stuff UBI: amend commentaries UBI: fix error message UBI: improve mkvol request validation UBI: add ubi_sync() interface UBI: fix 64-bit calculations UBI: fix LEB locking UBI: fix memory leak on error path UBI: do not forget to free internal volumes UBI: fix memory leak UBI: avoid unnecessary division operations UBI: fix buffer padding ...	2008-07-25 11:02:17 -07:00
Arthur Jones	8f421c595a	edac: i5100 new intel chipset driver Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters. Reasons why this is preliminary: 1) This chip has 2 independent memory controllers which, for best performance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers. 2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, esp since it doesn't work yet... 3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of. 4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode. 5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work. 
Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	33670fa296	fuse: nfs export special lookups Implement the get_parent export operation by sending a LOOKUP request with ".." as the name. Implement looking up an inode by node ID after it has been evicted from the cache. This is done by sending a LOOKUP request with "." as the name (for all file types, not just directories). The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups. Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	bde74e4bc6	locks: add special return value for asynchronous locks Use a special error value FILE_LOCK_DEFERRED to mean that a locking operation returned asynchronously. This is returned by posix_lock_file() for sleeping locks to mean that the lock has been queued on the block list, and will be woken up when it might become available and needs to be retried (either fl_lmops->fl_notify() is called or fl_wait is woken up). f_op->lock() to mean either the above, or that the filesystem will call back with fl_lmops->fl_grant() when the result of the locking operation is known. The filesystem can do this for sleeping as well as non-sleeping locks. This is to make sure, that return values of -EAGAIN and -EINPROGRESS by filesystems are not mistaken to mean an asynchronous locking. This also makes error handling in fs/locks.c and lockd/svclock.c slightly cleaner. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Keika Kobayashi	016ae219b9	per-task-delay-accounting: update taskstats for memory reclaim delay Add members for memory reclaim delay to taskstats, and accumulate them in __delayacct_add_tsk() . Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp> Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Keika Kobayashi	873b477177	per-task-delay-accounting: add memory reclaim delay Sometimes, application responses become bad under heavy memory load. Applications take a bit time to reclaim memory. The statistics, how long memory reclaim takes, will be useful to measure memory usage. This patch adds accounting memory reclaim to per-task-delay-accounting for accounting the time of do_try_to_free_pages(). <i.e> - When System is under low memory load, memory reclaim may not occur. $ free total used free shared buffers cached Mem: 8197800 1577300 6620500 0 4808 1516724 -/+ buffers/cache: 55768 8142032 Swap: 16386292 0 16386292 $ vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 5069748 10612 3014060 0 0 0 0 3 26 0 0 100 0 0 0 0 5069748 10612 3014060 0 0 0 0 4 22 0 0 100 0 0 0 0 5069748 10612 3014060 0 0 0 0 3 18 0 0 100 0 Measure the time of tar command. $ ls -s test.dat 1501472 test.dat $ time tar cvf test.tar test.dat real 0m13.388s user 0m0.116s sys 0m5.304s $ ./delayget -d -p <pid> CPU count real total virtual total delay total 428 5528345500 5477116080 62749891 IO count delay total 338 8078977189 SWAP count delay total 0 0 RECLAIM count delay total 0 0 - When system is under heavy memory load memory reclaim may occur. $ vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 7159032 49724 1812 3012 0 0 0 0 3 24 0 0 100 0 0 0 7159032 49724 1812 3012 0 0 0 0 4 24 0 0 100 0 0 0 7159032 49848 1812 3012 0 0 0 0 3 22 0 0 100 0 In this case, one process uses more 8G memory by execution of malloc() and memset(). 
$ time tar cvf test.tar test.dat real 1m38.563s <- increased by 85 sec user 0m0.140s sys 0m7.060s $ ./delayget -d -p <pid> CPU count real total virtual total delay total 9021 7140446250 7315277975 923201824 IO count delay total 8965 90466349669 SWAP count delay total 3 21036367 RECLAIM count delay total 740 61011951153 In the later case, the value of RECLAIM is increasing. So, taskstats can show how much memory reclaim influences TAT. Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujistu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Andrea Righi	297c5d9263	task IO accounting: provide distinct tgid/tid I/O statistics Report per-thread I/O statistics in /proc/pid/task/tid/io and aggregate parent I/O statistics in /proc/pid/io. This approach follows the same model used to account per-process and per-thread CPU times. As a practical application, this allows for example to quickly find the top I/O consumer when a process spawns many child threads that perform the actual I/O work, because the aggregated I/O statistics can always be found in /proc/pid/io. [ Oleg Nesterov points out that we should check that the task is still alive before we iterate over the threads, but also says that we can do that fixup on top of this later. - Linus ] Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: Matt Heaton <matt@hostmonster.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Acked-by-with-comments: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Pavel Emelyanov	0b6b030fc3	bsdacct: switch from global bsd_acct_struct instance to per-pidns one Allocate the structure on the first call to sys_acct(). After this each namespace, that ordered the accounting, will live with this structure till its own death. Two notes - routines, that close the accounting on fs umount time use the init_pid_ns's acct by now; - accounting routine accounts to dying task's namespace (also by now). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Pavel Emelyanov	20fad13ac6	pidns: add the struct bsd_acct_struct pointer on pid_namespace struct All the bsdacct-related info will be stored in the area pointed to by this one. It will be NULL automatically for all new namespaces. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:46 -07:00
Jonathan Lim	49b5cf3472	accounting: account for user time when updating memory integrals Adapt acct_update_integrals() to include user time when calculating the time difference. The units of acct_rss_mem1 and acct_vm_mem1 are also changed from pages-jiffies to pages-usecs to avoid calling jiffies_to_usecs() in xacct_add_tsk() which might overflow. Signed-off-by: Jonathan Lim <jlim@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:46 -07:00
Pavel Emelyanov	dbda0de526	pidns: remove find_task_by_pid, unused for a long time It seems to me that it was a mistake marking this function as deprecated and scheduling it for removal, rather than resolutely removing it after the last caller's death. Anyway - better late than never. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Pavel Emelyanov	e49859e71e	pidns: remove now unused find_pid function. This one had the only users so far - the kill_proc, which is removed, so drop this (invalid in namespaced world) call too. And of course - erase all references on it from comments. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Pavel Emelyanov	19b0cfcca4	pidns: remove now unused kill_proc function This function operated on a pid_t to kill a task, which is no longer valid in a containerized system. It has finally lost all its users and we can safely remove it from the tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Richard Kennedy	33166b1ffc	shrink struct pid by removing padding on 64 bit builds When struct pid is built on a 64 bit platform gcc has to insert padding to maintain the correct alignment, by simply reordering its members the memory usage shrinks from 88 bytes to 80. I've successfully run with this patch on my desktop AMD64 machine. There are no significant kernel size changes to a default config.X86_64 on the latest git v2.6.26-rc1 text data bss dec hex filename 5404828 976760 734280 7115868 6c945c vmlinux 5404811 976760 734280 7115851 6c944b vmlinux.pid-patch Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
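Illustration for 33166b1ffc (not part of the commit): a minimal user-space sketch of how reordering members can remove alignment padding. The two structs below are hypothetical and are not the kernel's struct pid; the 8-byte saving assumes an ABI with 8-byte aligned pointers, such as x86_64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout with holes: NOT the kernel's struct pid. */
struct padded {
    int   count;   /* 4 bytes, followed by a 4-byte hole on LP64 */
    void *first;   /* 8-byte aligned pointer */
    int   level;   /* 4 bytes, followed by 4 bytes of tail padding on LP64 */
};

/* Same members, reordered so the two ints share one 8-byte slot. */
struct reordered {
    int   count;
    int   level;
    void *first;
};

/* Reordering must never make the struct larger. */
static_assert(sizeof(struct reordered) <= sizeof(struct padded),
              "reordering should not grow the struct");

/* Bytes saved by the reordering on the current ABI (0 on typical ILP32). */
size_t padding_saved(void)
{
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

On a typical 64-bit build padding_saved() is expected to return 8, which matches the size of the saving the commit message reports for struct pid (88 to 80 bytes), although the members involved there are different.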
Adrian Bunk	3ae4eed34b	proper pid{hash,map}_init() prototypes This patch adds proper prototypes for pid{hash,map}_init() in include/linux/pid_namespace.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Alexey Dobriyan	881adb8535	proc: always do ->release Current two-stage scheme of removing PDE emphasizes one bug in proc: open rmmod remove_proc_entry close ->release won't be called because ->proc_fops were cleared. In simple cases it's a small memory leak. For every ->open, ->release has to be done. List of openers is introduced which is traversed at remove_proc_entry() if needed. Discussions with Al long ago (sigh). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Adrian Bunk	6e644c3126	move proc_kmsg_operations to fs/proc/internal.h This patch moves the extern of struct proc_kmsg_operations to fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c so that the latter sees the former. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Abdel Benamrouche	d805dda412	fs/partition/check.c: fix return value warning fs/partitions/check.c:381: warning: ignoring return value of 'device_add', declared with attribute warn_unused_result [akpm@linux-foundation.org: multiple-return-statements-per-function are evil] Signed-off-by: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Nadia Derbey	9eefe520c8	ipc: do not use a negative value to re-enable msgmni automatic recomputing This patch proposes an alternative to the "magical positive-versus-negative number trick" Andrew complained about last week in http://lkml.org/lkml/2008/6/24/418. This had been introduced with the patches that scale msgmni to the amount of lowmem. With these patches, msgmni has a registered notification routine that recomputes msgmni value upon memory add/remove or ipc namespace creation/ removal. When msgmni is changed from user space (i.e. value written to the proc file), that notification routine is unregistered, and the way to make it registered back is to write a negative value into the proc file. This is the "magical positive-versus-negative number trick". To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni. This file acts as ON/OFF for msgmni automatic recomputing. With this patch, the process is the following: 1) kernel boots in "automatic recomputing mode" /proc/sys/kernel/msgmni contains the value that has been computed (depends on lowmem) /proc/sys/kernel/auto_msgmni contains "1" 2) echo <val> > /proc/sys/kernel/msgmni . sets msg_ctlmni to <val> . de-activates automatic recomputing (i.e. if, say, some memory is added msgmni won't be recomputed anymore) . /proc/sys/kernel/auto_msgmni now contains "0" 3) echo "0" > /proc/sys/kernel/auto_msgmni . de-activates msgmni automatic recomputing (this has the same effect as 2) except that msg_ctlmni's value stays blocked at its current value) 4) echo "1" > /proc/sys/kernel/auto_msgmni . recomputes msgmni's value based on the current available memory size and number of ipc namespaces . re-activates automatic recomputing for msgmni. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
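Illustration for 9eefe520c8 (not part of the commit): a minimal user-space model of the ON/OFF behaviour the message walks through. All names are hypothetical, and recompute_msgmni() is a placeholder formula; the kernel's real computation depends on lowmem and the number of ipc namespaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two proc files; this is not kernel code. */
struct msgmni_ctl {
    int  msgmni;       /* models /proc/sys/kernel/msgmni */
    bool auto_msgmni;  /* models /proc/sys/kernel/auto_msgmni */
};

/* Placeholder formula, assumed only for this sketch. */
int recompute_msgmni(int lowmem_mb)
{
    return lowmem_mb * 32;
}

/* Kernel boots in automatic recomputing mode. */
void msgmni_boot(struct msgmni_ctl *c, int lowmem_mb)
{
    assert(c != NULL);
    c->msgmni = recompute_msgmni(lowmem_mb);
    c->auto_msgmni = true;
}

/* Writing a value to msgmni sets it and switches recomputing off. */
void msgmni_write_value(struct msgmni_ctl *c, int val)
{
    assert(c != NULL);
    c->msgmni = val;
    c->auto_msgmni = false;
}

/* Writing 0 freezes the current value; writing 1 recomputes and re-enables. */
void msgmni_write_auto(struct msgmni_ctl *c, bool on, int lowmem_mb)
{
    assert(c != NULL);
    c->auto_msgmni = on;
    if (on)
        c->msgmni = recompute_msgmni(lowmem_mb);
}

/* Memory add/remove notification: only acts while recomputing is on. */
void msgmni_memory_changed(struct msgmni_ctl *c, int lowmem_mb)
{
    assert(c != NULL);
    if (c->auto_msgmni)
        c->msgmni = recompute_msgmni(lowmem_mb);
}
```

The point of the separate flag is that no msgmni value has a hidden second meaning: any integer written to msgmni is just a limit, and the mode is switched through its own file.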
Manfred Spraul	380af1b33b	ipc/sem.c: rewrite undo list locking The attached patch: - reverses the locking order of ulp->lock and sem_lock: Previously, it was first ulp->lock, then inside sem_lock. Now it's the other way around. - converts the undo structure to rcu. Benefits: - With the old locking order, IPC_RMID could not kfree the undo structures. The stale entries remained in the linked lists and were released later. - The patch fixes a race in semtimedop(): if both IPC_RMID and a semget() that recreates exactly the same id happen between find_alloc_undo() and sem_lock, then semtimedop() would access already kfree'd memory. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	a1193f8ec0	ipc/sem.c: convert sem_array.sem_pending to struct list_head sem_array.sem_pending is a double linked list, the attached patch converts it to struct list_head. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	2c0c29d414	ipc/sem.c: remove unused entries from struct sem_queue sem_queue.sma and sem_queue.id were never used, the attached patch removes them. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	4daa28f6d8	ipc/sem.c: convert undo structures to struct list_head The undo structures contain two linked lists, the attached patch replaces them with generic struct list_head lists. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Nadia Derbey	f9c46d6ea5	idr: make idr_find rcu-safe Make idr_find rcu-safe: it can now be called inside an rcu_read critical section. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Nadia Derbey	944ca05c7b	idr: error checking factorization Do some code factorization in the return code analysis. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Nadia Derbey	2027d1abc2	idr: change the idr structure After scalability problems have been detected when using the sysV ipcs, I have proposed to use an RCU based implementation of the IDR api instead (see threads http://lkml.org/lkml/2008/4/11/212 and http://lkml.org/lkml/2008/4/29/295). This resulted in many people asking to convert the idr API and make it rcu safe (because most of the code was duplicated and thus unmaintainable and unreviewable). So here is a first attempt. The important change wrt the idr API itself is during idr removes: idr layers are freed after a grace period, instead of being moved to the free list. The important change wrt ipcs is that idr_find() can now be called locklessly inside a rcu read critical section. Here are the results I've got for the pmsg test sent by Manfred: 2.6.25-rc3-mm1 2.6.25-rc3-mm1+ 2.6.25-mm1 Patched 2.6.25-mm1 1 1168441 1064021 876000 947488 2 1094264 921059 1549592 1730685 3 2082520 1738165 1694370 2324880 4 2079929 1695521 404553 2400408 5 2898758 406566 391283 3246580 6 2921417 261275 263249 3752148 7 3308761 126056 191742 4243142 8 3329456 100129 141722 4275780 1st column: stock 2.6.25-rc3-mm1 2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs) 3rd column: stock 2.6.25-mm1 4th column: 2.6.25-mm1 + this patch series. This patch: Add an rcu_head to the idr_layer structure in order to free it after a grace period. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Chandru	95b68dec0d	calgary iommu: use the first kernels TCE tables in kdump kdump kernel fails to boot with calgary iommu and aacraid driver on a x366 box. The ongoing dma's of aacraid from the first kernel continue to exist until the driver is loaded in the kdump kernel. Calgary is initialized prior to aacraid and creation of new tce tables causes wrong dma's to occur. Here we try to get the tce tables of the first kernel in kdump kernel and use them. While in the kdump kernel we do not allocate new tce tables but instead read the base address register contents of calgary iommu and use the tables that the registers point to. With these changes the kdump kernel and hence aacraid now boots normally. Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Oleg Nesterov	3da1c84c00	workqueues: make get_online_cpus() useable for work->func() workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under cpu_maps_update_begin(). This means that the multithreaded workqueues can't use get_online_cpus() due to the possible deadlock, very bad and very old problem. Introduce the new state, CPU_POST_DEAD, which is called after cpu_hotplug_done() but before cpu_maps_update_done(). Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD. This means that create/destroy functions can't rely on get_online_cpus() any longer and should take cpu_add_remove_lock instead. [akpm@linux-foundation.org: fix CONFIG_SMP=n] Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	db70089722	workqueues: implement flush_work() Most users of flush_workqueue() can be changed to use cancel_work_sync(), but sometimes we really need to wait for the completion and cancelling is not an option. schedule_on_each_cpu() is a good example. Add the new helper, flush_work(work), which waits for the completion of the specific work_struct. More precisely, it "flushes" the result of the last queue_work() which is visible to the caller. For example, this code queue_work(wq, work); /* WINDOW */ queue_work(wq, work); flush_work(work); doesn't necessarily work "as expected". What can happen in the WINDOW above is - wq starts the execution of work->func() - the caller migrates to another CPU now, after the 2nd queue_work() this work is active on the previous CPU, and at the same time it is queued on another. In this case flush_work(work) may return before the first work->func() completes. It is trivial to add another helper int flush_work_sync(struct work_struct *work) { return flush_work(work) || wait_on_work(work); } which works "more correctly", but it has to iterate over all CPUs and thus it is much slower than flush_work(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Acked-by: Jarek Poplawski <jarkao2@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	a94e2d408e	coredump: kill mm->core_done Now that we have core_state->dumper list we can use it to wake up the sub-threads waiting for the coredump completion. This uglifies the code and .text grows by 47 bytes, but otoh mm_struct lessens by sizeof(struct completion). Also, with this change we can decouple exit_mm() from the coredumping code. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	b564daf806	coredump: construct the list of coredumping threads at startup time binfmt->core_dump() has to iterate over all the threads in the system in order to find the coredumping threads and construct the list using GFP_ATOMIC allocations. With this patch each thread allocates the list node on exit_mm()'s stack and adds itself to the list. This allows us to do further changes: - simplify ->core_dump() - change exit_mm() to clear ->mm first, then wait for ->core_done. this makes the coredumping process visible to oom_kill - kill mm->core_done Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	c5f1cc8c18	coredump: turn core_state->nr_threads into atomic_t Turn core_state->nr_threads into atomic_t and kill now unneeded down_write(&mm->mmap_sem) in exit_mm(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	999d9fc167	coredump: move mm->core_waiters into struct core_state Move mm->core_waiters into "struct core_state" allocated on stack. This shrinks mm_struct a little bit and allows further changes. This patch mostly does s/core_waiters/core_state. The only essential change is that coredump_wait() must clear mm->core_state before return. The coredump_wait()'s path is uglified and .text grows by 30 bytes, this is fixed by the next patch. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	32ecb1f26d	coredump: turn mm->core_startup_done into the pointer to struct core_state mm->core_startup_done points to "struct completion startup_done" allocated on the coredump_wait()'s stack. Introduce the new structure, core_state, which holds this "struct completion". This way we can add more info visible to the threads participating in coredump without enlarging mm_struct. No changes in affected .o files. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	246bb0b1de	kill PF_BORROWED_MM in favour of PF_KTHREAD Kill PF_BORROWED_MM. Change use_mm/unuse_mm to not play with ->flags, and do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users. No functional changes yet. But this allows us to do further fixes/cleanups. oom_kill/ptrace/etc often check "p->mm != NULL" to filter out the kthreads, this is wrong because of use_mm(). The problem with PF_BORROWED_MM is that we need task_lock() to avoid races. With this patch we can check PF_KTHREAD directly, or use a simple lockless helper: /* The result must not be dereferenced !!! */ struct mm_struct *__get_task_mm(struct task_struct *tsk) { if (tsk->flags & PF_KTHREAD) return NULL; return tsk->mm; } Note also ecard_task(). It runs with ->mm != NULL, but it's the kernel thread without PF_BORROWED_MM. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
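Illustration for 246bb0b1de (not part of the commit): a minimal user-space sketch of why testing a flag is more reliable than testing "p->mm != NULL" for filtering out kernel threads. The struct definitions are stripped-down stand-ins, task_user_mm() is a hypothetical name for the lockless helper described in the message, and the flag value is an assumption used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Flag value assumed for illustration only. */
#define PF_KTHREAD 0x00200000u

/* Stripped-down stand-ins for the kernel structures. */
struct mm_struct {
    int dummy;
};

struct task_struct {
    unsigned int      flags;
    struct mm_struct *mm;   /* may be non-NULL for a kthread after use_mm() */
};

/*
 * Hypothetical equivalent of the helper in the commit message.
 * The result must not be dereferenced; it only answers the question
 * "does this task have a user address space of its own?".
 */
struct mm_struct *task_user_mm(const struct task_struct *tsk)
{
    assert(tsk != NULL);
    if (tsk->flags & PF_KTHREAD)
        return NULL;
    return tsk->mm;
}
```

A kernel thread that has temporarily borrowed an mm has a non-NULL ->mm, so a plain pointer test would misclassify it as a user task, while the flag test does not.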
Oleg Nesterov	7b34e4283c	introduce PF_KTHREAD flag Introduce the new PF_KTHREAD flag to mark the kernel threads. It is set by INIT_TASK() and copied to the forked children (we could set it in kthreadd() along with PF_NOFREEZE instead). daemonize() was changed as well. In that case testing of PF_KTHREAD is racy, but daemonize() is hopeless anyway. This flag is cleared in do_execve(), before search_binary_handler(). Probably not the best place, we can do this in exec_mmap() or in start_thread(), or clear it along with PF_FORKNOEXEC. But I think this doesn't matter in practice, and if do_execve() fails kthread should die soon. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	364d3c13c1	ptrace: give more respect to SIGKILL ptrace_stop() has some complicated checks to prevent the scheduling in the TASK_TRACED state with the pending SIGKILL, but these checks are racy, and they depend on arch_ptrace_stop_needed(). This patch assumes that the traced task should die asap if it was killed by SIGKILL, in that case schedule()->signal_pending_state() has no reason to ignore the TASK_WAKEKILL part of TASK_TRACED, and we can kill this nasty special case. Note: do_exit()->ptrace_notify() is special, the killed task can already dequeue SIGKILL at this point. Another indication that fatal_signal_pending() is not exactly right. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
KAMEZAWA Hiroyuki	12b9804419	res_counter: limit change support ebusy Add an interface to set limit. This is necessary to memory resource controller because it shrinks usage at set limit. Other controllers may not need this interface to shrink usage because shrinking is not necessary or impossible. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	c9b0ed5148	memcg: helper function for reclaim from shmem. A new call, mem_cgroup_shrink_usage() is added for shmem handling, replacing non-standard usage of mem_cgroup_charge/uncharge. Now, shmem calls mem_cgroup_charge() just to reclaim some pages from mem_cgroup. In general, shmem is used by some process group and not for global resource (like file caches). So, it's reasonable to reclaim pages from mem_cgroup where shmem is mainly used. [hugh@veritas.com: shmem_getpage release page sooner] [hugh@veritas.com: mem_cgroup_shrink_usage css_put] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	69029cd550	memcg: remove refcnt from page_cgroup memcg: performance improvements Patch Description 1/5 ... remove refcnt from page_cgroup patch (shmem handling is fixed) 2/5 ... swapcache handling patch 3/5 ... add helper function for shmem's memory reclaim patch 4/5 ... optimize by likely/unlikely patch 5/5 ... remove redundant check patch (shmem handling is fixed.) Unix bench result. == 2.6.26-rc2-mm1 + memory resource controller Execl Throughput 2915.4 lps (29.6 secs, 3 samples) C Compiler Throughput 1019.3 lpm (60.0 secs, 3 samples) Shell Scripts (1 concurrent) 5796.0 lpm (60.0 secs, 3 samples) Shell Scripts (8 concurrent) 1097.7 lpm (60.0 secs, 3 samples) Shell Scripts (16 concurrent) 565.3 lpm (60.0 secs, 3 samples) File Read 1024 bufsize 2000 maxblocks 1022128.0 KBps (30.0 secs, 3 samples) File Write 1024 bufsize 2000 maxblocks 544057.0 KBps (30.0 secs, 3 samples) File Copy 1024 bufsize 2000 maxblocks 346481.0 KBps (30.0 secs, 3 samples) File Read 256 bufsize 500 maxblocks 319325.0 KBps (30.0 secs, 3 samples) File Write 256 bufsize 500 maxblocks 148788.0 KBps (30.0 secs, 3 samples) File Copy 256 bufsize 500 maxblocks 99051.0 KBps (30.0 secs, 3 samples) File Read 4096 bufsize 8000 maxblocks 2058917.0 KBps (30.0 secs, 3 samples) File Write 4096 bufsize 8000 maxblocks 1606109.0 KBps (30.0 secs, 3 samples) File Copy 4096 bufsize 8000 maxblocks 854789.0 KBps (30.0 secs, 3 samples) Dc: sqrt(2) to 99 decimal places 126145.2 lpm (30.0 secs, 3 samples) INDEX VALUES TEST BASELINE RESULT INDEX Execl Throughput 43.0 2915.4 678.0 File Copy 1024 bufsize 2000 maxblocks 3960.0 346481.0 875.0 File Copy 256 bufsize 500 maxblocks 1655.0 99051.0 598.5 File Copy 4096 bufsize 8000 maxblocks 5800.0 854789.0 1473.8 Shell Scripts (8 concurrent) 6.0 1097.7 1829.5 ========= FINAL SCORE 991.3 == 2.6.26-rc2-mm1 + this set == Execl Throughput 3012.9 lps (29.9 secs, 3 samples) C Compiler Throughput 981.0 lpm (60.0 secs, 3 samples) Shell Scripts (1 concurrent) 5872.0 lpm 
(60.0 secs, 3 samples) Shell Scripts (8 concurrent) 1120.3 lpm (60.0 secs, 3 samples) Shell Scripts (16 concurrent) 578.0 lpm (60.0 secs, 3 samples) File Read 1024 bufsize 2000 maxblocks 1003993.0 KBps (30.0 secs, 3 samples) File Write 1024 bufsize 2000 maxblocks 550452.0 KBps (30.0 secs, 3 samples) File Copy 1024 bufsize 2000 maxblocks 347159.0 KBps (30.0 secs, 3 samples) File Read 256 bufsize 500 maxblocks 314644.0 KBps (30.0 secs, 3 samples) File Write 256 bufsize 500 maxblocks 151852.0 KBps (30.0 secs, 3 samples) File Copy 256 bufsize 500 maxblocks 101000.0 KBps (30.0 secs, 3 samples) File Read 4096 bufsize 8000 maxblocks 2033256.0 KBps (30.0 secs, 3 samples) File Write 4096 bufsize 8000 maxblocks 1611814.0 KBps (30.0 secs, 3 samples) File Copy 4096 bufsize 8000 maxblocks 847979.0 KBps (30.0 secs, 3 samples) Dc: sqrt(2) to 99 decimal places 128148.7 lpm (30.0 secs, 3 samples) INDEX VALUES TEST BASELINE RESULT INDEX Execl Throughput 43.0 3012.9 700.7 File Copy 1024 bufsize 2000 maxblocks 3960.0 347159.0 876.7 File Copy 256 bufsize 500 maxblocks 1655.0 101000.0 610.3 File Copy 4096 bufsize 8000 maxblocks 5800.0 847979.0 1462.0 Shell Scripts (8 concurrent) 6.0 1120.3 1867.2 ========= FINAL SCORE 1004.6 This patch: Remove refcnt from page_cgroup(). After this, * A page is charged only when !page_mapped() && no page_cgroup is assigned. * Anon page is newly mapped. * File page is added to mapping->tree. * A page is uncharged only when * Anon page is fully unmapped. * File page is removed from LRU. There is no change in behavior from user's view. This patch also removes unnecessary calls in rmap.c which were used only for refcnt management. [akpm@linux-foundation.org: fix warning] [hugh@veritas.com: fix shmem_unuse_inode charging] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. 
Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	e8589cc189	memcg: better migration handling This patch changes page migration under memory controller to use a different algorithm. (thanks to Christoph for new idea.) Before: - page_cgroup is migrated from an old page to a new page. After: - a new page is accounted, no reuse of page_cgroup. Pros: - We can avoid complicated lock dependencies and races in migration. Cons: - new param to mem_cgroup_charge_common(). - mem_cgroup_getref() is added for handling ref_cnt ping-pong. This version simplifies complicated lock dependency in page migration under memory resource controller. new refcnt sequence is following. a mapped page: prepage_migration() ..... +1 to NEW page try_to_unmap() ..... all refs to OLD page are gone. move_pages() ..... +1 to NEW page if page cache. remap... ..... all refs from map are added to NEW one. end_migration() ..... -1 to New page. page's mapcount + (page_is_cache) refs are added to NEW one. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
Serge E. Hallyn	e885dcde75	cgroup_clone: use pid of newly created task for new cgroup cgroup_clone creates a new cgroup with the pid of the task. This works correctly for unshare, but for clone cgroup_clone is called from copy_namespaces inside copy_process, which happens before the new pid is created. As a result, the new cgroup was created with current's pid. This patch: 1. Moves the call inside copy_process to after the new pid is created 2. Passes the struct pid into ns_cgroup_clone (as it is not yet attached to the task) 3. Passes a name from ns_cgroup_clone() into cgroup_clone() so as to keep cgroup_clone() itself simpler 4. Uses pid_vnr() to get the process id value, so that the pid used to name the new cgroup is always the pid as it would be known to the task which did the cloning or unsharing. I think that is the most intuitive thing to do. This way, task t1 does clone(CLONE_NEWPID) to get t2, which does clone(CLONE_NEWPID) to get t3, then the cgroup for t3 will be named for the pid by which t2 knows t3. (Thanks to Dan Smith for finding the main bug) Changelog: June 11: Incorporate Paul Menage's feedback: don't pass NULL to ns_cgroup_clone from unshare, and reduce patch size by using 'nodename' in cgroup_clone. June 10: Original version [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge Hallyn <serge@us.ibm.com> Acked-by: Paul Menage <menage@google.com> Tested-by: Dan Smith <danms@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
Paul Menage	856c13aa1f	cgroup files: convert res_counter_write() to be a cgroups write_string() handler Currently res_counter_write() is a raw file handler even though it's ultimately taking a number, since in some cases it wants to pre-process the string when converting it to a number. This patch converts res_counter_write() from a raw file handler to a write_string() handler; this allows some of the boilerplate copying/locking/checking to be removed, and simplifies the cleanup path, since these functions are now performed by the cgroups framework. [lizf@cn.fujitsu.com: build fix] Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:36 -07:00
Paul Menage	84eea84288	cgroups: misc cleanups to write_string patchset This patch contains cleanups suggested by reviewers for the recent write_string() patchset: - pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for clarity, rather than directly unlocking cgroup_mutex. - make the return type of cgroup_lock_live_group() a bool - use a #define'd constant for the local buffer size in read/write functions Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	e788e066c6	cgroup files: move the release_agent file to use typed handlers Adds cgroup_release_agent_write() and cgroup_release_agent_show() methods to handle writing/reading the path to a cgroup hierarchy's release agent. As a result, cgroup_common_file_read() is now unnecessary. As part of the change, a previously-tolerated race in cgroup_release_agent() is avoided by copying the current release_agent_path prior to calling call_usermode_helper(). Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	db3b14978a	cgroup files: add write_string cgroup control file method This patch adds a write_string() method for cgroups control files. The semantics are that a buffer is copied from userspace to kernelspace and the handler function invoked on that buffer. The buffer is guaranteed to be nul-terminated, and no longer than max_write_len (defaulting to 64 bytes if unspecified). Later patches will convert existing raw file write handlers in control group subsystems to use this method. Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	ce16b49d37	cgroup files: clean up whitespace in struct cftype This patch removes some extraneous spaces from method declarations in struct cftype, to fit in with conventional kernel style. Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Pavel Emelyanov	f2992db2a4	Mark res_counter_charge(_locked) with __must_check Ignoring their return values may result in counter underflow in the future - when the value charged will be uncharged (or in "leaks" - when the value is not uncharged). This also prevents from using charging routines to decrement the counter value (i.e. uncharge it) ;) (Current code works OK with res_counter, however :) ) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	657d3bfa98	quota: implement sending information via netlink about user below quota Sometimes it may be useful for userspace to know (e.g. for some hosting guys) that some user stopped exceeding his hardlimit or softlimit in quotas. Implement sending of such events to userspace via quota netlink protocol so that they don't have to poll for such events. Based on idea and initial implementation by Vladislav Bogdanov. Cc: Vladislav Bogdanov <slava@nsys.by> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	03b063436c	quota: convert macros to inline functions Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	74abb9890d	quota: move function-macros from quota.h to quotaops.h Move declarations of some macros, which should be in fact functions to quotaops.h. This way they can be later converted to inline functions because we can now use declarations from quota.h. Also add necessary includes of quotaops.h to a few files. [akpm@linux-foundation.org: fix JFS build] [akpm@linux-foundation.org: fix UFS build] [vegard.nossum@gmail.com: fix QUOTA=n build] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Arjen Pool <arjenpool@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	02a55ca871	quota: cleanup loop in sync_dquots() Make loop in sync_dquots() checking whether there's something to write more readable, remove useless variable and macro info_any_dirty() which is used only in this place. Signed-off-by: Jan Kara <jack@suse.cz> Cc: "Vegard Nossum" <vegard.nossum@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	b85f4b87a5	quota: rename quota functions from upper case, make bigger ones non-inline Cleanup quotaops.h: Rename functions from uppercase to lowercase (and define backward compatibility macros), move larger functions to dquot.c and make them non-inline. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Joe Peterson	b271e067c8	fatfs: add UTC timestamp option Provide a new mount option ("tz=UTC") for DOS (vfat/msdos) filesystems, allowing timestamps to be in coordinated universal time (UTC) rather than local time in applications where doing this is advantageous. In particular, portable devices that use fat/vfat (such as digital cameras) can benefit from using UTC in their internal clocks, thus avoiding daylight saving time errors and general time ambiguity issues. The user of the device does not have to worry about changing the time when moving from place to place or when daylight saving changes. The new mount option, when set, disables the counter-adjustment that Linux currently makes to FAT timestamp info in anticipation of the normal userspace time zone correction. When used in this new mode, all daylight saving time and time zone handling is done in userspace as is normal for many other filesystems (like ext3). The default mode, which remains unchanged, is still appropriate when mounting volumes written in Windows (because of its use of local time). I originally based this patch on one submitted last year by Paul Collins, but I updated it to work with current source and changed variable/option naming. Ogawa Hirofumi (who maintains these filesystems) and I discussed this patch at length on lkml, and he suggested using the option name in the attached version of the patch. Barry Bouwsma pointed out a good addition to the patch as well. Signed-off-by: Joe Peterson <joe@skyrush.com> Signed-off-by: Paul Collins <paul@ondioline.org> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Barry Bouwsma <free_beer_for_all@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Adrian Bunk	e8938a62a8	remove unused #include <linux/dirent.h>'s Remove some unused #include <linux/dirent.h>'s. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Adrian Bunk	cf6ae8b50e	remove the in-kernel struct dirent{,64} The kernel struct dirent{,64} were different from the ones in userspace. Even worse, we exported the kernel ones to userspace. But after the fat usages are fixed we can remove the conflicting kernel versions. Reviewed-by: H. Peter Anvin <hpa@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Rene Scharfe	7557bc66be	msdos fs: remove unsettable atari option It has been impossible to set the option 'atari' of the MSDOS filesystem for several years. Since nobody seems to have missed it, let's remove its remains. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
OGAWA Hirofumi	4596c8aaf9	fat: fix VFAT_IOCTL_READDIR_xxx and cleanup for userland "struct dirent" is a kernel type here, but is a different type in userspace! This means both the structure and the IOCTL number is wrong! So, this adds new "struct __fat_dirent" to generate correct IOCTL number. And kernel stuff moves to under __KERNEL__. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Jeff Mahoney	90415deac7	reiserfs: convert j_commit_lock to mutex j_commit_lock is a semaphore but is used as if it were a mutex. This patch converts it to a mutex. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Jeff Mahoney	afe7025907	reiserfs: convert j_flush_sem to mutex j_flush_sem is a semaphore but is used as if it were a mutex. This patch converts it to a mutex. [akpm@linux-foundation.org: fix mutex_trylock retval treatment] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Jeff Mahoney	f68215c464	reiserfs: convert j_lock to mutex j_lock is a semaphore but is used as if it were a mutex. This patch converts it to a mutex. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Adrian Bunk	de0ca06a99	coda: remove CODA_FS_OLD_API While fixing CONFIG_ leakages to the userspace kernel headers I ran into CODA_FS_OLD_API. After five years, are there still people using the old API left? Especially considering that you have to choose at compile time which API to support in the kernel (and distributions tend to offer the new API for some time). Jan: "The old API can definitely go. Around the time the new interface went in there were some non-Coda userspace file system implementations that took a while longer to convert to the new API, but by now they all switched to the new interface or in some cases to a FUSE-based solution." Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Duane Griffin	ae76dd9a6b	ext3: handle corrupted orphan list at mount If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext3_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext3_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: printk warning fix] Signed-off-by: Duane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:32 -07:00
Samuel Thibault	50c33a84db	ext2: fix typo in Hurd part of include/linux/ext2_fs.h Fix typo in Hurd part of include/linux/ext2_fs.h The ';' here is redundant or can even pose problem. This is actually not used by the Linux kernel, but it is exposed in GNU/Hurd. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:31 -07:00
Eric Miao	bbcd6d543d	gpio: max732x driver This adds a driver supporting a family of I2C port expanders from Maxim, which includes the MAX7319 and MAX7320-7327 chips. [dbrownell@users.sourceforge.net: minor fixes] Signed-off-by: Jack Ren <jack.ren@marvell.com> Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
David Brownell	8f1cc3b10e	gpio: mcp23s08 handles multiple chips per chipselect Teach the mcp23s08 driver about a curious feature of these chips: up to four of them can share the same chipselect, with the SPI signals wired in parallel, by matching two bits in the first protocol byte against two address lines on the chip. This is handled by three software changes: * Platform data now holds an array of per-chip structs, not just one chip's address and pullup configuration. * Probe() and remove() now use another level of structure, wrapping an instance of the original structure for each mcp23s08 chip sharing that chipselect. * The HAEN bit is set, so that the hardware address bits can no longer be ignored (boot firmware may not have enabled them). The "one struct per chip" preserves the guts of the current code, but platform_data will need minor changes. OLD: /* incorrect "slave" ID may not have mattered */ .slave = 3, .pullups = BIT(3) | BIT(1) | BIT(0), NEW: /* slave address _must_ match chip's wiring */ .chip[3] = { .is_present = true, .pullups = BIT(3) | BIT(1) | BIT(0), }, There's no change in how things _behave_ for spi_device nodes with a single mcp23s08 chip. New multi-chip configurations assign GPIOs in sequence, without holes. The spi_device just resembles a bigger controller, but internally it has multiple gpio_chip instances. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
David Brownell	d8f388d8dc	gpio: sysfs interface This adds a simple sysfs interface for GPIOs. /sys/class/gpio /export ... asks the kernel to export a GPIO to userspace /unexport ... to return a GPIO to the kernel /gpioN ... for each exported GPIO #N /value ... always readable, writes fail for input GPIOs /direction ... r/w as: in, out (default low); write high, low /gpiochipN ... for each gpiochip; #N is its first GPIO /base ... (r/o) same as N /label ... (r/o) descriptive, not necessarily unique /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1) GPIOs claimed by kernel code may be exported by its owner using a new gpio_export() call, which should be most useful for driver debugging. Such exports may optionally be done without a "direction" attribute. Userspace may ask to take over a GPIO by writing to a sysfs control file, helping to cope with incomplete board support or other "one-off" requirements that don't merit full kernel support: echo 23 > /sys/class/gpio/export ... will gpio_request(23, "sysfs") and gpio_export(23); use /sys/class/gpio/gpio-23/direction to (re)configure it, when that GPIO can be used as both input and output. echo 23 > /sys/class/gpio/unexport ... will gpio_free(23), when it was exported as above The extra D-space footprint is a few hundred bytes, except for the sysfs resources associated with each exported GPIO. The additional I-space footprint is about two thirds of the current size of gpiolib (!). Since no /dev node creation is involved, no "udev" support is needed. Related changes: * This adds a device pointer to "struct gpio_chip". When GPIO providers initialize that, sysfs gpio class devices become children of that device instead of being "virtual" devices. * The (few) gpio_chip providers which have such a device node have been updated. * Some gpio_chip drivers also needed to update their module "owner" field ... for which missing kerneldoc was added. * Some gpio_chips don't support input GPIOs. Those GPIOs are now flagged appropriately when the chip is registered. Based on previous patches, and discussion both on and off LKML. A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this merges to mainline. [akpm@linux-foundation.org: a few maintenance build fixes] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Srinivasa D S	ef53d9c5e4	kprobes: improve kretprobe scalability with hashed locking Currently the lists of kretprobe instances are stored in the kretprobe object (as used_instances, free_instances) and in the kretprobe hash table. We have one global kretprobe lock to serialise the access to these lists. This causes only one kretprobe handler to execute at a time. Hence it affects system performance, particularly on SMP systems and when return probes are set on a lot of functions (like on all system calls). The solution proposed here gives fine-grained locks that perform better on SMP systems compared to the present kretprobe implementation. Solution: 1) Instead of having one global lock to protect kretprobe instances present in the kretprobe object and the kretprobe hash table, we will have two locks, one lock for protecting the kretprobe hash table and another lock for the kretprobe object. 2) We hold the lock present in the kretprobe object while we modify a kretprobe instance in the kretprobe object, and we hold the per-hash-list lock while modifying kretprobe instances present in that hash list. To prevent deadlock, we never grab a per-hash-list lock while holding a kretprobe lock. 3) We can remove used_instances from struct kretprobe, as we can track used instances of kretprobe instances using the kretprobe hash table. Time duration for kernel compilation ("make -j 8") on an 8-way ppc64 system with return probes set on all system calls (cacheline-aligned patch / non-cacheline-aligned patch / un-patched kernel): real 9m46.784s / 9m54.412s / 10m2.450s; user 40m5.715s / 40m7.142s / 40m4.273s; sys 2m57.754s / 2m58.583s / 3m17.430s. Time duration for kernel compilation ("make -j 8") on the same system, when the kernel is not probed: real 9m26.389s; user 40m8.775s; sys 2m7.283s. Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Ben Dooks	42cd2366fb	sm501: gpio I2C support Add support for adding the GPIO based I2C resources. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Arnaud Patard	60e540d617	sm501: gpio dynamic registration for PCI devices The SM501 PCI card requires a dynamic gpio allocation as the number of cards is not known at compile time. Fixup the platform data and registration to deal with this. Acked-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Ben Dooks	f61be273d3	sm501: add gpiolib support Add support for exporting the GPIOs on the SM501 via gpiolib. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Ben Dooks	472dba7d11	sm501: add power control callback Add callback to get or set the power control if the device has the sleep connected to some form of GPIO. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Dave Young	717115e1a5	printk ratelimiting rewrite All ratelimit users use the same jiffies and burst params, so some messages (callbacks) will be lost. For example: if a calls printk_ratelimit(5 * HZ, 1) and b calls printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b will be suppressed. - rewrite __ratelimit, and use a ratelimit_state as parameter. Thanks for hints from andrew. - Add WARN_ON_RATELIMIT, update rcupreempt.h - remove __printk_ratelimit - use __ratelimit in net_ratelimit Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Vegard Nossum	2711b793eb	kallsyms: unify 32- and 64-bit code Use the %p format string which already accounts for the padding you need with a pointer type on a particular architecture. Also replace the macro with a static inline function to match the rest of the file. Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Arjan van de Ven	b6c6393700	Rename WARN() to WARNING() to clear the namespace We want to use WARN() as a variant of WARN_ON(), however a few drivers are using WARN() internally. This patch renames these to WARNING() to avoid the namespace clash. A few cases were defining but not using the thing, for those cases I just deleted the definition. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Greg KH <greg@kroah.com> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Robert P. J. Day	4500d067ee	init.h: remove obsolete content Remove apparently obsolete content from init.h referring to gcc 2.9x and to "no_module_init". Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:28 -07:00
KOSAKI Motohiro	ac331d158e	call_usermodehelper(): increase reliability Presently call_usermodehelper_setup() uses GFP_ATOMIC, but it can return NULL _very_ easily. GFP_ATOMIC is needed only when we can't sleep, and GFP_KERNEL is robust and better. Thus, I add a gfp_mask argument to call_usermodehelper_setup(). So, its callers pass the gfp_t as below: call_usermodehelper() and call_usermodehelper_keys(): depend on 'wait' argument. call_usermodehelper_pipe(): always GFP_KERNEL because always run under process context. orderly_poweroff(): pass GFP_ATOMIC because may run under interrupt context. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: "Paul Menage" <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:28 -07:00
Andrew Morton	cebbd3fb80	build-kernel-profileo-only-when-requested-cleanups Cc: Adrian Bunk <bunk@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Adrian Bunk	b03f6489f9	build kernel/profile.o only when requested Build kernel/profile.o only if CONFIG_PROFILING is enabled. This makes CONFIG_PROFILING=n kernels smaller. As a bonus, some profile_tick() calls and one branch from schedule() are now eliminated with CONFIG_PROFILING=n (but I doubt these are measurable effects). This patch changes the effects of CONFIG_PROFILING=n, but I don't think having more than two choices would be the better choice. This patch also adds the name of the first parameter to the prototypes of profile_{hits,tick}() since I anyway had to add them for the dummy functions. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Robert P. J. Day	e0ce0da9fe	lists: remove a redundant conditional definition of list_add() Remove the conditional surrounding the definition of list_add() from list.h since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently pick up from lib/list_debug.c will be absolutely identical, at which point you can remove that redundant definition from list_debug.c as well. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Robert P. J. Day	b39c08cb69	Remove apparently unused fd1772.h header file. This header file has been unused for quite some time, and the corresponding source files appear to have been removed back in commit `99eb8a550d` ("Remove the arm26 port") Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: Adrian Bunk <bunk@stusta.de> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Harvey Harrison	8b5ac31e27	include: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Steven Rostedt	3f307891ce	locking: add typecheck on irqsave and friends for correct flags There have been several areas in the kernel where an int has been used for flags in local_irq_save() and friends instead of a long. This can cause some hard to debug problems on some architectures. This patch adds a typecheck inside the irqsave and restore functions to flag these cases. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Andrew Morton	e0deaff470	split the typecheck macros out of include/linux/kernel.h Needed to fix up a recursive include snafu in locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Yevgeny Petrilin	25c94d010a	mlx4_core: Add VLAN tag field to WQE control segment struct Add fields for VLAN tag and insert VLAN tag flag to the control section struct. These fields will be used for sending ethernet packets. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-25 10:30:06 -07:00
David Miller	3d6f4a20cc	endian: Always evaluate arguments. Changeset `7fa897b91a` ("ide: trivial sparse annotations") created an IDE bootup regression on big-endian systems. In drivers/ide/ide-iops.c, function ide_fixstring() we now have the loop: for (p = end ; p != s;) be16_to_cpus((u16 *)(p -= 2)); which will never terminate on big-endian because in such a configuration be16_to_cpus() evaluates to "do { } while (0)" Therefore, always evaluate the arguments to nop endian transformation operations. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 09:28:09 -07:00
Alexey Korolev	3d45955962	[MTD] [NAND] subpage read feature as a way to increase performance. This patch enables NAND subpage read functionality. If upper layer drivers request a read of non-page-aligned data, the NAND subpage-read functionality reads only those ECC regions which include the requested data, whereas the original code reads the whole page. This significantly improves performance in many cases. Here are some figures: UBI volume mount time No subpage reads: 5.75 seconds Subpage read patch: 2.42 seconds Open/stat time for files on JFFS2 volume: No subpage read 0m 5.36s Subpage read 0m 2.88s Signed-off-by: Alexey Korolev <akorolev@infradead.org> Acked-by: Artem Bityutskiy <dedekind@infradead.org> Acked-by: Jörn Engel <joern@logfs.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-25 10:49:50 -04:00
David Woodhouse	ff877ea80e	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6	2008-07-25 10:40:14 -04:00
Stefan Richter	95984f62c9	firewire: fw-ohci: TSB43AB22/A dualbuffer workaround Isochronous reception in dualbuffer mode is reportedly broken with TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been determined as the trigger: https://bugzilla.redhat.com/show_bug.cgi?id=435550 Two fixes are possible: - pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK); at least when IR descriptors are allocated, or - simply don't use dualbuffer. This fix implements the latter workaround. But we keep using dualbuffer on x86-32 which won't give us highmem (and thus physical addresses outside the 31bit range) in coherent DMA memory allocations. Right now we could for example also whitelist PPC32, but DMA mapping implementation details are expected to change there. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-07-25 15:41:23 +02:00
Ingo Molnar	10a010f695	Merge branch 'linus' into x86/x2apic Conflicts: drivers/pci/dmar.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-25 13:08:16 +02:00
Nathan Lynch	483fad1c3f	ELF loader support for auxvec base platform string Some IBM POWER-based platforms have the ability to run in a mode which mostly appears to the OS as a different processor from the actual hardware. For example, a Power6 system may appear to be a Power5+, which makes the AT_PLATFORM value "power5+". This means that programs are restricted to the ISA supported by Power5+; Power6-specific instructions are treated as illegal. However, some applications (virtual machines, optimized libraries) can benefit from knowledge of the underlying CPU model. A new aux vector entry, AT_BASE_PLATFORM, will denote the actual hardware. For example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is that AT_PLATFORM indicates the instruction set supported, while AT_BASE_PLATFORM indicates the underlying microarchitecture. If the architecture has defined ELF_BASE_PLATFORM, copy that value to the user stack in the same manner as ELF_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Linus Torvalds	832fe9c222	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: Add transport feature handling stub for virtio_ring. virtio: Rename set_features to finalize_features virtio: Formally reserve bits 28-31 to be 'transport' features. s390: use virtio_console for KVM on s390 virtio: console as a config option virtio_console: use virtqueue notification for hvc_console hvc_console: rework setup to replace irq functions with callbacks virtio_blk: check for hardsector size from host virtio: Use bus_type probe and remove methods virtio: don't always force a notification when ring is full virtio: clarify that ABI is usable by any implementations virtio: Recycle unused recv buffer pages for large skbs in net driver virtio net: Allow receiving SG packets virtio net: Add ethtool ops for SG/GSO virtio: fix virtio_net xmit of freed skb bug	2008-07-24 19:11:49 -07:00
Rusty Russell	ed9559d38a	Label kthread_create() with printf attribute tag. Obvious misc patch been in my queue (& linux-next) for over a cycle. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 19:11:15 -07:00
Rusty Russell	e34f872567	virtio: Add transport feature handling stub for virtio_ring. To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:14 +10:00
Rusty Russell	c624896e48	virtio: Rename set_features to finalize_features Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This makes it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:12 +10:00
Rusty Russell	dd7c7bc462	virtio: Formally reserve bits 28-31 to be 'transport' features. We assign feature bits as required, but it makes sense to reserve some for the particular transport, rather than the particular device. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:07 +10:00
Christian Borntraeger	066f4d82a6	virtio_blk: check for hardsector size from host Currently virtio_blk assumes a 512 byte hard sector size. This can cause trouble / performance issues if the backing has a different block size (like a file on an ext3 file system formatted with 4k block size or a dasd). Let's add a feature flag that tells the guest to use a different hard sector size than 512 bytes. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:05 +10:00
Rusty Russell	674bfc23c5	virtio: clarify that ABI is usable by any implementations We want others to implement and use virtio, so it makes sense to BSD license the non-__KERNEL__ parts of the headers to make this crystal clear. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Ryan Harper <ryanh@us.ibm.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com> Acked-by: Anthony Liguori <aliguori@us.ibm.com>	2008-07-25 12:06:04 +10:00
Linus Torvalds	b5684b83b1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) ide: use proper printk() KERN_* levels in ide-probe.c ide: fix for EATA SCSI HBA in ATA emulating mode ide: remove stale comments from drivers/ide/Makefile ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase ide-scsi: remove kmalloced struct request ht6560b: remove old history ht6560b: update email address ide-cd: fix oops when using growisofs gayle: release resources on ide_host_add() failure palm_bk3710: add UltraDMA/100 support ide: trivial sparse annotations ide: ide-tape.c sparse annotations and unaligned access removal ide: drop 'name' parameter from ->init_chipset method ide: prefix messages from IDE PCI host drivers by driver name it821x: remove DECLARE_ITE_DEV() macro it8213: remove DECLARE_ITE_DEV() macro ide: include PCI device name in messages from IDE PCI host drivers ide: remove <asm/ide.h> for some archs ide-generic: remove ide_default_{io_base,irq}() inlines (take 3) ide-generic: is no longer needed on ppc32 ...	2008-07-24 14:55:09 -07:00
Linus Torvalds	5042d99795	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fixup sparse endianness warnings in proc.c PCI PM: make more PCI PM core functionality available to drivers PCI/DMAR: don't assume presence of RMRRs PCI hotplug: fix error path in pci_slot's register_slot	2008-07-24 13:57:13 -07:00
Bartlomiej Zolnierkiewicz	a326b02b0c	ide: drop 'name' parameter from ->init_chipset method There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:33 +02:00
Bartlomiej Zolnierkiewicz	2a8f7450f8	ide: remove <asm/ide.h> for some archs * Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes <linux/interrupt.h> which is enough). * Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz	d83b8b85cd	ide: define MAX_HWIFS in <linux/ide.h> * Now that ide_hwif_t instances are allocated dynamically the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10 is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs except these ones that use MAX_HWIFS == 1. * Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>. [ Please note that avr32/cris/v850 have no <asm/ide.h> and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:30 +02:00
Bartlomiej Zolnierkiewicz	ef0b04276d	ide: add ide_pci_remove() helper * Add 'unsigned long host_flags' field to struct ide_host. * Set ->host_flags in ide_host_alloc_all(). * Always set PCI dev's ->driver_data in ide_pci_init_{one,two}(). * Add ide_pci_remove() helper (the default implementation for struct pci_driver's ->remove method). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:19 +02:00
Bartlomiej Zolnierkiewicz	08da591e14	ide: add ide_device_{get,put}() helpers * Add 'struct ide_host *host' field to ide_hwif_t and set it in ide_host_alloc_all(). Add ide_device_{get,put}() helpers loosely based on SCSI's scsi_device_{get,put}() ones. * Convert IDE device drivers to use ide_device_{get,put}(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:15 +02:00
Bartlomiej Zolnierkiewicz	6cdf6eb357	ide: add ->dev and ->host_priv fields to struct ide_host * Add 'struct device *dev[2]' and 'void *host_priv' fields to struct ide_host. * Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s](). * Pass 'void *priv' argument to ide_setup_pci_device[s]() and use it to set ->host_priv. Set PCI dev's ->driver_data to point to the struct ide_host instance if PCI host driver wants to use ->host_priv. * Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:14 +02:00
Linus Torvalds	5c402355ad	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event	2008-07-24 12:56:07 -07:00
Linus Torvalds	ecc8b655b3	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well nohz: prevent tick stop outside of the idle loop	2008-07-24 12:55:01 -07:00
Linus Torvalds	7540081c6b	Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc * 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: Remove __DECLARE_SEMAPHORE_GENERIC Remove asm/semaphore.h Remove use of asm/semaphore.h Add missing semaphore.h includes Remove mention of semaphores from kernel-locking	2008-07-24 12:24:40 -07:00
Linus Torvalds	c54554d388	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Ensure led->trigger is set earlier leds: Add support for Philips PCA955x I2C LED drivers leds: Fix sparse warnings in leds-h1940 driver leds: mark led_classdev.default_trigger as const leds: fix unsigned value overflow in atmel pwm driver leds: Add pca9532 platform data for Thecus N2100 leds: Add pca9532 led driver	2008-07-24 12:16:02 -07:00
Steven Whitehouse	f9247273cb	UFS: add const to parser token table This patch adds a "const" to the parser token table. I've done an allmodconfig build to see if this produces any warnings/failures and the patch includes a fix for the only warning that was produced. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Alexander Viro <aviro@redhat.com> Acked-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 11:50:15 -07:00
Philippe De Muyter	5bb49fcd50	video/fb: cleanup FB_MAJOR usage Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses, and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c needs. Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from fb.h to major.h, and fix fbmem.c to use major.h's definition. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:41 -07:00
Hans-Christian Egtvedt	3e074058d7	fbdev: LCD backlight driver using Atmel PWM driver This patch adds a platform driver using the ATMEL PWM driver to control a backlight which requires a PWM signal and optional GPIO signal for discrete on/off signal. It has been tested on Favr-32 board from EarthLCD. The driver is configurable by supplying a struct with the platform data. See the include/linux/atmel-pwm-bl.h for details. The board code for Favr-32 will be submitted to the AVR32 kernel list. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:41 -07:00
Ben Dooks	0c531360ed	lcd: add lcd_device to check_fb() entry in lcd_ops Add the lcd_device being checked to the check_fb entry of lcd_ops. This ensures that any driver using this to check against its own state can do so, and also makes all the calls in lcd_ops more orthogonal in their arguments. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:40 -07:00
Ben Dooks	206c5d69d0	sm501: add inversion controls for VBIASEN and FPEN Add flags to allow the driver to invert the sense of both VBIASEN and FPEN signals coming from the SM501. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:40 -07:00
Krzysztof Helt	01a2d9ed85	tridentfb: acceleration constants change This patch replaces deprecated constant FB_ACCELF_TEXT with FBINFO_HWACCEL_DISABLED and adds constants for Trident families of accelerators. The FBINFO_HWACCEL_DISABLED is correctly used so noaccel parameter works now. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:36 -07:00
David Brownell	d3de851a44	rtc: BCD codeshrink This updates <linux/bcd.h> to define the key routines as constant functions, which the macros will then call. Newer code can now call bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS. This lets each driver shrink their codespace by using N function calls to a single (global) copy of those routines, instead of N inlined copies of these functions per driver. These routines aren't used in speed-critical code. Almost all callers are in the RTC framework. Typical per-driver savings is near 300 bytes. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Adrian Bunk <bunk@kernel.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
David Brownell	53e84b672c	rtc: ds1305/ds1306 driver Support the Dallas/Maxim DS1305 and DS1306 RTC chips. These use SPI, and support alarms, NVRAM, and a trickle charger for use when their backup power supply is a supercap or rechargeable cell. This basic driver doesn't yet support suspend/resume or wakealarms. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
David Brownell	5ad31a5751	rtc: remove BKL for ioctl() Remove implicit use of BKL in ioctl() from the RTC framework. Instead, the rtc->ops_lock is used. That's the same lock that already protects the RTC operations when they're issued through the exported rtc_*() calls in drivers/rtc/interface.c ... making this a bugfix, not just a cleanup, since both ioctl calls and set_alarm() need to update IRQ enable flags and that implies a common lock (which RTC drivers as a rule do not provide on their own). A new comment at the declaration of "struct rtc_class_ops" summarizes current locking rules. It's not clear to me that the exceptions listed there should exist ... if not, those are pre-existing problems which can be fixed in a patch that doesn't relate to BKL removal. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
Ian Kent	aa55ddf340	autofs4: remove unused ioctls The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added several years ago but what they were intended for has never been implemented (as far as I'm aware no one uses them) so remove them. Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
Grant Likely	102eb97564	spi: make spi_board_info.modalias a char array Currently, 'modalias' in the spi_device structure is a 'const char *'. The spi_new_device() function fills in the modalias value from a passed in spi_board_info data block. Since it is a pointer copy, the new spi_device remains dependent on the spi_board_info structure after the new spi_device is registered (no other fields in spi_device directly depend on the spi_board_info structure; all of the other data is copied). This causes a problem when dynamically populating the list of attached SPI devices. For example, in arch/powerpc, the list of SPI devices can be populated from data in the device tree. With the current code, the device tree adapter must kmalloc() a new spi_board_info structure for each new SPI device it finds in the device tree, and there is no simple mechanism in place for keeping track of these allocations. This patch changes modalias from a 'const char *' to a fixed char array. By copying the modalias string instead of referencing it, the dependency on the spi_board_info structure is eliminated and an outside caller does not need to maintain a separate spi_board_info allocation for each device. I've searched through the code to the best of my ability for any references to modalias which may be affected by this change and haven't found anything. It has been tested with the lite5200b platform in arch/powerpc. [dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc] Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:30 -07:00
Ulrich Drepper	9fe5ad9c8c	flag parameters add-on: remove epoll_create size param Remove the size parameter from the new epoll_create syscall and rename the syscall itself. The updated test program follows. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	510df2dd48	flag parameters: NONBLOCK in inotify_init This patch adds non-blocking support for inotify_init1. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("inotify_init1(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_NONBLOCK); if (fd == -1) { puts ("inotify_init1(IN_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("inotify_init1(IN_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	be61a86d72	flag parameters: NONBLOCK in pipe This patch adds O_NONBLOCK support to pipe2. It is minimally more involved than the patches for eventfd et al. but still trivial. The interfaces of the create_write_pipe and create_read_pipe helper functions were changed, as was their one other caller. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fds[2]; if (syscall (__NR_pipe2, fds, 0) == -1) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1) { puts ("pipe2(O_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	6b1ef0e60d	flag parameters: NONBLOCK in timerfd_create This patch adds support for the TFD_NONBLOCK flag to timerfd_create. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_timerfd_create # ifdef __x86_64__ # define __NR_timerfd_create 283 # elif defined __i386__ # define __NR_timerfd_create 322 # else # error "need __NR_timerfd_create" # endif #endif #define TFD_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0); if (fd == -1) { puts ("timerfd_create(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("timerfd_create(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_NONBLOCK); if (fd == -1) { puts ("timerfd_create(TFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("timerfd_create(TFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	e7d476dfdf	flag parameters: NONBLOCK in eventfd This patch adds support for the EFD_NONBLOCK flag to eventfd2. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("eventfd2(0) sets non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_NONBLOCK); if (fd == -1) { puts ("eventfd2(EFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("eventfd2(EFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	5fb5e04926	flag parameters: NONBLOCK in signalfd This patch adds support for the SFD_NONBLOCK flag to signalfd4. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_NONBLOCK O_NONBLOCK int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("signalfd4(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_NONBLOCK); if (fd == -1) { puts ("signalfd4(SFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("signalfd4(SFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	77d2720059	flag parameters: NONBLOCK in socket and socketpair This patch introduces support for the SOCK_NONBLOCK flag in socket, socketpair, and paccept. To do this the internal function sock_attach_fd gets an additional parameter which it uses to set the appropriate flag for the file descriptor. Given that in modern, scalable programs almost all socket connections are non-blocking and the minimal additional cost for the new functionality I see no reason not to add this code. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <pthread.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #include <sys/syscall.h> #ifndef __NR_paccept # ifdef __x86_64__ # define __NR_paccept 288 # elif defined __i386__ # define SYS_PACCEPT 18 # define USE_SOCKETCALL 1 # else # error "need __NR_paccept" # endif #endif #ifdef USE_SOCKETCALL # define paccept(fd, addr, addrlen, mask, flags) \ ({ long args[6] = { \ (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \ syscall (__NR_socketcall, SYS_PACCEPT, args); }) #else # define paccept(fd, addr, addrlen, mask, flags) \ syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags) #endif #define PORT 57392 #define SOCK_NONBLOCK O_NONBLOCK static pthread_barrier_t b; static void * tf (void *arg) { pthread_barrier_wait (&b); int s = socket (AF_INET, SOCK_STREAM, 0); struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); return NULL; } int main (void) { int 
fd; fd = socket (PF_INET, SOCK_STREAM, 0); if (fd == -1) { puts ("socket(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("socket(0) set non-blocking mode"); return 1; } close (fd); fd = socket (PF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0); if (fd == -1) { puts ("socket(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("socket(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); int fds[2]; if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1) { puts ("socketpair(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("socketpair(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_NONBLOCK, 0, fds) == -1) { puts ("socketpair(SOCK_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("socketpair(SOCK_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } pthread_barrier_init (&b, NULL, 2); struct sockaddr_in sin; pthread_t th; if (pthread_create (&th, NULL, tf, NULL) != 0) { puts ("pthread_create failed"); return 1; } int s = socket (AF_INET, SOCK_STREAM, 0); int reuse = 1; setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); int s2 = paccept (s, NULL, 0, NULL, 0); if (s2 < 0) { puts ("paccept(0) failed"); return 1; } fl = fcntl (s2, F_GETFL); if (fl & O_NONBLOCK) { puts ("paccept(0) set non-blocking mode"); return 1; } close 
(s2); close (s); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); s2 = paccept (s, NULL, 0, NULL, SOCK_NONBLOCK); if (s2 < 0) { puts ("paccept(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (s2, F_GETFL); if ((fl & O_NONBLOCK) == 0) { puts ("paccept(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (s2); close (s); pthread_barrier_wait (&b); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	4006553b06	flag parameters: inotify_init This patch introduces the new syscall inotify_init1 (note: the 1 stands for the one parameter the syscall takes, as opposed to no parameter before). The values accepted for this parameter are function-specific and defined in the inotify.h header. Here the values must match the O_* flags, though. In this patch CLOEXEC support is introduced. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("inotify_init1(0) set close-on-exec"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_CLOEXEC); if (fd == -1) { puts ("inotify_init1(IN_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("inotify_init1(IN_CLOEXEC) does not set close-on-exec"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	ed8cae8ba0	flag parameters: pipe This patch introduces the new syscall pipe2, which is like pipe but takes an additional parameter holding a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exec for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exec for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 
0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	336dd1f70f	flag parameters: dup2 This patch adds the new dup3 syscall. It extends the old dup2 syscall by one parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag is added in this patch. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_dup3 # ifdef __x86_64__ # define __NR_dup3 292 # elif defined __i386__ # define __NR_dup3 330 # else # error "need __NR_dup3" # endif #endif int main (void) { int fd = syscall (__NR_dup3, 1, 4, 0); if (fd == -1) { puts ("dup3(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("dup3(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC); if (fd == -1) { puts ("dup3(O_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("dup3(O_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	a0998b50c3	flag parameters: epoll_create This patch adds the new epoll_create2 syscall. It extends the old epoll_create syscall by one parameter which is meant to hold a flag value. In this patch the only flag supported is EPOLL_CLOEXEC, which causes the close-on-exec flag for the returned file descriptor to be set. A new name EPOLL_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 1, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 
10:47:28 -07:00
Ulrich Drepper	11fcb6c146	flag parameters: timerfd_create The timerfd_create syscall already has a flags parameter, but it has been unused so far. This patch changes this by introducing the TFD_CLOEXEC flag to set the close-on-exec flag for the returned file descriptor. A new name TFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_timerfd_create # ifdef __x86_64__ # define __NR_timerfd_create 283 # elif defined __i386__ # define __NR_timerfd_create 322 # else # error "need __NR_timerfd_create" # endif #endif #define TFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0); if (fd == -1) { puts ("timerfd_create(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("timerfd_create(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC); if (fd == -1) { puts ("timerfd_create(TFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("timerfd_create(TFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	b087498eb5	flag parameters: eventfd This patch adds the new eventfd2 syscall. It extends the old eventfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is EFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name EFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("eventfd2(0) sets close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC); if (fd == -1) { puts ("eventfd2(EFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	9deb27baed	flag parameters: signalfd This patch adds the new signalfd4 syscall. It extends the old signalfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is SFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name SFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_CLOEXEC O_CLOEXEC int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("signalfd4(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC); if (fd == -1) { puts ("signalfd4(SFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> 
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	7d9dbca342	flag parameters: anon_inode_getfd extension This patch just extends the anon_inode_getfd interface to take an additional parameter with a flag value. The flag value is passed on to get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag. No actual semantic changes here, the changed callers all pass 0 for now. [akpm@linux-foundation.org: KVM fix] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	c019bbc612	flag parameters: paccept w/out set_restore_sigmask Some platforms do not have support to restore the signal mask in the return path from a syscall. For those platforms syscalls like pselect are not defined at all. This is, I think, not a good choice for paccept() since paccept() adds more value on top of accept() than just the signal mask handling. Therefore this patch defines a scaled down version of the sys_paccept function for those platforms. It returns -EINVAL in case the signal mask is non-NULL but behaves the same otherwise. Note that I explicitly included <linux/thread_info.h>. I saw that it is currently included but indirectly two levels down. There is too much risk in relying on this. The header might change and then suddenly the function definition would change without anyone immediately noticing. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	aaca0bdca5	flag parameters: paccept This patch is by far the most complex in the series. It adds a new syscall paccept. This syscall differs from accept in that it adds (at the userlevel) two additional parameters: - a signal mask - a flags value The flags parameter can be used to set flags like SOCK_CLOEXEC. This is implemented here as well. Some people argued that this is a property which should be inherited from the file descriptor for the server but this is against POSIX. Additionally, we really want the signal mask parameter as well (similar to pselect, ppoll, etc). So an interface change is inevitable. The flag value is the same as for socket and socketpair. I think diverging here will only create confusion. Similar to the filesystem interfaces where the use of the O_* constants differs, it is acceptable here. The signal mask is handled as for pselect etc. The mask is temporarily installed for the thread and removed before the call returns. I modeled the code after pselect. If there is a problem it's likely also in pselect. For architectures which use socketcall I maintained this interface instead of adding a system call. The symmetry shouldn't be broken. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <errno.h> #include <fcntl.h> #include <pthread.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #include <sys/syscall.h> #ifndef __NR_paccept # ifdef __x86_64__ # define __NR_paccept 288 # elif defined __i386__ # define SYS_PACCEPT 18 # define USE_SOCKETCALL 1 # else # error "need __NR_paccept" # endif #endif #ifdef USE_SOCKETCALL # define paccept(fd, addr, addrlen, mask, flags) \ ({ long args[6] = { \ (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \ syscall (__NR_socketcall, SYS_PACCEPT, args); }) #else # define paccept(fd, addr, addrlen, mask, flags) \ syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags) #endif #define PORT 57392 #define SOCK_CLOEXEC O_CLOEXEC static pthread_barrier_t b; static void * tf (void *arg) { pthread_barrier_wait (&b); int s = socket (AF_INET, SOCK_STREAM, 0); struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); pthread_barrier_wait (&b); sleep (2); pthread_kill ((pthread_t) arg, SIGUSR1); return NULL; } static void handler (int s) { } int main (void) { pthread_barrier_init (&b, NULL, 2); struct sockaddr_in sin; pthread_t th; if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0) { puts ("pthread_create failed"); return 1; } int s = socket (AF_INET, SOCK_STREAM, 0); int reuse = 1; setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); 
pthread_barrier_wait (&b); int s2 = paccept (s, NULL, 0, NULL, 0); if (s2 < 0) { puts ("paccept(0) failed"); return 1; } int coe = fcntl (s2, F_GETFD); if (coe & FD_CLOEXEC) { puts ("paccept(0) set close-on-exec flag"); return 1; } close (s2); pthread_barrier_wait (&b); s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC); if (s2 < 0) { puts ("paccept(SOCK_CLOEXEC) failed"); return 1; } coe = fcntl (s2, F_GETFD); if ((coe & FD_CLOEXEC) == 0) { puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag"); return 1; } close (s2); pthread_barrier_wait (&b); struct sigaction sa; sa.sa_handler = handler; sa.sa_flags = 0; sigemptyset (&sa.sa_mask); sigaction (SIGUSR1, &sa, NULL); sigset_t ss; pthread_sigmask (SIG_SETMASK, NULL, &ss); sigaddset (&ss, SIGUSR1); pthread_sigmask (SIG_SETMASK, &ss, NULL); sigdelset (&ss, SIGUSR1); alarm (4); pthread_barrier_wait (&b); errno = 0; s2 = paccept (s, NULL, 0, &ss, 0); if (s2 != -1 || errno != EINTR) { puts ("paccept did not fail with EINTR"); return 1; } close (s); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: make it compile] [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Roland McGrath <roland@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	a677a039be	flag parameters: socket and socketpair This patch adds support for flag values which are ORed to the type passed to socket and socketpair. The additional code is minimal. The flag values in this implementation can and must match the O_* flags. This avoids overhead in the conversion. The internal functions sock_alloc_fd and sock_map_fd get a new parameter and all callers are changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #define PORT 57392 /* For Linux these must be the same. */ #define SOCK_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = socket (PF_INET, SOCK_STREAM, 0); if (fd == -1) { puts ("socket(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("socket(0) set close-on-exec flag"); return 1; } close (fd); fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0); if (fd == -1) { puts ("socket(SOCK_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); int fds[2]; if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1) { puts ("socketpair(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { coe = fcntl (fds[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i); return 1; } close (fds[i]); } if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1) { puts ("socketpair(SOCK_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { coe = fcntl (fds[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i); return 1; } 
close (fds[i]); } puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Adrian Bunk	f606ddf42f	remove the v850 port Trying to compile the v850 port produces many compile errors, one of which has existed since at least kernel 2.6.19. There also seems to be no one willing to bring this port back into a usable state. This patch therefore removes the v850 port. If anyone ever decides to revive the v850 port, the code will still be available from older kernels, and it wouldn't be impossible for the port to reenter the kernel if it became actively maintained again. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:24 -07:00
Shaohua Li	bdfe6b7c68	pm: acpi hibernation: utilize hardware signature ACPI defines a hardware signature. The BIOS calculates the signature according to the hardware configuration, and if the hardware changes while the system is hibernated, the signature will change. In that case, S4 resume should fail. Still, there may be systems on which this mechanism does not work correctly, so it is better to provide a workaround for them. For this reason, add a new switch to the acpi_sleep= command line argument allowing one to disable hardware signature checking. [shaohua.li@intel.com: build fix] Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: <Valdis.Kletnieks@vt.edu> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:24 -07:00
Zhang Rui	c1a220e7ac	pm: introduce new interfaces schedule_work_on() and queue_work_on() This interface allows adding a job on a specific CPU. Although a work struct on a CPU will be scheduled to another CPU if the CPU dies, there is a recursion if a work task tries to offline the CPU it's running on. We need to schedule the task to a specific CPU in this case. http://bugzilla.kernel.org/show_bug.cgi?id=10897 [oleg@tv-sign.ru: cleanups] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Rus <harbour@sfinx.od.ua> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Alan Stern	8111d1b552	pm: add new PM_EVENT codes for runtime power transitions This patch (as1112) adds some new PM_EVENT_* codes for use by kernel subsystems. They describe runtime power-state transitions of the sort already implemented by the USB subsystem. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Rafael J. Wysocki	8c363265d5	pm: drop unnecessary includes from pm.h Drop unnecessary includes from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Rafael J. Wysocki	e7ecb331e1	pm: remove remaining obsolete definitions from pm.h Remove the remaining obsolete definitions from include/linux/pm.h and move the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which is the only user of them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Rafael J. Wysocki	558481f038	pm: remove definition of struct pm_dev Remove the definition of 'struct pm_dev', which is not used any more, along with some related stuff from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Adrian Bunk	d75f65fd24	remove include/linux/pm_legacy.h Remove the obsolete and no longer used include/linux/pm_legacy.h Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Pavel Machek <pavel@suse.cz> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Hugh Dickins	9b3e43a747	security: remove unused forwards Why would linux/security.h need forward declarations for nfsctl_arg and swap_info_struct? It's hard to imagine: remove them. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Andrew G. Morgan	5459c164f0	security: protect legacy applications from executing with insufficient privilege When cap_bset suppresses some of the forced (fP) capabilities of a file, it is generally only safe to execute the program if it understands how to recognize it doesn't have enough privilege to work correctly. For legacy applications (fE!=0), which have no non-destructive way to determine that they are missing privilege, we fail to execute (EPERM) any executable that requires fP capabilities, but would otherwise get pP' < fP. This is a fail-safe permission check. For some discussion of why it is problematic for (legacy) privileged applications to run with less than the set of capabilities requested for them, see: http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html With this iteration of this support, we do not include setuid-0 based privilege protection from the bounding set. That is, the admin can still (ab)use the bounding set to suppress the privileges of a setuid-0 program. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: cleanup] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Gerald Schaefer	83d1674a94	mm: make CONFIG_MIGRATION available w/o CONFIG_NUMA We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA support. This patch makes CONFIG_MIGRATION selectable for architectures that define ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the kernel won't compile because migrate_vmas() does not know about vm_ops->migrate() and vma_migratable() does not know about policy_zone. To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA' because they are not being used w/o NUMA. vma_migratable() is moved over from migrate.h to mempolicy.h. [kosaki.motohiro@jp.fujitsu.com: build fix] Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Milton Miller	9ca908f47b	kcalloc: remove runtime division While in all cases in the kernel we know the size of the elements to be created, we don't always know the count of elements. By commuting the size and count in the overflow check, the compiler can reduce the runtime division of size_t with a compare to a (unique) constant in these cases. Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Badari Pulavarty	5c755e9fd8	memory-hotplug: add sysfs removable attribute for hotplug memory remove Memory may be hot-removed on a per-memory-block basis, particularly on POWER where the SPARSEMEM section size often matches the memory-block size. A user-level agent must be able to identify which sections of memory are likely to be removable before attempting the potentially expensive operation. This patch adds a file called "removable" to the memory directory in sysfs to help such an agent. In this patch, a memory block is considered removable if; o It contains only MOVABLE pageblocks o It contains only pageblocks with free pages regardless of pageblock type On the other hand, a memory block starting with a PageReserved() page will never be considered removable. Without this patch, the user-agent is forced to choose a memory block to remove randomly. Sample output of the sysfs files: ./memory/memory0/removable: 0 ./memory/memory1/removable: 0 ./memory/memory2/removable: 0 ./memory/memory3/removable: 0 ./memory/memory4/removable: 0 ./memory/memory5/removable: 0 ./memory/memory6/removable: 0 ./memory/memory7/removable: 1 ./memory/memory8/removable: 0 ./memory/memory9/removable: 0 ./memory/memory10/removable: 0 ./memory/memory11/removable: 0 ./memory/memory12/removable: 0 ./memory/memory13/removable: 0 ./memory/memory14/removable: 0 ./memory/memory15/removable: 0 ./memory/memory16/removable: 0 ./memory/memory17/removable: 1 ./memory/memory18/removable: 1 ./memory/memory19/removable: 1 ./memory/memory20/removable: 1 ./memory/memory21/removable: 1 ./memory/memory22/removable: 1 Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Yasunori Goto	af370fb8cb	memory hotplug: small fixes to bootmem freeing for memory hotremove - Change some naming * Magic -> types * MIX_INFO -> MIX_SECTION_INFO * Change definition of bootmem type from direct hex value - __free_pages_bootmem() becomes __meminit. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Andrea Righi	27ac792ca0	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Timur Tabi	2be0ffe2b2	mm: add alloc_pages_exact() and free_pages_exact() alloc_pages_exact() is similar to alloc_pages(), except that it allocates the minimum number of pages to fulfill the request. This is useful if you want to allocate a very large buffer that is slightly larger than an even power-of-two number of pages. In that case, alloc_pages() will waste a lot of memory. I have a video driver that wants to allocate a 5MB buffer. alloc_pages() will waste 3MB of physically-contiguous memory. Signed-off-by: Timur Tabi <timur@freescale.com> Cc: Andi Kleen <andi@firstfloor.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	3560e249ab	bootmem: replace node_boot_start in struct bootmem_data Almost all users of this field need a PFN instead of a physical address, so replace node_boot_start with node_min_pfn. [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()] Signed-off-by: Johannes Weiner <hannes@saeureba.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	5f2809e69c	bootmem: clean up alloc_bootmem_core alloc_bootmem_core has become quite nasty to read over time. This is a clean rewrite that keeps the semantics. bdata->last_pos has been dropped. bdata->last_success has been renamed to hint_idx and it is now an index relative to the node's range. Since further block searching might start at this index, it is now set to the end of a succeeded allocation rather than its beginning. bdata->last_offset has been renamed to last_end_off to be more clear that it represents the ending address of the last allocation relative to the node. [y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	223e8dc924	bootmem: reorder code to match new bootmem structure This only reorders functions so that further patches will be easier to read. No code changed. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Jon Tollefson	53ba51d21d	hugetlb: allow arch overridden hugepage allocation Allow alloc_bootmem_huge_page() to be overridden by architectures that can't always use bootmem. This requires huge_boot_pages to be available for use by this function. This is required for powerpc 16G pages, which have to be reserved prior to boot-time. The location of these pages are indicated in the device tree. Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Andi Kleen	ceb8687961	hugetlb: introduce pud_huge Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:18 -07:00
Andi Kleen	b54bbf7b81	mm: introduce non panic alloc_bootmem Straight forward variant of the existing __alloc_bootmem_node, only subsequent patch when allocating giant hugepages at boot -- don't want to panic if we can't allocate as many as the user asked for. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Nishanth Aravamudan	a343787016	hugetlb: new sysfs interface Provide new hugepages user APIs that are more suited to multiple hstates in sysfs. There is a new directory, /sys/kernel/hugepages. Underneath that directory there will be a directory per-supported hugepage size, e.g.: /sys/kernel/hugepages/hugepages-64kB /sys/kernel/hugepages/hugepages-16384kB /sys/kernel/hugepages/hugepages-16777216kB corresponding to 64k, 16m and 16g respectively. Within each hugepages-size directory there are a number of files, corresponding to the tracked counters in the hstate, e.g.: /sys/kernel/hugepages/hugepages-64/nr_hugepages /sys/kernel/hugepages/hugepages-64/nr_overcommit_hugepages /sys/kernel/hugepages/hugepages-64/free_hugepages /sys/kernel/hugepages/hugepages-64/resv_hugepages /sys/kernel/hugepages/hugepages-64/surplus_hugepages Of these files, the first two are read-write and the latter three are read-only. The size of the hugepage being manipulated is trivially deducible from the enclosing directory and is always expressed in kB (to match meminfo). [dave@linux.vnet.ibm.com: fix build] [nacc@us.ibm.com: hugetlb: hang off of /sys/kernel/mm rather than /sys/kernel] [nacc@us.ibm.com: hugetlb: remove CONFIG_SYSFS dependency] Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	a137e1cc6d	hugetlbfs: per mount huge page sizes Add the ability to configure the hugetlb hstate used on a per mount basis. - Add a new pagesize= option to the hugetlbfs mount that allows setting the page size - This option causes the mount code to find the hstate corresponding to the specified size, and sets up a pointer to the hstate in the mount's superblock. - Change the hstate accessors to use this information rather than the global_hstate they were using (requires a slight change in mm/memory.c so we don't NULL deref in the error-unmap path -- see comments). [np: take hstate out of hugetlbfs inode and vma->vm_private_data] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	e5ff215941	hugetlb: multiple hstates for multiple page sizes Add basic support for more than one hstate in hugetlbfs. This is the key to supporting multiple hugetlbfs page sizes at once. - Rather than a single hstate, we now have an array, with an iterator - default_hstate continues to be the struct hstate which we use by default - Add functions for architectures to register new hstates [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	a551643895	hugetlb: modular state for hugetlb page size The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Nishanth Aravamudan	ff7ea79cf7	mm: create /sys/kernel/mm Add a kobject to create /sys/kernel/mm when sysfs is mounted. The kobject will exist regardless. This will allow for the hugepage related sysfs directories to exist under the mm "subsystem" directory. Add an ABI file appropriately. [kosaki.motohiro@jp.fujitsu.com: fix build] Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andy Whitcroft	cdfd4325c0	mm: record MAP_NORESERVE status on vmas and fix small page mprotect reservations With Mel's hugetlb private reservation support patches applied, strict overcommit semantics are applied to both shared and private huge page mappings. This can be a problem if an application relied on unlimited overcommit semantics for private mappings. An example of this would be an application which maps a huge area with the intention of using it very sparsely. These applications would benefit from being able to opt-out of the strict overcommit. It should be noted that prior to hugetlb supporting demand faulting all mappings were fully populated and so applications of this type should be rare. This patch stack implements the MAP_NORESERVE mmap() flag for huge page mappings. This flag has the same meaning as for small page mappings, suppressing reservations for that mapping. Thanks to Mel Gorman for reviewing a number of early versions of these patches. This patch: When a small page mapping is created with mmap() reservations are created by default for any memory pages required. When the region is read/write the reservation is increased for every page, no reservation is needed for read-only regions (as they implicitly share the zero page). Reservations are tracked via the VM_ACCOUNT vma flag which is present when the region has reservation backing it. When we convert a region from read-only to read-write new reservations are acquired and VM_ACCOUNT is set. However, when a read-only map is created with MAP_NORESERVE it is indistinguishable from a normal mapping. When we then convert that to read/write we are forced to incorrectly create reservations for it as we have no record of the original MAP_NORESERVE. This patch introduces a new vma flag VM_NORESERVE which records the presence of the original MAP_NORESERVE flag. This allows us to distinguish these two circumstances and correctly account the reserve. 
As well as fixing this FIXME in the code, this makes it much easier to introduce MAP_NORESERVE support for huge pages as this flag is available consistently for the life of the mapping. VM_ACCOUNT on the other hand is heavily used at the generic level in association with small pages. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Adam Litke <agl@us.ibm.com> Cc: Johannes Weiner <hannes@saeurebad.de> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Mel Gorman	04f2cbe356	hugetlb: guarantee that COW faults for a process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed After patch 2 in this series, a process that successfully calls mmap() for a MAP_PRIVATE mapping will be guaranteed to successfully fault until a process calls fork(). At that point, the next write fault from the parent could fail due to COW if the child still has a reference. We only reserve pages for the parent but a copy must be made to avoid leaking data from the parent to the child after fork(). Reserves could be taken for both parent and child at fork time to guarantee faults but if the mapping is large it is highly likely we will not have sufficient pages for the reservation, and it is common to fork only to exec() immediately after. A failure here would be very undesirable. Note that the current behaviour of mainline with MAP_PRIVATE pages is pretty bad. The following situation is allowed to occur today. 1. Process calls mmap(MAP_PRIVATE) 2. Process calls mlock() to fault all pages and makes sure it succeeds 3. Process forks() 4. Process writes to MAP_PRIVATE mapping while child still exists 5. If the COW fails at this point, the process gets SIGKILLed even though it had taken care to ensure the pages existed This patch improves the situation by guaranteeing the reliability of the process that successfully calls mmap(). When the parent performs COW, it will try to satisfy the allocation without using reserves. If that fails the parent will steal the page leaving any children without a page. Faults from the child after that point will result in failure. If the child COW happens first, an attempt will be made to allocate the page without reserves and the child will get SIGKILLed on failure. To summarise the new behaviour: 1. If the original mapper performs COW on a private mapping with multiple references, it will attempt to allocate a hugepage from the pool or the buddy allocator without using the existing reserves. 
On fail, VMAs mapping the same area are traversed and the page being COW'd is unmapped where found. It will then steal the original page as the last mapper in the normal way. 2. The VMAs the pages were unmapped from are flagged to note that pages with data no longer exist. Future no-page faults on those VMAs will terminate the process as otherwise it would appear that data was corrupted. A warning is printed to the console that this situation occurred. 3. If the child performs COW first, it will attempt to satisfy the COW from the pool if there are enough pages or via the buddy allocator if overcommit is allowed and the buddy allocator can satisfy the request. If it fails, the child will be killed. If the pool is large enough, existing applications will not notice that the reserves were a factor. Existing applications depending on the no-reserves being set are unlikely to exist as for much of the history of hugetlbfs, pages were prefaulted at mmap(), allocating the pages at that point or failing the mmap(). [npiggin@suse.de: fix CONFIG_HUGETLB=n build] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Mel Gorman	a1e78772d7	hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork() This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in a similar manner to the reservations taken for MAP_SHARED mappings. The reserve count is accounted both globally and on a per-VMA basis for private mappings. This guarantees that a process that successfully calls mmap() will successfully fault all pages in the future unless fork() is called. The characteristics of private mappings of hugetlbfs files behaviour after this patch are; 1. The process calling mmap() is guaranteed to succeed all future faults until it forks(). 2. On fork(), the parent may die due to SIGKILL on writes to the private mapping if enough pages are not available for the COW. For reasonably reliable behaviour in the face of a small huge page pool, children of hugepage-aware processes should not reference the mappings; such as might occur when fork()ing to exec(). 3. On fork(), the child VMAs inherit no reserves. Reads on pages already faulted by the parent will succeed. Successful writes will depend on enough huge pages being free in the pool. 4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper and at fault time otherwise. Before this patch, all reads or writes in the child potentially needs page allocations that can later lead to the death of the parent. This applies to reads and writes of uninstantiated pages as well as COW. After the patch it is only a write to an instantiated page that causes problems. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Johannes Weiner	9109fb7b35	mm: drop unneeded pgdat argument from free_area_init_node() free_area_init_node() gets passed in the node id as well as the node descriptor. This is redundant as the function can trivially get the node descriptor itself by means of NODE_DATA() and the node's id. I checked all the users and NODE_DATA() seems to be usable everywhere from where this function is called. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Andrew Morton	2185e69f68	mapping_set_error: add unlikely() This is called on a per-page basis and in the vast majority of cases `error' is zero. Cc: Guillaume Chazarain <guichaz@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	9023cb7e85	slob: record page flag overlays explicitly SLOB reuses two page bits for internal purposes, it overlays PG_active and PG_private. This is hidden away in slob.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	8a38082d21	slub: record page flag overlays explicitly SLUB reuses two page bits for internal purposes, it overlays PG_active and PG_error. This is hidden away in slub.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	0cad47cf13	page-flags: record page flag overlays explicitly With the recent page flag reorganisation we have a single enum which defines the valid page flags and their values, nice and clear. However there are a number of bits which are overloaded by different subsystems. Firstly there is PG_owner_priv_1 which is used by filesystems and by XEN. Secondly both SLOB and SLUB use a couple of extra page bits to manage internal state for pages they own; both overlay other bits. All of these "aliases" are scattered about the source making it very hard for a reader to know if the bits are safe to rely on in all contexts; confusion here is bad. As we now have a single place where the bits are clearly assigned it makes sense to clarify the reuse of bits by making the aliases explicit and visible with the original bit assignments. This patch creates explicit aliases within the enum itself for the overloaded bits, creates standard bit accessors PageFoo etc. and uses those throughout. This version pulls the bit manipulation out to standard named page bit accessors as suggested by Christoph, it retains the explicit mapping to the overlaid bits. A fusion of both ideas. SLUB and SLOB have been compile tested on x86_64 only, and SLUB boot tested. If people feel this is worth doing then I can run a fuller set of testing. This patch: Some page flags are used for more than one purpose, for example PG_owner_priv_1. Currently there are individual accessors for each user, each built using the common flag name far away from the bit definitions. This makes it hard to see all possible uses of these bits. Now that we have a single enum to generate the bit orders it makes sense to express overlays in the same place. So create per use aliases for this bit in the main page-flags enum and use those in the accessors. 
[akpm@linux-foundation.org: fix xen] Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Kentaro Makita	da3bbdd463	fix soft lock up at NFS mount via per-SB LRU-list of unused dentries [Summary] Split LRU-list of unused dentries to one per superblock to avoid soft lock up during NFS mounts and remounting of any filesystem. Previously I posted here: http://lkml.org/lkml/2008/3/5/590 [Descriptions] - background dentry_unused is a list of dentries which are not referenced. dentry_unused grows up when references on directories or files are released. This list can be very long if there is huge free memory. - the problem When shrink_dcache_sb() is called, it scans all dentry_unused linearly under spin_lock(), and if dentry->d_sb is different from given superblock, scan next dentry. This scan costs very much if there are many entries, and very ineffective if there are many superblocks. IOW, When we need to shrink unused dentries on one dentry, but scans unused dentries on all superblocks in the system. For example, we scan 500 dentries to unmount a filesystem, but scans 1,000,000 or more unused dentries on other superblocks. In our case, at mounting NFS, shrink_dcache_sb() is called to shrink unused dentries on NFS, but scans 100,000,000 unused dentries on superblocks in the system such as local ext3 filesystems. I hear NFS mounting took 1 min on some system in use. : NFS uses virtual filesystem in rpc layer, so NFS is affected by this problem. 100,000,000 is possible number on large systems. Per-superblock LRU of unused dentries can reduce the cost in reasonable manner. - How to fix I found this problem is solved by David Chinner's "Per-superblock unused dentry LRU lists V3"(1), so I rebase it and add some fix to reclaim with fairness, which is in Andrew Morton's comments(2). 1) http://lkml.org/lkml/2006/5/25/318 2) http://lkml.org/lkml/2006/5/25/320 Split LRU-list of unused dentries to each superblocks. Then, NFS mounting will check dentries under a superblock instead of all. But this splitting will break LRU of dentry-unused. 
So, I've attempted to make reclaim unused dentries with fairness by calculate number of dentries to scan on this sb based on following way number of dentries to scan on this sb = count * (number of dentries on this sb / number of dentries in the machine) - ToDo - I have to measure performance numbers and do stress tests. - When unmount occurs during prune_dcache(), scanning on same superblock, It is unable to reach next superblock because it is gone away. We restart scanning superblock from first one, it causes unfairness of reclaim unused dentries on first superblock. But I think this happens very rarely. - Test Results Result on 6GB boxes with excessive unused dentries. Without patch: $ cat /proc/sys/fs/dentry-state 10181835 10180203 45 0 0 0 # mount -t nfs 10.124.60.70:/work/kernel-src nfs real 0m1.830s user 0m0.001s sys 0m1.653s With this patch: $ cat /proc/sys/fs/dentry-state 10236610 10234751 45 0 0 0 # mount -t nfs 10.124.60.70:/work/kernel-src nfs real 0m0.106s user 0m0.002s sys 0m0.032s [akpm@linux-foundation.org: fix comments] Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: David Chinner <dgc@sgi.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Jan Beulich	42b7772812	mm: remove double indirection on tlb parameter to free_pgd_range() & Co The double indirection here is not needed anywhere and hence (at least) confusing. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Rik van Riel	28b2ee20c7	access_process_vm device memory infrastructure In order to be able to debug things like the X server and programs using the PPC Cell SPUs, the debugger needs to be able to access device memory through ptrace and /proc/pid/mem. This patch: Add the generic_access_phys access function and put the hooks in place to allow access_process_vm to access device or PPC Cell SPU memory. [riel@redhat.com: Add documentation for the vm_ops->access function] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Nick Piggin	0d71d10a42	mm: remove nopfn There are no users of nopfn in the tree. Remove it. [hugh@veritas.com: fix build error] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Adrian Bunk	c748e1340e	mm/vmstat.c: proper externs This patch adds proper extern declarations for five variables in include/linux/vmstat.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
KOSAKI Motohiro	e4048e5dc4	page allocator: inline some __alloc_pages() wrappers Two zonelist patch series rewrote __page_alloc() largely. Now, it is just a wrapper function. Inlining them will save a function call. [akpm@linux-foundation.org: export __alloc_pages_internal] Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Johannes Weiner	ffc6421f07	mm: unexport __alloc_bootmem_core() This function has no external callers, so unexport it. Also fix its naming inconsistency. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Johannes Weiner	b61bfa3c46	mm: move bootmem descriptors definition to a single place There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
FUJITA Tomonori	8b05c7e6e1	add a helper function to test if an object is on the stack lib/debugobjects.c has a function to test if an object is on the stack. The block layer and ide need it (they need to avoid DMA from/to stack buffers). This patch moves the function to include/linux/sched.h so that everyone can use it. lib/debugobjects.c uses current->stack but this patch uses a task_stack_page() accessor, which is a preferable way to access the stack. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
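The on-stack test described in the message above comes down to a range check against the base of the stack area. The following is a userspace sketch of that idea only: addr_in_region() is a hypothetical name, and the kernel helper works on the current task's stack page and THREAD_SIZE rather than on an arbitrary region.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of an "is this object on the stack"
 * test: report whether `obj` lies inside the region of `size` bytes
 * that starts at `base`. Comparing offsets instead of end addresses
 * avoids overflow when base + size would wrap around.
 */
static int addr_in_region(const void *obj, const void *base, size_t size)
{
    uintptr_t p = (uintptr_t)obj;
    uintptr_t start = (uintptr_t)base;

    return p >= start && p - start < size;
}
```

A driver-style caller would run such a check on a buffer pointer before setting up DMA and fall back to a bounce buffer when the check reports a stack address.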
Akinobu Mita	e108526e77	move memory_read_from_buffer() from fs.h to string.h James Bottomley warns that inclusion of linux/fs.h in a low level driver was always a danger signal. This patch moves memory_read_from_buffer() from fs.h to string.h and fixes includes in existing memory_read_from_buffer() users. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Bob Moore <robert.moore@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:13 -07:00
Matthew Wilcox	b552068999	Remove __DECLARE_SEMAPHORE_GENERIC There are no users of __DECLARE_SEMAPHORE_GENERIC in the kernel Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-07-24 08:31:21 -04:00
Artem Bityutskiy	85c6e6e282	UBI: amend commentaries Hch asked not to use "unit" for sub-systems, let it be so. Also some other commentaries modifications. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:56 +03:00
Artem Bityutskiy	a5bf619041	UBI: add ubi_sync() interface To flush MTD device caches. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:56 +03:00
Linus Torvalds	7f9dce3837	Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: hrtick_enabled() should use cpu_active() sched, x86: clean up hrtick implementation sched: fix build error, provide partition_sched_domains() unconditionally sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed cpu hotplug: Make cpu_active_map synchronization dependency clear cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2) sched: rework of "prioritize non-migratable tasks over migratable ones" sched: reduce stack size in isolated_cpu_setup() Revert parts of "ftrace: do not trace scheduler functions" Fixed up conflicts in include/asm-x86/thread_info.h (due to the TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr() introduction).	2008-07-23 19:36:53 -07:00
Linus Torvalds	26dcce0fab	Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually	2008-07-23 18:37:44 -07:00
Linus Torvalds	d7b6de14a0	Merge branch 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softlockup: fix invalid proc_handler for softlockup_panic softlockup: fix watchdog task wakeup frequency softlockup: fix watchdog task wakeup frequency softlockup: show irqtrace softlockup: print a module list on being stuck softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds softlockup: fix softlockup_thresh fix softlockup: fix softlockup_thresh unaligned access and disable detection at runtime softlockup: allow panic on lockup	2008-07-23 18:34:13 -07:00
Linus Torvalds	30d38542ec	Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm * 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (85 commits) [ARM] pxa: add base support for PXA930 Handheld Platform (aka SAAR) [ARM] pxa: add base support for PXA930 Evaluation Board (aka TavorEVB) [ARM] pxa: add base support for PXA930 (aka Tavor-P) [ARM] Update mach-types [ARM] pxa: make littleton to use the new smc91x platform data [ARM] pxa: make zylonite to use the new smc91x platform data [ARM] pxa: make mainstone to use the new smc91x platform data [ARM] pxa: make lubbock to use new smc91x platform data [NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data [NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable [NET] smc91x: add SMC91X_NOWAIT flag to platform data [NET] smc91x: favor the use of SMC91X_USE_* instead of SMC_CAN_USE_* [NET] smc91x: remove "irq_flags" from "struct smc91x_platdata" [ARM] 5146/1: pxa2xx: convert all boards to call pxa2xx_transceiver_mode helper Support for LCD on e740 e750 e400 and e800 e-series PDAs E-series UDC support PXA UDC - allow use of inverted GPIO for pullup Add e350 support Fix broken e-series build E-series GPIO / IRQ definitions. ...	2008-07-23 18:24:08 -07:00
Dan Williams	d8e64406a0	md: delay notification of 'active_idle' to the recovery thread sysfs_notify might sleep, so do not call it from md_safemode_timeout. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-23 13:09:48 -07:00
Linus Torvalds	20b7997e8a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: sdhci: highmem capable PIO routines sg: reimplement sg mapping iterator mmc_test: print message when attaching to card mmc: Remove Russell as primecell mci maintainer mmc_block: bounce buffer highmem support sdhci: fix bad warning from commit `c8b3e02` sdhci: add warnings for bad buffers in ADMA path mmc_test: test oversized sg lists mmc_test: highmem tests s3cmci: ensure host stopped on machine shutdown au1xmmc: suspend/resume implementation s3cmci: fixes for section mismatch warnings pxamci: trivial fix of DMA alignment register bit clearing	2008-07-23 12:04:34 -07:00
Linus Torvalds	5554b35933	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (24 commits) I/OAT: I/OAT version 3.0 support I/OAT: tcp_dma_copybreak default value dependent on I/OAT version I/OAT: Add watchdog/reset functionality to ioatdma iop_adma: cleanup iop_chan_xor_slot_count iop_adma: document how to calculate the minimum descriptor pool size iop_adma: directly reclaim descriptors on allocation failure async_tx: make async_tx_test_ack a boolean routine async_tx: remove depend_tx from async_tx_sync_epilog async_tx: export async_tx_quiesce async_tx: fix handling of the "out of descriptor" condition in async_xor async_tx: ensure the xor destination buffer remains dma-mapped async_tx: list_for_each_entry_rcu() cleanup dmaengine: Driver for the Synopsys DesignWare DMA controller dmaengine: Add slave DMA interface dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap dmaengine: Add dma_client parameter to device_alloc_chan_resources dmatest: Simple DMA memcpy test client dmaengine: DMA engine driver for Marvell XOR engine iop-adma: fix platform driver hotplug/coldplug dmaengine: track the number of clients using a channel ... Fixed up conflict in drivers/dca/dca-sysfs.c manually	2008-07-23 12:03:18 -07:00
Linus Torvalds	e669e8179d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (60 commits) ide: small whitespace fixes ide: ide-cd_ioctl.c fix sparse integer as NULL pointer warnings ide: ide-cd.c fix sparse endianness warnings ide-cd: convert to using the new atapi_flags ide: remove unused PC_FLAG_DRQ_INTERRUPT ide-scsi: convert to using the new atapi_flags ide-tape: convert to using the new atapi_flags ide-floppy: convert to using the new atapi_flags (take 2) ide: add per-device flags ide: use rq->cmd instead of pc->c in atapi common code ide-scsi: pass packet command in rq->cmd ide-tape: pass packet command in rq->cmd ide-tape: make room for packet command ids in rq->cmd ide-floppy: pass packet command in rq->cmd ide: remove pc->callback member from ide_atapi_pc ide-scsi: use drive->pc_callback instead of pc->callback ide-tape: use drive->pc_callback instead of pc->callback ide-floppy: use drive->pc_callback instead of pc->callback ide: push pc callback pointer into the ide_drive_t structure drivers/ide/ide-tape.c: remove double kfree ...	2008-07-23 11:59:09 -07:00
Dmitry Torokhov	a822bea796	Input: serio - mark serio_register_driver() __must_check Also remove extra declaration of serio_register_driver(). Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-23 14:01:49 -04:00
Borislav Petkov	ac77ef8b03	ide: remove unused PC_FLAG_DRQ_INTERRUPT There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	ea68d270ff	ide-floppy: convert to using the new atapi_flags (take 2) while at it, remove PC_FLAG_ZIP_DRIVE from the packed command flags altogether and query the drive type through drive->atapi_flags. v2: ide-floppy fix. There should be no functionality change resulting from this patch. [bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags] Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	3b8ac5398c	ide: add per-device flags Push device flags up into ide_drive_t. There should be no functionality change resulting from this patch. [bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags] Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	8bcda3bc49	ide: remove pc->callback member from ide_atapi_pc There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:00 +02:00
Borislav Petkov	d7c26ebb5b	ide: push pc callback pointer into the ide_drive_t structure Refrain from carrying the callback ptr with every packet command since the callback function is only one anyways. ide_drive_t is probably not the most suitable place for it right now but is the more sane solution. Besides, these structs are going to be reorganized anyways during the generic ide rewrite. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz	8a69580e1e	ide: add ide_host_free() helper (take 2) * Add ide_host_free() helper and convert ide_host_remove() to use it. * Fix handling of ide_host_register() failure in ide_host_add(), icside.c, ide-generic.c, falconide.c and sgiioc4.c. While at it: * Fix handling of ide_host_alloc_all() failure in ide-generic.c. * Fix handling of ide_host_alloc() failure in falconide.c (also return the correct error value if no device is found). v2: * falconide build fix. (From Stephen Rothwell) Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz	6f904d0152	ide: add ide_host_add() helper Add ide_host_add() helper which does ide_host_alloc()+ide_host_register(), then convert ide_setup_pci_device[s](), ide_legacy_device_add() and some host drivers to use it. While at it: * Fix ide_setup_pci_device[s](), ide_arm.c, gayle.c, ide-4drives.c, macide.c, q40ide.c, cmd640.c and cs5520.c to return correct error value. * -ENOENT -> -ENOMEM in rapide.c, ide-h8300.c, ide-generic.c, au1xxx-ide.c and pmac.c * -ENODEV -> -ENOMEM in palm_bk3710.c, ide_platform.c and delkin_cb.c * -1 -> -ENOMEM in ide-pnp.c Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz	48c3c10726	ide: add struct ide_host (take 3) * Add struct ide_host which keeps pointers to host's ports. * Add ide_host_alloc[_all]() and ide_host_remove() helpers. * Pass 'struct ide_host host' instead of 'u8 idx' to ide_device_add[_all]() and rename it to ide_host_register[_all](). * Convert host drivers and core code to use struct ide_host. * Remove no longer needed ide_find_port(). * Make ide_find_port_slot() static. * Unexport ide_unregister(). v2: * Add missing 'struct ide_host host' to macide.c. v3: Fix build problem in pmac.c (s/ide_alloc_host/ide_host_alloc/) (Noticed by Stephen Rothwell). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz	374e042c3e	ide: add struct ide_tp_ops (take 2) * Add struct ide_tp_ops for transport methods. * Add 'const struct ide_tp_ops tp_ops' to struct ide_port_info and ide_hwif_t. Set the default hwif->tp_ops in ide_init_port_data(). * Set host driver specific hwif->tp_ops in ide_init_port(). * Export ide_exec_command(), ide_read_status(), ide_read_altstatus(), ide_read_sff_dma_status(), ide_set_irq(), ide_tf_{load,read}() and ata_{in,out}put_data(). * Convert host drivers and core code to use struct ide_tp_ops. * Remove no longer needed default_hwif_transport(). * Cleanup ide_hwif_t from methods that are now in struct ide_tp_ops. While at it: * Use struct ide_port_info in falconide.c and q40ide.c. * Rename ata_{in,out}put_data() to ide_{in,out}put_data(). v2: * Fix missing conversion in ns87415.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	d6276b5f5c	ide: add 'config' field to hw_regs_t Add 'config' field to hw_regs_t and use it to set hwif->config_data in ide_init_port_hw(), then convert ide_legacy_init_one() to use hw->config. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	3b2a5c7149	ide: filter out "default" transfer mode values in set_xfer_rate() * Filter out "default" transfer mode values (0x00 - default PIO mode, 0x01 - default PIO mode w/ IORDY disabled) in write handler for obsoleted /proc/ide/hd?/settings:current_speed setting. Allowing "default" transfer mode values is a dangerous thing to do as we don't support programming controller to the "default" transfer mode and devices often use different values for the default and maximum PIO mode (i.e. PIO2 default and PIO4 maximum) so the controller will stay programmed for higher PIO mode while device will use the lower PIO mode. There is no functionality loss as by using special IOCTLs device can still be programmed to "default" transfer modes (it is only useful for debugging/testing purposes anyway). * Remove no longer needed IDE_HFLAG_ABUSE_SET_DMA_MODE host flag, it was previously used by few host drivers to program the controller to PIO0 timings for "default" transfer mode == 0x01 (although some host drivers would program invalid PIO timings instead). * Cleanup ide_set_xfer_rate() and add BUG_ON(). Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	ba4b2e607e	ide: remove dead Virtual DMA support Let's remove dead Virtual DMA support for now so it doesn't clutter core IDE code (it can be brought back when there is a need for it): * Remove IDE_HFLAG_VDMA host flag. * Remove ide_drive_t.vdma flag. * cs5520.c: remove stale FIXMEs, cs5520_dma_host_set() and cs5520_dma_ops (also there is no longer a need to set IDE_HFLAG_NO_ATAPI_DMA). There should be no functional changes caused by this patch. Cc: TAKADA Yoshihito <takada@mbf.nifty.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:55 +02:00
Bartlomiej Zolnierkiewicz	761052e676	ide: remove ->INB, ->OUTB and ->OUTBSYNC methods * Remove no longer needed ->INB, ->OUTB and ->OUTBSYNC methods. Then: * Remove no longer used default_hwif_[mm]iops() and ide_[mm_]outbsync(). * Cleanup SuperIO handling in ns87415.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz	1823649b5a	ide: add ide_read_bcount_and_ireason() helper Add ide_read_bcount_and_ireason() helper and use it instead of ->INB in {cdrom_newpc,ide_pc}_intr(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz	92eb43800a	ide: use ->tf_read in ide_read_error() * Add IDE_TFLAG_IN_FEATURE taskfile flag for reading Feature register and handle it in ->tf_read. * Convert ide_read_error() to use ->tf_read instead of ->INB, then uninline and export it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:53 +02:00
Bartlomiej Zolnierkiewicz	6e6afb3b74	ide: add ->set_irq method Add ->set_irq method for setting nIEN bit of ATA Device Control register and use it instead of ide_set_irq(). While at it: * Use ->set_irq in init_irq() and do_reset1(). * Don't use HWIF() macro in ide_check_pm_state(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	1f6d8a0fd8	ide: add ->read_altstatus method * Remove ide_read_altstatus() inline helper. * Add ->read_altstatus method for reading ATA Alternate Status register and use it instead of ->INB. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	b73c7ee25d	ide: add ->read_status method * Remove ide_read_status() inline helper. * Add ->read_status method for reading ATA Status register and use it instead of ->INB. While at it: * Don't use HWGROUP() macro. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	c6dfa867bb	ide: add ->exec_command method Add ->exec_command method for writing ATA Command register and use it instead of ->OUTBSYNC. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	ebb00fb55d	ide: factor out simplex handling from ide_pci_dma_base() * Factor out simplex handling from ide_pci_dma_base() to ide_pci_check_simplex(). * Set hwif->dma_base early in ->init_dma method / ide_hwif_setup_dma() and reset it in ide_init_port() if DMA initialization fails. * Use ->read_sff_dma_status instead of ->INB in ide_pci_dma_base(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	81e8d5a34f	ide: remove ide_setup_dma() Export sff_dma_ops and then remove ide_setup_dma(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	cab7f8eda4	ide: remove ->dma_{status,command} fields from ide_hwif_t * Use ->dma_base + offset instead of ->dma_{status,command} and remove no longer needed ->dma_{status,command}. While at it: * Use ATA_DMA_* defines. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	b2f951aabc	ide: add ->read_sff_dma_status method Add ->read_sff_dma_status method for reading DMA Status register and use it instead of ->INB. While at it: * Use inb() directly in ns87415.c::ns87415_dma_end(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:50 +02:00
Bartlomiej Zolnierkiewicz	c97c6aca75	ide: pass hw_regs_t-s to ide_device_add[_all]() (take 3) * Add 'hw_regs_t *hws' argument to ide_device_add[_all]() and convert host drivers + ide_legacy_init_one() + ide_setup_pci_device[s]() to use it instead of calling ide_init_port_hw() directly. [ However if host has > 1 port we must still set hwif->chipset to hint consecutive ide_find_port() call that the previous slot is occupied. ] Unexport ide_init_port_hw(). v2: * Use defines instead of hard-coded values in buddha.c, gayle.c and q40ide.c. (Suggested by Geert Uytterhoeven) * Better patch description. v3: * Fix build problem in ide-cs.c. (Noticed by Stephen Rothwell) There should be no functional changes caused by this patch. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:50 +02:00
Roland Dreier	95d04f0735	IB/mlx4: Add support for memory management extensions and local DMA L_Key Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-23 08:12:26 -07:00
Rafi Rubin	f472f80034	HID: add n-trig digitizer usage This adds a hid usage that is reported by the N-Trig digitizer in the Dell Latitude XT screen. Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu> Signed-off-by: Vojtech Pavlik <vojtech@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-07-23 15:25:21 +02:00
Tejun Heo	137d3edb48	sg: reimplement sg mapping iterator This is an alternative implementation of the sg content iterator introduced by commit 83e7d317... from Pierre Ossman in next-20080716. As there's already an sg iterator which iterates over sg entries themselves, name this sg_mapping_iterator. Slightly edited description from the original implementation follows. Iteration over a sg list is not that trivial when you take into account that memory pages might have to be mapped before being used. Unfortunately, that means that some parts of the kernel restrict themselves to directly accessible memory just to not have to deal with the mess. This patch adds a simple iterator system that allows any code to easily traverse an sg list and not have to deal with all the details. The user can decide to consume part of the iteration. Also, iteration can be stopped and resumed later if releasing the kmap between iteration steps is necessary. These features are useful to implement piecemeal sg copying for interrupt-driven PIO for example. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-23 14:42:09 +02:00
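The two features the message above highlights, partial consumption and stop/resume, can be sketched in plain userspace C. This is an illustration only and not the kernel's sg mapping iterator: struct seg, struct seg_iter and the seg_* functions are hypothetical names, and page mapping (kmap) is left out entirely.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
    const char *buf;
    size_t len;
};

/* Iterator state: which segment we are in and how much of it is used. */
struct seg_iter {
    const struct seg *segs;
    size_t nsegs;
    size_t idx;   /* index of the current segment */
    size_t off;   /* bytes already consumed in the current segment */
};

static void seg_iter_start(struct seg_iter *it,
                           const struct seg *segs, size_t nsegs)
{
    it->segs = segs;
    it->nsegs = nsegs;
    it->idx = 0;
    it->off = 0;
}

/* Expose the unconsumed rest of the current segment; 0 means "done". */
static int seg_iter_peek(struct seg_iter *it, const char **addr, size_t *len)
{
    /* Skip segments that are empty or fully consumed. */
    while (it->idx < it->nsegs && it->off == it->segs[it->idx].len) {
        it->idx++;
        it->off = 0;
    }
    if (it->idx == it->nsegs)
        return 0;

    *addr = it->segs[it->idx].buf + it->off;
    *len = it->segs[it->idx].len - it->off;
    return 1;
}

/* Mark n bytes as used; n must not exceed the length seen via peek. */
static void seg_iter_consume(struct seg_iter *it, size_t n)
{
    it->off += n;
}

/*
 * Copy up to dst_len bytes, at most max_step bytes per step, and return
 * the number copied. The iterator keeps its position, so a later call
 * resumes where this one stopped.
 */
static size_t seg_copy(struct seg_iter *it, char *dst, size_t dst_len,
                       size_t max_step)
{
    size_t done = 0;
    const char *addr;
    size_t len;

    if (max_step == 0)
        return 0;

    while (done < dst_len && seg_iter_peek(it, &addr, &len)) {
        size_t n = len;

        if (n > max_step)
            n = max_step;
        if (n > dst_len - done)
            n = dst_len - done;
        memcpy(dst + done, addr, n);
        seg_iter_consume(it, n);
        done += n;
    }
    return done;
}
```

Calling seg_copy() twice on the same iterator, first for a few bytes and then for the remainder, models the piecemeal copying an interrupt-driven PIO handler would do across several interrupts.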
Nate Case	f46e9203d9	leds: Add support for Philips PCA955x I2C LED drivers This driver supports the PCA9550, PCA9551, PCA9552, and PCA9553 LED driver chips. Signed-off-by: Nate Case <ncase@xes-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Anton Vorontsov	781a54e766	leds: mark led_classdev.default_trigger as const LED classdev core doesn't modify memory pointed to by the default_trigger, so mark it as const and we'll be able to pass const char *s without getting compiler warnings. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Riku Voipio	e14fa82439	leds: Add pca9532 led driver NXP pca9532 is a LED dimmer/controller attached to i2c bus. It allows attaching up to 16 leds which can either be on, off or dimmed and/or blinked with the two PWM modulators available. This driver is a "new-style" i2c driver that adheres to the driver model and implements the led framework api. Since the leds connected to the driver are platform specific, it is only useful when platform data is passed to the driver to define what leds are connected to which pins. Signed-off-by: Riku Voipio <riku.voipio@iki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Linus Torvalds	c010b2f76c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits) ipw2200: Call netif_*_queue() interfaces properly. netxen: Needs to include linux/vmalloc.h [netdrvr] atl1d: fix !CONFIG_PM build r6040: rework init_one error handling r6040: bump release number to 0.18 r6040: handle RX fifo full and no descriptor interrupts r6040: change the default waiting time r6040: use definitions for magic values in descriptor status r6040: completely rework the RX path r6040: call napi_disable when puting down the interface and set lp->dev accordingly. mv643xx_eth: fix NETPOLL build r6040: rework the RX buffers allocation routine r6040: fix scheduling while atomic in r6040_tx_timeout r6040: fix null pointer access and tx timeouts r6040: prefix all functions with r6040 rndis_host: support WM6 devices as modems at91_ether: use netstats in net_device structure sfc: Create one RX queue and interrupt per CPU package by default sfc: Use a separate workqueue for resets sfc: I2C adapter initialisation fixes ...	2008-07-22 19:09:51 -07:00
David S. Miller	7cf75262a4	Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-07-22 17:54:47 -07:00
Maciej Sosnowski	7f1b358a23	I/OAT: I/OAT version 3.0 support This patch adds to ioatdma and dca modules support for Intel I/OAT DMA engine ver.3 (aka CB3 device). The main features of I/OAT ver.3 are: * 8 single channel DMA devices (8 channels total) * 8 DCA providers, each can accept 2 requesters * 8-bit TAG values and 32-bit extended APIC IDs Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-22 17:30:57 -07:00
Rafael J. Wysocki	e5899e1b7d	PCI PM: make more PCI PM core functionality available to drivers Make more PCI PM core functionality available to drivers * Export pci_pme_capable() so that it can be called directly by drivers (for example, tg3 needs that). * Move the state choosing part of pci_prepare_to_sleep() to a separate function, pci_target_state(), that can be called directly by drivers (for example, tg3 needs that). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-22 14:25:38 -07:00
Roland Dreier	47b374752a	IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg Make the struct name consistent with other WQE segment struct types defined in <linux/mlx4/qp.h>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:19:39 -07:00
Adrian Bunk	8086cd451f	netns: make get_proc_net() static get_proc_net() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-22 14:19:19 -07:00
Dave Jones	d29f749e25	net: Fix build failure with 'make mandocs'. The function header comments have to go with the functions they are documenting, or things go horribly wrong when we try to process them with the docbook tools. Warning(include/linux/netdevice.h:1006): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1033): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1067): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1093): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1474): No description found for parameter 'txq' Error(net/core/dev.c:1674): cannot understand prototype: 'u32 simple_tx_hashrnd; ' Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-22 14:09:06 -07:00
Linus Torvalds	6eaaaac974	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: remove CONFIG_KMOD from core kernel code remove CONFIG_KMOD from lib remove CONFIG_KMOD from sparc64 rework try_then_request_module to do less in non-modular kernels remove mention of CONFIG_KMOD from documentation make CONFIG_KMOD invisible modules: Take a shortcut for checking if an address is in a module module: turn longs into ints for module sizes Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs module: reorder struct module to save space on 64 bit builds module: generic each_symbol iterator function module: don't use stop_machine for waiting rmmod	2008-07-22 13:17:15 -07:00
Linus Torvalds	06b8147c5d	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (49 commits) powerpc: Fix build bug with binutils < 2.18 and GCC < 4.2 powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded fbdev: Teaches offb about palette on radeon r5xx/r6xx powerpc/cell/edac: Log a syndrome code in case of correctable error powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code powerpc: Indicate which oprofile counters to use while in compat mode powerpc/boot: Change spaces to tabs powerpc: Remove duplicate 6xx option in Kconfig powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S powerpc: Use PPC_LONG_ALIGN in uaccess.h powerpc: Add a #define for aligning to a long-sized boundary powerpc: Fix OF parsing of 64 bits PCI addresses powerpc: Use WARN_ON(1) instead of __WARN() powerpc: Fix support for latencytop powerpc/ps3: Update ps3_defconfig powerpc/ps3: Add a sub-match id to ps3_system_bus powerpc: Add a 6xx defconfig powerpc/dma: Use the struct dma_attrs in iommu code powerpc/cell: Add support for power button of future IBM cell blades powerpc/cell: Cleanup sysreset_hack for IBM cell blades ...	2008-07-22 13:16:01 -07:00
Linus Torvalds	53baaaa968	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (79 commits) arm: bus_id -> dev_name() and dev_set_name() conversions sparc64: fix up bus_id changes in sparc core code 3c59x: handle pci_name() being const MTD: handle pci_name() being const HP iLO driver sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute sysdev: Add utility functions for simple int/ulong variable sysdev attributes sysdev: Pass the attribute to the low level sysdev show/store function driver core: Suppress sysfs warnings for device_rename(). kobject: Transmit return value of call_usermodehelper() to caller sysfs-rules.txt: reword API stability statement debugfs: Implement debugfs_remove_recursive() HOWTO: change email addresses of James in HOWTO always enable FW_LOADER unless EMBEDDED=y uio-howto.tmpl: use unique output names uio-howto.tmpl: use standard copyright/legal markings sysfs: don't call notify_change sysdev: fix debugging statements in registration code. kobject: should use kobject_put() in kset-example kobject: reorder kobject to save space on 64 bit builds ...	2008-07-22 13:13:47 -07:00
Laurent Pinchart	217d5a5195	fs_enet: Remove unused fields in the fs_mii_bb_platform_info structure. The mdio_port, mdio_bit, mdc_port and mdc_bit fields in the fs_mii_bb_platform_info structure are left-overs from the move to the Phy Abstraction Layer subsystem. They are not used anymore and can be safely removed. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-22 16:08:58 -04:00
Paul Fulghum	e5590717af	synclink_gt: add serial bit order control Add control of hardware serial bit order between LSB first (default/standard) and MSB first. Signed-off-by: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:29 -07:00
Alan Cox	9e98966c7b	tty: rework break handling Some hardware needs to do break handling itself and may have partial support only. Make break_ctl return an error code. Add a tty driver flag so you can indicate driver hardware side break support. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:28 -07:00
Alan Cox	01e1abb2c2	tty: Split ldisc code into its own file Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:27 -07:00
Alan Cox	95da310e66	usb_serial: API all change USB serial likes to use port->tty back pointers for the real work it does and to do so without any actual locking. Unfortunately when you consider hangup events, hangup/parallel reopen or even worse hangup followed by parallel close events the tty->port and port->tty pointers are not guaranteed to be the same as port->tty is the active tty while tty->port is the port the tty may or may not still be attached to. So rework the entire API to pass the tty struct. For console cases we need to pass both for now. This shows up multiple drivers that immediately crash with USB console some of which have been fixed in the process. Longer term we need a proper tty as console abstraction Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:22 -07:00
Juergen Beisert	0c36ec3147	gpio: gpio driver for max7301 SPI GPIO expander Maxim's MAX7301 is an SPI GPIO expander with 28 GPIOs. Note: MAX7301's interrupt feature is not supported yet. [akpm@linux-foundation.org: coding-style fixes] [g.liakhovetski@pengutronix.de: Fix inaccuracies in comments, check spi_setup() return code, mask off high byte in max7301_read()] Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 09:59:40 -07:00
John Reiser	6519108746	execve filename: document and export via auxiliary vector The Linux kernel puts the filename argument of execve() into the new address space. Many developers are surprised to learn this. Those who know and could use it, object "But it's not documented." Those who want to use it dislike the expression (char *)(1+ strlen(env[-1+ n_env]) + env[-1+ n_env]) because it requires locating the last original environment variable, and assumes that the filename follows the characters. This patch documents the insertion of the filename, and makes it easier to find by adding a new tag AT_EXECFN in the ElfXX_auxv_t; see <elf.h>. In many cases readlink("/proc/self/exe",) gives the same answer. But if all the original pages get unmapped, then the kernel erases the symlink for /proc/self/exe. This can happen when a program decompressor does a good job of cleaning up after uncompressing directly to memory, so that the address space of the target program looks the same as if compression had never happened. One example is http://upx.sourceforge.net . One notable use of the underlying concept (what path containED the executable) is glibc expanding $ORIGIN in DT_RUNPATH. In practice for the near term, it may be a good idea for user-mode code to use both /proc/self/exe and AT_EXECFN as fall-back methods for each other. /proc/self/exe can fail due to unmapping, AT_EXECFN can fail because it won't be present on non-new systems. The auxvec or {AT_EXECFN}.d_val also can get overwritten, although in nearly all cases this would be the result of a bug. The runtime cost is one NEW_AUX_ENT using two words of stack space. The underlying value is maintained already as bprm->exec; setup_arg_pages() in fs/exec.c slides it for stack_shift, etc. 
Signed-off-by: John Reiser <jreiser@BitWagon.com> Cc: Roland McGrath <roland@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 09:59:40 -07:00
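The fall-back approach suggested in the commit message above can be sketched in user space roughly as follows. This is only an illustration, not code from the patch: the helper name exec_path() is hypothetical, and it assumes a glibc new enough to provide getauxval() in <sys/auxv.h>. Note that AT_EXECFN holds the filename as it was passed to execve(), so it may be a relative path.

```c
#include <elf.h>        /* AT_EXECFN */
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval(); assumed to be available */
#include <unistd.h>     /* readlink() */

/*
 * Hypothetical helper: copy the path of the running executable into buf.
 * Tries the AT_EXECFN auxiliary vector entry first and falls back to
 * /proc/self/exe. Returns 0 on success, -1 on failure.
 */
int exec_path(char *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return -1;

    /* getauxval() returns 0 when the entry is absent (older kernels). */
    const char *fn = (const char *)getauxval(AT_EXECFN);
    if (fn != NULL && fn[0] != '\0' && strlen(fn) < len) {
        memcpy(buf, fn, strlen(fn) + 1);
        return 0;
    }

    /* Fall back to the symlink; readlink() does not NUL-terminate. */
    ssize_t n = readlink("/proc/self/exe", buf, len - 1);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

A caller would pass a buffer of PATH_MAX bytes or so and treat a -1 return as "path unknown".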
Johannes Berg	a1ef5adb4c	remove CONFIG_KMOD from core kernel code Always compile request_module when the kernel allows modules. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:31 +10:00
Johannes Berg	df648c9fbe	rework try_then_request_module to do less in non-modular kernels This reworks try_then_request_module to only invoke the "lookup" function "x" once when the kernel is not modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:29 +10:00
Denys Vlasenko	2f0f2a334b	module: turn longs into ints for module sizes This shrinks module.o and each *.ko file. And finally, structure members which hold length of module code (four such members there) and count of symbols are converted from longs to ints. We cannot possibly have a module where 32 bits won't be enough to hold such counts. For one, module loading checks module size for sanity before loading, so such insanely big module will fail that test first. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:27 +10:00
Denys Vlasenko	f7f5b67557	Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs module.c and module.h contain code for finding exported symbols which are declared with EXPORT_UNUSED_SYMBOL, and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway (because EXPORT_UNUSED_SYMBOL(x) is compiled out to nothing then). This patch adds the required #ifdefs. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:27 +10:00
Richard Kennedy	af5406895a	module: reorder struct module to save space on 64 bit builds reorder struct module to save space on 64 bit builds. saves 1 cacheline_size (128 on default x86_64 & 64 on AMD Opteron/athlon) when CONFIG_MODULE_UNLOAD=y. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:26 +10:00
Benjamin Herrenschmidt	8725f25acc	Merge commit 'origin/master' Manually fixed up: drivers/net/fs_enet/fs_enet-main.c	2008-07-22 17:12:37 +10:00
Ingo Molnar	76c3bb15d6	Merge branch 'linus' into x86/x2apic	2008-07-22 09:06:21 +02:00
Greg Kroah-Hartman	eadcf0d704	MTD: handle pci_name() being const This changes the MTD core to handle pci_name() now returning a constant string. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:03 -07:00
Andi Kleen	9800794ac1	sysdev: Add utility functions for simple int/ulong variable sysdev attributes This adds a new sysdev_ext_attribute that stores a pointer to the variable it manages and some utility functions/macro to easily use them. Previously all users wrote custom macros to generate show/store functions for each variable, with this it is possible to avoid that in many cases. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Andi Kleen	4a0b2b4dbe	sysdev: Pass the attribute to the low level sysdev show/store function This allows attributes to be generated dynamically and show/store functions to be shared between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Cornelia Huck	36ce6dad6e	driver core: Suppress sysfs warnings for device_rename(). driver core: Suppress sysfs warnings for device_rename(). Renaming network devices to an already existing name is not something we want sysfs to print a scary warning for, since the callers can deal with this correctly. So let's introduce sysfs_create_link_nowarn() which gets rid of the common warning. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:01 -07:00
Haavard Skinnemoen	9505e63756	debugfs: Implement debugfs_remove_recursive() debugfs_remove_recursive() will remove a dentry and all its children. Drivers can use this to zap their whole debugfs tree so that they don't need to keep track of every single debugfs dentry they created. It may fail to remove the whole tree in certain cases: sh-3.2# rmmod atmel-mci < /sys/kernel/debug/mmc0/ios/clock mmc0: card b368 removed atmel_mci atmel_mci.0: Lost dma0chan1, falling back to PIO sh-3.2# ls /sys/kernel/debug/mmc0/ ios But I'm not sure if that case can be handled in any sane manner. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Cc: Pierre Ossman <drzeus-list@drzeus.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:59 -07:00
Richard Kennedy	a231934bdf	kobject: reorder kobject to save space on 64 bit builds reorder kobject to save space on 64 bit builds. shrinks from 72 to 64 bytes & moves allocated kobject to a smaller slab. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:56 -07:00
Uwe Kleine-König	6d8333c24d	UIO: minor style and comment fixes Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Signed-off-by: Hans J. Koch <hjk@linutronix.de>	2008-07-21 21:54:55 -07:00
Hans J. Koch	328a14e70e	UIO: Add write function to allow irq masking Sometimes it is necessary to enable/disable the interrupt of a UIO device from the userspace part of the driver. With this patch, the UIO kernel driver can implement an "irqcontrol()" function that does this. Userspace can write an s32 value to /dev/uioX (usually 0 or 1 to turn the irq off or on). The UIO core will then call the driver's irqcontrol function. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Acked-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:55 -07:00
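The userspace side of the irq masking described above might look roughly like this. It is a minimal sketch, not taken from the patch: the helper name uio_irq_control() is hypothetical, and it assumes fd is an open descriptor for a /dev/uioX node whose kernel driver implements irqcontrol().

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a signed 32-bit value to an open UIO
 * device descriptor. Per the commit message, the value is usually
 * 0 (irq off) or 1 (irq on). Returns 0 if all four bytes were
 * written, -1 otherwise.
 */
int uio_irq_control(int fd, int32_t irq_on)
{
    ssize_t n = write(fd, &irq_on, sizeof irq_on);
    return n == (ssize_t)sizeof irq_on ? 0 : -1;
}
```

If the kernel driver does not provide an irqcontrol() function, the write is expected to fail, so callers should check the return value.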
Greg Kroah-Hartman	b98cb4b7fe	driver core: remove DEVICE_ID_SIZE define There is no such thing as a "device id size" in the driver core, so remove the define and fix up any users of this odd define in the rest of the kernel. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:53 -07:00
Kay Sievers	ca52a49846	driver core: remove DEVICE_NAME_SIZE define There is no such thing as a "device name size" in the driver core, so remove the define and fix up any users of this odd define in the rest of the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:53 -07:00
Kay Sievers	aab0de2451	driver core: remove KOBJ_NAME_LEN define Kobjects have not had a limit on name size for a while, so stop pretending that they do. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:52 -07:00
Matthew Wilcox	d2a3b9146e	class: add lockdep infrastructure This adds the infrastructure to properly handle lockdep issues when the internal class semaphore is changed to a mutex. Matthew wrote the original patch, and Greg fixed it up to work properly with the class_create() function. From: Matthew Wilcox <matthew@wil.cx> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:52 -07:00
Greg Kroah-Hartman	7c71448b8a	class: move driver core specific parts to a private structure This moves the portions of struct class that are dynamic (kobject and lock and lists) out of the main structure and into a dynamic, private, structure. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:51 -07:00
Greg Kroah-Hartman	695794ae0c	Driver Core: add ability for class_find_device to start in middle of list This mirrors the functionality that driver_find_device has as well. We add a start variable, and all callers of the function are fixed up at the same time. The block layer will be using this new functionality in a follow-on patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	93562b5376	Driver Core: add ability for class_for_each_device to start in middle of list This mirrors the functionality that driver_for_each_device has as well. We add a start variable, and all callers of the function are fixed up at the same time. The block layer will be using this new functionality in a follow-on patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	4e10673944	device create: convert device_create_drvdata to device_create Now that device_create() has been audited, rename things back to the original call to be sane. Keep the device_create_drvdata macro around to make merges easier. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	ccea44fadc	driver core: remove device_create() There are no more users of this, and it is racy. Use device_create_drvdata() or device_create_vargs() instead. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Dan Williams	e105b8bfc7	sysfs: add /sys/dev/{char,block} to lookup sysfs path by major:minor Why?: There are occasions where userspace would like to access sysfs attributes for a device but it may not know how sysfs has named the device or the path. For example what is the sysfs path for /dev/disk/by-id/ata-ST3160827AS_5MT004CK? With this change a call to stat(2) returns the major:minor then userspace can see that /sys/dev/block/8:32 links to /sys/block/sdc. What are the alternatives?: 1/ Add an ioctl to return the path: Doable, but sysfs is meant to reduce the need to proliferate ioctl interfaces into the kernel, so this seems counter productive. 2/ Use udev to create these symlinks: Also doable, but it adds a udev dependency to utilities that might be running in a limited environment like an initramfs. 3/ Do a full-tree search of sysfs. [kay.sievers@vrfy.org: fix duplicate registrations] [kay.sievers@vrfy.org: cleanup suggestions] Cc: Neil Brown <neilb@suse.de> Cc: Tejun Heo <htejun@gmail.com> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Reviewed-by: SL Baur <steve@xemacs.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Mark Lord <lkml@rtr.ca> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:40 -07:00
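The lookup described above (stat(2) for the major:minor, then the matching /sys/dev entry) can be sketched in user space as follows. The helper names sysfs_dev_path() and sysfs_path_for_node() are hypothetical and serve only as illustration; the path layout /sys/dev/{char,block}/MAJOR:MINOR is the one given in the commit message.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>  /* major(), minor() */
#include <sys/types.h>

/*
 * Hypothetical helper: format the /sys/dev path for a device number.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int sysfs_dev_path(dev_t dev, int is_block, char *out, size_t len)
{
    int n = snprintf(out, len, "/sys/dev/%s/%u:%u",
                     is_block ? "block" : "char", major(dev), minor(dev));
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Hypothetical helper: stat() a device node and build the /sys/dev
 * path from its st_rdev. Returns -1 if the node cannot be examined
 * or is not a block or character device.
 */
int sysfs_path_for_node(const char *node, char *out, size_t len)
{
    struct stat st;

    if (stat(node, &st) != 0)
        return -1;
    if (S_ISBLK(st.st_mode))
        return sysfs_dev_path(st.st_rdev, 1, out, len);
    if (S_ISCHR(st.st_mode))
        return sysfs_dev_path(st.st_rdev, 0, out, len);
    return -1;
}
```

The resulting path is a symlink, so a caller can pass it to readlink(2) or realpath(3) to reach the device's real sysfs directory, as in the 8:32 to /sys/block/sdc example from the commit message.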
Mark Nelson	1ed6af7344	powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code Introduce a new DMA attribute, DMA_ATTR_WEAK_ORDERING, to use weak ordering on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU implementation to use this attribute. Dynamic mappings can be weakly or strongly ordered on an individual basis but the fixed mapping has to be either completely strong or completely weak. This is currently decided by a kernel boot option (pass iommu_fixed=weak for a weakly ordered fixed linear mapping, strongly ordered is the default). Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:36 +10:00
Wolfgang Grandegger	18ad7a61e1	of_gpio: Should use new <linux/gpio.h> header Since commit `7560fa60fc` (gpio: <linux/gpio.h> and "no GPIO support here" stubs) drivers can use GPIOs if they're available, but don't require them. This patch actually enables this feature. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:30 +10:00
Linus Torvalds	93ded9b8fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits) usb-storage: revert DMA-alignment change for Wireless USB USB: use reset_resume when normal resume fails usb_gadget: composite cdc gadget fault handling usb gadget: minor USBCV fix for composite framework USB: Fix bug with byte order in isp116x-hcd.c fio write/read USB: fix double kfree in ipaq in error case USB: fix build error in cdc-acm for CONFIG_PM=n USB: remove board-specific UP2OCR configuration from pxa27x-udc USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx USB: Fix pointer/int cast in USB devio code usb gadget: g_cdc dependso on NET USB: Au1xxx-usb: suspend/resume support. USB: Au1xxx-usb: clean up ohci/ehci bus glue sources. usbfs: don't store bad pointers in registration usbfs: fix race between open and unregister usbfs: simplify the lookup-by-minor routines usbfs: send disconnect signals when device is unregistered USB: Force unbinding of drivers lacking reset_resume or other methods USB: ohci-pnx4008: I2C cleanups and fixes USB: debug port converter does not accept more than 8 byte packets ...	2008-07-21 15:42:53 -07:00
Alan Stern	78d9a487ee	USB: Force unbinding of drivers lacking reset_resume or other methods This patch (as1024) takes care of a FIXME issue: Drivers that don't have the necessary suspend, resume, reset_resume, pre_reset, or post_reset methods will be unbound and their interface reprobed when one of the unsupported events occurs. This is made slightly more difficult by the fact that bind operations won't work during a system sleep transition. So instead the code has to defer the operation until the transition ends. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:40 -07:00
Ming Lei	742120c631	USB: fix usb_reset_device and usb_reset_composite_device(take 3) This patch renames the existing usb_reset_device in hub.c to usb_reset_and_verify_device and renames the existing usb_reset_composite_device to usb_reset_device. Also the new usb_reset_and_verify_device doesn't need to be EXPORTED. The idea of the patch is that an external interface driver should warn the other interfaces' drivers of the same device before and after resetting the usb device. An interface driver should call the _old_ usb_reset_composite_device instead of the _old_ usb_reset_device since it can't assume the device contains only one interface. The _old_ usb_reset_composite_device is safe for a single interface device also. We rename the two functions to make the change easy. This patch is under guideline from Alan Stern. Signed-off-by: Ming Lei <tom.leiming@gmail.com>	2008-07-21 15:16:33 -07:00
Ming Lei	625f694936	USB: remove interface parameter of usb_reset_composite_device From the current implementation of usb_reset_composite_device function, the iface parameter is no longer useful. This function doesn't do something special for the iface usb_interface,compared with other interfaces in the usb_device. So remove the parameter and fix the related caller. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:32 -07:00
Alan Stern	f579c2b46f	USB Gadget: documentation update This patch (as1102) clarifies two points in the USB Gadget kerneldoc: Request completion callbacks are always made with interrupts disabled; Device controllers may not support STALLing the status stage of a control transfer after the data stage is over. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:27 -07:00
Felipe Balbi	e0d795e4f3	usb: irda: cleanup on ir-usb module General cleanup on ir-usb module. Introduced a common header that could be used also on usb gadget framework. Lots of cleanups and now using macros from the header file. Signed-off-by: Felipe Balbi <me@felipebalbi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:27 -07:00
David Brownell	40982be52d	usb gadget: composite gadget core Add <linux/usb/composite.h> interfaces for composite gadget drivers, and basic implementation support behind it: - struct usb_function ... groups one or more interfaces into a function managed as one unit within a configuration, to which it's added by usb_add_function(). - struct usb_configuration ... groups one or more such functions into a configuration managed as one unit by a driver, to which it's added by usb_add_config(). These operate at either high or full/low speeds and at a given bMaxPower. - struct usb_composite_driver ... groups one or more such configurations into a gadget driver, which may be registered or unregistered. - struct usb_composite_dev ... a usb_composite_driver manages this; it wraps the usb_gadget exposed by the controller driver. This also includes some basic kerneldoc. How to use it (the short version): provide a usb_composite_driver with a bind() that calls usb_add_config() for each of the needed configurations. The configurations in turn have bind() calls, which will usb_add_function() for each function required. Each function's bind() allocates resources needed to perform its tasks, like endpoints; sometimes configurations will allocate resources too. Separate patches will convert most gadget drivers to this infrastructure. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:01 -07:00
David Brownell	a4c39c41bf	usb gadget: descriptor copying support Define three new descriptor manipulation utilities, for use when setting up functions that may have multiple instances: usb_copy_descriptors() to copy a vector of descriptors usb_free_descriptors() to free the copy usb_find_endpoint() to find a copied version These will be used as follows. Functions will continue to have static tables of descriptors they update, now used as __initdata templates. When a function creates a new instance, it patches those tables with relevant interface and string IDs, plus endpoint assignments. Then it copies those morphed descriptors, associates the copies with the new function instance, and records the endpoint descriptors to use when activating the endpoints. When initialization is done, only the copies remain in memory. The copies are freed on driver removal. This ensures that each instance has descriptors which hold the right instance-specific data. Two instances in the same configuration will obviously never share the same interface IDs or use the same endpoints. Instances in different configurations won't do so either, which means this is slightly less memory-efficient in some cases. This also includes a bugfix to the epautoconf code that shows up with this usage model. It must replace the previous endpoint number when updating the template descriptors, not just mask in a few more bits. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:00 -07:00
Adrian Bunk	ea05af61a8	USB: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:55 -07:00
Alan Stern	9da82bd464	USB: implement "soft" unbinding This patch (as1091) changes the way usbcore handles interface unbinding. If the interface's driver supports "soft" unbinding (a new flag in the driver structure) then in-flight URBs are not cancelled and endpoints are not disabled. Instead the driver is allowed to continue communicating with the device (although of course it should stop before its disconnect routine returns). The purpose of this change is to allow drivers to do a clean shutdown when they get unbound from a device that is still plugged in. Killing all the URBs and disabling the endpoints before calling the driver's disconnect method doesn't give the driver any control over what happens, and it can leave devices in indeterminate states. For example, when usb-storage unbinds it doesn't want to stop while in the middle of transmitting a SCSI command. The soft_unbind flag is added because in the past, a number of drivers have experienced problems related to ongoing I/O after their disconnect routine returned. Hence "soft" unbinding is made available only to drivers that claim to support it. The patch also replaces "interface_to_usbdev(intf)" with "udev" in a couple of places, a minor simplification. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:54 -07:00
Greg Kroah-Hartman	1b26da1510	USB: handle pci_name() being const This changes usb_create_hcd() to be able to handle the fact that pci_name() has changed to a constant string. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:46 -07:00
Linus Torvalds	6d52dcbe56	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpufreq: remove CVS keywords [CPUFREQ] change cpu freq arrays to per_cpu variables	2008-07-21 15:10:37 -07:00
David S. Miller	ebb36a9781	ipv6: __KERNEL__ ifdef struct ipv6_devconf Based upon a report by Olaf Hering. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 13:41:16 -07:00
Arjan van de Ven	6579e57b31	net: Print the module name as part of the watchdog message As suggested by Dave: This patch adds a function to get the driver name from a struct net_device, and consequently uses this in the watchdog timeout handler to print as part of the message. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 13:31:48 -07:00
Linus Torvalds	e89970aa93	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netfilter: nf_conntrack_sctp: fix sparse warnings netfilter: nf_nat_sip: c= is optional for session netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function netfilter: nfnetlink_log: send complete hardware header netfilter: xt_time: fix time's time_mt()'s use of do_div() netfilter: accounting rework: ct_extend + 64bit counters (v4) netlink: add NLA_PUT_BE64 macro netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM hdlcdrv: Fix CRC calculation. Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe." net: In __netif_schedule() use WARN_ON instead of BUG_ON net: Improve simple_tx_hash(). pkt_sched: Remove unused variable skb in dev_deactivate_queue function. sunhme: Remove stop/wake TX queue calls in set-multicast-list handler. ucc_geth: do not touch net queue in adjust_link phylib callback gianfar: do not touch net queue in adjust_link phylib callback atl1: Do not wake queue before queue has been started.	2008-07-21 11:29:52 -07:00
Linus Torvalds	72a73693aa	Merge branch 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits) x86: remove extra calling to get ext cpuid level x86: use setup_clear_cpu_cap() when disabling the lapic KVM: fix exception entry / build bug, on 64-bit x86: add unknown_nmi_panic kernel parameter x86, VisWS: turn into generic arch, eliminate leftover files x86: add ->pre_time_init to x86_quirks x86: extend and use x86_quirks to clean up NUMAQ code x86: introduce x86_quirks x86: improve debug printout: add target bootmem range in early_res_to_bootmem() Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM x86: remove arch_get_ram_range x86: Add a debugfs interface to dump PAT memtype x86: Add a arch directory for x86 under debugfs x86: i386: reduce boot fixmap space i386/xen: add proper unwind annotations to xen_sysenter_target x86: reduce force_mwait visibility x86: reduce forbid_dac's visibility x86: fix two modpost warnings x86: check function status in EDD boot code x86_64: ia32_signal.c: remove signal number conversion ...	2008-07-21 10:34:25 -07:00
Linus Torvalds	b7e6f62fe2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm crypt: add merge dm table: remove merge_bvec sector restriction dm: linear add merge dm: introduce merge_bvec_fn dm snapshot: use per device mempools dm snapshot: fix race during exception creation dm snapshot: track snapshot reads dm mpath: fix test for reinstate_path dm mpath: return parameter error dm io: remove struct padding dm log: make dm_dirty_log init and exit static dm mpath: free path selector on invalid args	2008-07-21 10:30:10 -07:00
Linus Torvalds	8a392625b6	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: (52 commits) md: Protect access to mddev->disks list using RCU md: only count actual openers as access which prevent a 'stop' md: linear: Make array_size sector-based and rename it to array_sectors. md: Make mddev->array_size sector-based. md: Make super_type->rdev_size_change() take sector-based sizes. md: Fix check for overlapping devices. md: Tidy up rdev_size_store a bit: md: Remove some unused macros. md: Turn rdev->sb_offset into a sector-based quantity. md: Make calc_dev_sboffset() return a sector count. md: Replace calc_dev_size() by calc_num_sectors(). md: Make update_size() take the number of sectors. md: Better control of when do_md_stop is allowed to stop the array. md: get_disk_info(): Don't convert between signed and unsigned and back. md: Simplify restart_array(). md: alloc_disk_sb(): Return proper error value. md: Simplify sb_equal(). md: Simplify uuid_equal(). md: sb_equal(): Fix misleading printk. md: Fix a typo in the comment to cmd_match(). ...	2008-07-21 10:29:12 -07:00
Eric Leblond	72961ecf84	netfilter: nfnetlink_log: send complete hardware header This patch adds some fields to NFLOG to be able to send the complete hardware header with all necessary information. It sends to userspace: * the type of hardware link * the length of hardware header * the hardware header Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 10:11:00 -07:00
Krzysztof Piotr Oledzki	584015727a	netfilter: accounting rework: ct_extend + 64bit counters (v4) Initially netfilter has had 64bit counters for conntrack-based accounting, but it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are still required, for example for "connbytes" extension. However, 64bit counters waste a lot of memory and it was not possible to enable/disable it at runtime. This patch: - reimplements accounting with respect to the extension infrastructure, - makes one global version of seq_print_acct() instead of two seq_print_counters(), - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n), - makes it possible to enable/disable it at runtime by sysctl or sysfs, - extends counters from 32bit to 64bit, - renames ip_conntrack_counter -> nf_conn_counter, - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT), - set initial accounting enable state based on CONFIG_NF_CT_ACCT - removes buggy IPCT_COUNTER_FILLING event handling. If accounting is enabled newly created connections get additional acct extend. Old connections are not changed as it is not possible to add a ct_extend area to confirmed conntrack. Accounting is performed for all connections with acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct". Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 10:10:58 -07:00
Ingo Molnar	eb6a12c242	Merge branch 'linus' into cpus4096-for-linus Conflicts: net/sunrpc/svc.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 17:19:50 +02:00
Ingo Molnar	acee709cab	Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus	2008-07-21 16:37:17 +02:00
Milan Broz	f6fccb1213	dm: introduce merge_bvec_fn Introduce a bvec merge function for device mapper devices for dynamic size restrictions. This code ensures the requested biovec lies within a single target and then calls a target-specific function to check against any constraints imposed by underlying devices. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-07-21 12:00:37 +01:00
NeilBrown	4b80991c6c	md: Protect access to mddev->disks list using RCU All modifications and most access to the mddev->disks list are made under the reconfig_mutex lock. However there are three places where the list is walked without any locking. If a reconfig happens at this time, havoc (and oops) can ensue. So use RCU to protect these accesses: - wrap them in rcu_read_{,un}lock() - use list_for_each_entry_rcu - add to the list with list_add_rcu - delete from the list with list_del_rcu - delay the 'free' with call_rcu rather than schedule_work Note that export_rdev did a list_del_init on this list. In almost all cases the entry was not in the list anymore so it was a no-op and so safe. It is no longer safe as after list_del_rcu we may not touch the list_head. An audit shows that export_rdev is called: - after unbind_rdev_from_array, in which case the delete has already been done, - after bind_rdev_to_array fails, in which case the delete isn't needed. - before the device has been put on a list at all (e.g. in add_new_disk where reading the superblock fails). - and in autorun devices after a failure when the device is on a different list. So remove the list_del_init call from export_rdev, and add it back immediately before the call to export_rdev for that last case. Note also that ->same_set is sometimes used for lists other than mddev->list (e.g. candidates). In these cases rcu is not needed. Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
NeilBrown	f2ea68cf42	md: only count actual openers as access which prevent a 'stop' Open isn't the only thing that increments ->active. e.g. reading /proc/mdstat will increment it briefly. So to avoid false positives in testing for concurrent access, introduce a new counter that counts just the number of times the md device is open. Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
Andre Noll	d6e2215052	md: linear: Make array_size sector-based and rename it to array_sectors. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
Andre Noll	f233ea5c9e	md: Make mddev->array_size sector-based. This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:22 +10:00
Dmitry Torokhov	908cf4b925	Merge master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 into next	2008-07-21 00:55:14 -04:00
Linus Torvalds	14b395e35d	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...	2008-07-20 21:21:46 -07:00
Linus Torvalds	ae0645a451	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: let asic3 use mem resource instead of bus_shift mfd: remove DS1WM register definitions from asic3.h mfd: add ASIC3_CONFIG_GPIO templates mfd: fix the asic3 irq demux code mfd: asic3 should depend on gpiolib mfd: fix asic3 config array initialisation mfd: move asic3 probe functions into __init section mfd: Use uppercase only for asic3 macros and defines mfd: use dev_* macros for asic3 debugging mfd: New asic3 gpio configuration code mfd: asic3 children platform data removal mfd: asic3 gpiolib support	2008-07-20 21:16:27 -07:00
Linus Torvalds	f894d18380	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (277 commits) V4L/DVB (8415): gspca: Infinite loop in i2c_w() of etoms. V4L/DVB (8414): videodev/cx18: fix get_index bug and error-handling lock-ups V4L/DVB (8411): videobuf-dma-contig.c: fix 64-bit build for pre-2.6.24 kernels V4L/DVB (8410): sh_mobile_ceu_camera: fix 64-bit compiler warnings V4L/DVB (8397): video: convert select VIDEO_ZORAN_ZR36060 into depends on V4L/DVB (8396): video: Fix Kbuild dependency for VIDEO_IR_I2C V4L/DVB (8395): saa7134: Fix Kbuild dependency of ir-kbd-i2c V4L/DVB (8394): ir-common: CodingStyle fix: move EXPORT_SYMBOL_GPL to their proper places V4L/DVB (8393): media/video: Fix depencencies for VIDEOBUF V4L/DVB (8392): media/Kconfig: Convert V4L1_COMPAT select into "depends on" V4L/DVB (8390): videodev: add comment and remove magic number. V4L/DVB (8389): videodev: simplify get_index() V4L/DVB (8387): Some cosmetic changes V4L/DVB (8381): ov7670: fix compile warnings V4L/DVB (8380): saa7115: use saa7115_auto instead of saa711x as the autodetect driver name. V4L/DVB (8379): saa7127: Make device detection optional V4L/DVB (8378): cx18: move cx18_av_vbi_setup to av-core.c and rename to cx18_av_std_setup V4L/DVB (8377): ivtv/cx18: ensure the default control values are correct V4L/DVB (8376): cx25840: move cx25840_vbi_setup to core.c and rename to cx25840_std_setup V4L/DVB (8374): gspca: No conflict of 0c45:6011 with the sn9c102 driver. ...	2008-07-20 21:14:42 -07:00
Linus Torvalds	f076ab8d04	Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits) KVM: Adjust smp_call_function_mask() callers to new requirements KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts KVM: x86 emulator: emulate clflush KVM: MMU: improve invalid shadow root page handling KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues KVM: check injected pic irq within valid pic irqs KVM: x86 emulator: Fix HLT instruction KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized KVM: VMX: Add ept_sync_context in flush_tlb KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held x86: KVM guest: make kvm_smp_prepare_boot_cpu() static KVM: SVM: fix suspend/resume support KVM: s390: rename private structures KVM: s390: Set guest storage limit and offset to sane values KVM: Fix memory leak on guest exit KVM: s390: dont allocate dirty bitmap KVM: move slots_lock acquision down to vapic_exit KVM: VMX: Fake emulate Intel perfctr MSRs KVM: VMX: Fix a wrong usage of vmcs_config ...	2008-07-20 21:13:26 -07:00
Linus Torvalds	db6d8c7a40	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits) iucv: Fix bad merging. net_sched: Add size table for qdiscs net_sched: Add accessor function for packet length for qdiscs net_sched: Add qdisc_enqueue wrapper highmem: Export totalhigh_pages. ipv6 mcast: Omit redundant address family checks in ip6_mc_source(). net: Use standard structures for generic socket address structures. ipv6 netns: Make several "global" sysctl variables namespace aware. netns: Use net_eq() to compare net-namespaces for optimization. ipv6: remove unused macros from net/ipv6.h ipv6: remove unused parameter from ip6_ra_control tcp: fix kernel panic with listening_get_next tcp: Remove redundant checks when setting eff_sacks tcp: options clean up tcp: Fix MD5 signatures for non-linear skbs sctp: Update sctp global memory limit allocations. sctp: remove unnecessary byteshifting, calculate directly in big-endian sctp: Allow only 1 listening socket with SO_REUSEADDR sctp: Do not leak memory on multiple listen() calls sctp: Support ipv6only AF_INET6 sockets. ...	2008-07-20 17:43:29 -07:00
Linus Torvalds	f7df406dce	Merge branch 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6 * 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6: configfs: Allow ->make_item() and ->make_group() to return detailed errors. Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."	2008-07-20 17:17:52 -07:00
Alan Cox	44b7d1b37f	tty: add more tty_port fields Move more bits into the tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:38 -07:00
Alan Cox	77451e53e0	cyclades: use tty_port Switch cyclades to use the new tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:38 -07:00
Alan Cox	f8ae476416	stallion: use tty_port Switch the stallion driver to use the tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:37 -07:00
Alan Cox	b02f5ad6a3	istallion: use tty_port Switch istallion to use the new tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:37 -07:00
Alan Cox	b5391e29f4	gs: use tty_port Switch drivers using the old "generic serial" driver to use the tty_port structures Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	4982d6b37a	esp: use tty_port Switch esp to use the new tty_port structures Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	7a4d29f426	tty.h: clean up Coding style clean up and white space tidy Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	df4f4dd429	serial: use tty_port Switch the serial_core based drivers to use the new tty_port structure. We can't quite use all of it yet because of the dynamically allocated extras in the serial_core layer. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:35 -07:00
Alan Cox	6f67048cd0	tty: Introduce a tty_port common structure Every tty driver has its own concept of a port structure and because they all differ we cannot extract commonality. Begin fixing this by creating a structure drivers can elect to use so that over time we can push fields into this and create commonality and then introduce common methods. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:35 -07:00
Alan Cox	a352def21a	tty: Ldisc revamp Move the line disciplines towards a conventional ->ops arrangement. For the moment the actual 'tty_ldisc' struct in the tty is kept as part of the tty struct but this can then be changed if it turns out that when it all settles down we want to refcount ldiscs separately to the tty. Pull the ldisc code out of /proc and put it with our ldisc code. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:34 -07:00
Haavard Skinnemoen	6bb0e3a59a	Subject: [PATCH 1/2] serial: Add flush_buffer() operation to uart_ops Serial drivers using DMA (like the atmel_serial driver) tend to get very confused when the xmit buffer is flushed and nobody told them. They also tend to spew a lot of garbage since the DMA engine keeps running after the buffer is flushed and possibly refilled with unrelated data. This patch adds a new flush_buffer operation to the uart_ops struct, along with a call to it from uart_flush_buffer() right after the xmit buffer has been cleared. The driver can implement this in order to synchronize its internal DMA state with the xmit buffer when the buffer is flushed. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:34 -07:00
Philipp Zabel	99cdb0c8c5	mfd: let asic3 use mem resource instead of bus_shift The bus_shift parameter in platform_data is not needed as we can tell the driver with the IOMEM_RESOURCE whether the ASIC is located on a 16bit or 32bit memory bus. The htc-egpio driver uses a more descriptive bus_width parameter, but for drivers where the register map size is fixed, we don't even need this. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:44 +02:00
Philipp Zabel	279cac484e	mfd: remove DS1WM register definitions from asic3.h There is a dedicated ds1wm driver, no need to duplicate this information here. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:24 +02:00
Philipp Zabel	4a67b528e0	mfd: add ASIC3_CONFIG_GPIO templates As ASIC3 GPIO alternate function configuration is expected to be similar for several devices, it is convenient to define descriptive macros. This patch is inspired by the PXA MFP configuration, the alternate functions were observed on hx4700 and blueangel. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:19 +02:00
Samuel Ortiz	3b8139f8b1	mfd: Use uppercase only for asic3 macros and defines Let's be consistent and use uppercase only, for both macro and defines. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:55:14 +02:00
Samuel Ortiz	3b26bf1722	mfd: New asic3 gpio configuration code The ASIC3 GPIO configuration code is a bit obscure and hardly readable. This patch changes it so that it is now more readable and understandable, by being more explicit. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:55:00 +02:00
Samuel Ortiz	1effe5bc6c	mfd: asic3 children platform data removal Platform devices should be dynamically allocated, and each supported device should have its own platform data. For now we just remove this buggy code. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:54:51 +02:00
Samuel Ortiz	6f2384c4bd	mfd: asic3 gpiolib support ASIC3 is, among other things, a GPIO extender. We should thus have it supporting the current gpiolib API. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:52:38 +02:00
Jean Delvare	5a367dfb73	V4L/DVB (8246): tvaudio: Stop I2C driver ID abuse The tvaudio driver is using "official" I2C device IDs for internal purpose. There must be some historical reason behind this but anyway, it shouldn't do that. As the stored values are never used, the easiest way to fix the problem is simply to remove them altogether. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:18:38 -03:00
Jean Delvare	be99af6679	V4L/DVB (8245): ovcamchip: Delete stray I2C bus ID I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses this driver ID, so we can remove the reference. As far as I can see, the Cypress FX2 webcam is handled by a different driver (dvb-usb). Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:18:34 -03:00
Hans de Goede	ab8f12cf8e	V4L/DVB (8197): gspca: pac207 frames no more decoded in the subdriver. videodev2: New pixfmt pac207: Remove the specific decoding. main: get_buff_size operation added for the subdriver. Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl> Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:17:02 -03:00
Hans de Goede	54ab92ca05	V4L/DVB (8194): gspca: Fix the format of the low resolution mode of spca561. The low (half) res modes of the spca561 are not spca561 compressed, but are raw bayer, this patch fixes this and adds a PIX_FMT define for the GBRG bayer format used by the spca561 in low res mode. Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl> Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:16:47 -03:00
Jean-Francois Moine	6a7eba24e4	V4L/DVB (8157): gspca: all subdrivers - remaining subdrivers added - remove the decoding helper and some specific frame decodings Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:14:49 -03:00
Tobias Lorenz	1d0ba5f378	V4L/DVB (7942): Hardware frequency seek ioctl interface Signed-off-by: Tobias Lorenz <tobias.lorenz@gmx.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:07:12 -03:00
Marcelo Tosatti	34d4cb8fca	KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction Flush the shadow mmu before removing regions to avoid stale entries. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Tan, Li	9ef621d3be	KVM: Support mixed endian machines Currently kvmtrace is not portable. This prevents copying a trace file from a big-endian target to a little-endian workstation for analysis. In the patch, the kernel outputs metadata containing a magic number to the trace log, and changes 64-bit words to be u64 instead of a pair of u32s. Signed-off-by: Tan Li <li.tan@intel.com> Acked-by: Jerone Young <jyoung5@us.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:32 +03:00
Laurent Vivier	5f94c1741b	KVM: Add coalesced MMIO support (common part) This patch adds all needed structures to coalesce MMIOs. Until an architecture uses it, it is not compiled. Coalesced MMIO introduces two ioctl() to define where the MMIO zones that can be coalesced are: - KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone. It requests one parameter (struct kvm_coalesced_mmio_zone) which defines a memory area where MMIOs can be coalesced until the next switch to user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX. - KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone). The userspace client can check kernel coalesced MMIO availability by asking ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability. The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported, or the page offset where the ring buffer will be stored. The page offset depends on the architecture. After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is an offset to the coalesced MMIO ring expressed in PAGE_SIZE relative to the address of the start of the kvm_run structure. The MMIO ring buffer is defined by the structure kvm_coalesced_mmio_ring. [akio: fix oops during guest shutdown] Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:31 +03:00
Laurent Vivier	92760499d0	KVM: kvm_io_device: extend in_range() to manage len and write attribute Modify member in_range() of structure kvm_io_device to pass length and the type of the I/O (write or read). This modification allows kvm_io_device to be used with coalesced MMIO. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:30 +03:00
Avi Kivity	7cc8883074	KVM: Remove decache_vcpus_on_cpu() and related callbacks Obsoleted by the vmx-specific per-cpu list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Mike Travis	80422d3431	cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP * Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify) * Fix a semantic error in CPUMASK_ALLOC * Add a bit of commentary to cpumask.h Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:12 +02:00
Jussi Kivilinna	175f9c1bba	net_sched: Add size table for qdiscs Add size table functions for qdiscs and calculate packet size in qdisc_enqueue(). Based on patch by Patrick McHardy http://marc.info/?l=linux-netdev&m=115201979221729&w=2 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-20 00:08:47 -07:00
YOSHIFUJI Hideaki	230b183921	net: Use standard structures for generic socket address structures. Use sockaddr_storage{} for generic socket address storage and ensures proper alignment. Use sockaddr{} for pointers to omit several casts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-19 22:35:47 -07:00
Adam Langley	4389dded77	tcp: Remove redundant checks when setting eff_sacks Remove redundant checks when setting eff_sacks and make the number of SACKs a compile time constant. Now that the options code knows how many SACK blocks can fit in the header, we don't need to have the SACK code guessing at it. Signed-off-by: Adam Langley <agl@imperialviolet.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-19 00:07:02 -07:00
David S. Miller	3072367300	pkt_sched: Manage qdisc list inside of root qdisc. Idea is from Patrick McHardy. Instead of managing the list of qdiscs on the device level, manage it in the root qdisc of a netdev_queue. This solves all kinds of visibility issues during qdisc destruction. The way to iterate over all qdiscs of a netdev_queue is to visit the netdev_queue->qdisc, and then traverse its list. The only special case is to ignore builtin qdiscs at the root when dumping or doing a qdisc_lookup(). That was not needed previously because builtin qdiscs were not added to the device's qdisc_list. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 22:50:15 -07:00
Matthew Garrett	92c4989092	Input: add switch for dock events Add a SW_DOCK switch to input.h. ACPI docks currently send their docking status as a uevent, but not all docks are ACPI or correspond to a device. In that case, it makes more sense to simply generate an input event on docking or undocking. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-19 00:52:43 -04:00
Mark Brown	5ec461d083	Input: add microphone insert switch definition Add a new switch type to the input API for reporting microphone insertion. This will be used by the ALSA jack reporting API. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-19 00:52:36 -04:00
Patrick McHardy	8913336a7e	packet: add PACKET_RESERVE sockopt Add new sockopt to reserve some headroom in the mmaped ring frames in front of the packet payload. This can be used, for instance, when the VLAN header needs to be (re)constructed, to avoid moving the entire payload. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 18:05:19 -07:00
venkatesh.pallipadi@intel.com	ae79cdaacb	x86: Add a arch directory for x86 under debugfs Add a directory for x86 arch under debugfs. Can be used to accumulate all x86 specific debugfs files. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 17:22:04 -07:00
Ingo Molnar	a208f37a46	Merge branch 'linus' into x86/x2apic	2008-07-18 22:50:34 +02:00
Mike Travis	77586c2bda	cpumask: Provide a generic set of CPUMASK_ALLOC macros * Provide a generic set of CPUMASK_ALLOC macros patterned after the SCHED_CPUMASK_ALLOC macros. This is used where multiple cpumask_t variables are declared on the stack to reduce the amount of stack space required. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:03:00 +02:00
Mike Travis	65c0118453	cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr * This patch replaces the dangerous lvalue version of cpumask_of_cpu with new cpumask_of_cpu_ptr macros. These are patterned after the node_to_cpumask_ptr macros. In general terms, if there is a cpumask_of_cpu_map[] then a pointer to the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map is provided when there is a large NR_CPUS count, reducing greatly the amount of code generated and stack space used for cpumask_of_cpu(). The pointer to the cpumask_t value is needed for calling set_cpus_allowed_ptr() to reduce the amount of stack space needed to pass the cpumask_t value. If there isn't a cpumask_of_cpu_map[], then a temporary variable is declared and filled in with value from cpumask_of_cpu(cpu) as well as a pointer variable pointing to this temporary variable. Afterwards, the pointer is used to reference the cpumask value. The compiler will optimize out the extra dereference through the pointer as well as the stack space used for the pointer, resulting in identical code. A good example of the orthogonal usages is in net/sunrpc/svc.c: case SVC_POOL_PERCPU: { unsigned int cpu = m->pool_to[pidx]; cpumask_of_cpu_ptr(cpumask, cpu); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, cpumask); return 1; } case SVC_POOL_PERNODE: { unsigned int node = m->pool_to[pidx]; node_to_cpumask_ptr(nodecpumask, node); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, nodecpumask); return 1; } Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:02:57 +02:00
Ingo Molnar	bb2c018b09	Merge branch 'linus' into cpus4096 Conflicts: drivers/acpi/processor_throttling.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:00:54 +02:00
Ingo Molnar	9b610fda0d	Merge branch 'linus' into timers/nohz	2008-07-18 19:53:16 +02:00
Thomas Gleixner	b8f8c3cf0a	nohz: prevent tick stop outside of the idle loop Jack Ren and Eric Miao tracked down the following long standing problem in the NOHZ code: scheduler switch to idle task enable interrupts Window starts here ----> interrupt happens (does not set NEED_RESCHED) irq_exit() stops the tick ----> interrupt happens (does set NEED_RESCHED) return from schedule() cpu_idle(): preempt_disable(); Window ends here The interrupts can happen at any point inside the race window. The first interrupt stops the tick, the second one causes the scheduler to rerun and switch away from idle again and we end up with the tick disabled. The fact that it needs two interrupts where the first one does not set NEED_RESCHED and the second one does made the bug obscure and extremely hard to reproduce and analyse. Kudos to Jack and Eric. Solution: Limit the NOHZ functionality to the idle loop to make sure that we can not run into such a situation ever again. cpu_idle() { preempt_disable(); while(1) { tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we are in the idle loop while (!need_resched()) halt(); tick_nohz_restart_sched_tick(); <- disables NOHZ mode preempt_enable_no_resched(); schedule(); preempt_disable(); } } In hindsight we should have done this forever, but ... /me grabs a large brown paperbag. Debugged-by: Jack Ren <jack.ren@marvell.com>, Debugged-by: eric miao <eric.y.miao@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-18 18:10:28 +02:00
Lai Jiangshan	5127bed588	rcu classic: new algorithm for callbacks-processing(v2) This is v2, it's a little different from the v1 that I sent to lkml. use ACCESS_ONCE use rcu_batch_after/rcu_batch_before for batch # comparison. rcutorture test result: (hotplugs: do cpu-online/offline once per second) No CONFIG_NO_HZ: OK, 12hours No CONFIG_NO_HZ, hotplugs: OK, 12hours CONFIG_NO_HZ=y: OK, 24hours CONFIG_NO_HZ=y, hotplugs: Failed. (Failed also without my patch applied, exactly the same bug occurred, http://lkml.org/lkml/2008/7/3/24) v1's email thread: http://lkml.org/lkml/2008/6/2/539 v1's description: The code/algorithm of the current callbacks-processing implementation is very efficient and technical. But when I studied it I found a disadvantage: In multi-CPU systems, when a new RCU callback is being queued(call_rcu[_bh]), this callback will very likely be invoked after the grace period for the batch with batch number = rcp->cur+2 has completed in the current implementation. Actually, this callback can be invoked after the grace period for the batch with batch number = rcp->cur+1 has completed. The delay of invocation means that latency of synchronize_rcu() is extended. But the more important thing is that the callbacks usually free memory, and this work is delayed too! It's necessary for the reclaimer to free memory as soon as possible when little memory is left. A very simple way can solve this problem: a field(struct rcu_head::batch) is added to record the batch number for the RCU callback. And when a new RCU callback is being queued, we determine the batch number for this callback(head->batch = rcp->cur+1) and we move this callback to rdp->donelist if we find that head->batch <= rcp->completed when we process callbacks. This simple way reduces the wait time for invocation a lot. (about 2.5Grace Period -> 1.5Grace Period on average in multi-CPU systems) This is my algorithm. But I do not add any field for struct rcu_head in my implementation. 
We just need to memorize the last 2 batches and their batch number, because these 2 batches include all entries that for whom the grace period hasn't completed. So we use a special linked-list rather than add a field. Please see the comment of struct rcu_data. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 16:07:33 +02:00
Lai Jiangshan	3cac97cbb1	rcu classic: simplify the next pending batch Use a batch number (rcp->pending) instead of a flag (rcp->next_pending). rcu_start_batch() needs to change this flag, so mb()s are needed for memory-access safety. But after this patch is applied, rcu_start_batch() does not change this batch number (rcp->pending); rcp->pending is managed by __rcu_process_callbacks only, and the troublesome mb()s are eliminated. The code also looks simpler and clearer. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 16:07:32 +02:00
Ingo Molnar	1b427c153a	sched: fix build error, provide partition_sched_domains() unconditionally provide an empty partition_sched_domains() definition for the UP case: include/linux/cpuset.h: In function ‘rebuild_sched_domains’: include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains’ Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 14:02:46 +02:00
Max Krasnyansky	e761b77252	cpu hotplug, sched: Introduce cpu_active_map and redo sched domain management (take 2) This is based on Linus' idea of creating cpu_active_map that prevents scheduler load balancer from migrating tasks to the cpu that is going down. It allows us to simplify domain management code and avoid unnecessary domain rebuilds during cpu hotplug event handling. Please ignore the cpusets part for now. It needs some more work in order to avoid crazy lock nesting. Although I did simplify and unify domain reinitialization logic. We now simply call partition_sched_domains() in all the cases. This means that we're using exact same code paths as in cpusets case and hence the tests below cover cpusets too. Cpuset changes to make rebuild_sched_domains() callable from various contexts are in the separate patch (right next after this one). This not only boots but also easily handles while true; do make clean; make -j 8; done and while true; do on-off-cpu 1; done at the same time. (on-off-cpu 1 simply does echo 0/1 > /sys/.../cpu1/online thing). Surprisingly the box (dual-core Core2) is quite usable. In fact I'm typing this on it right now in gnome-terminal and things are moving just fine. Also this is running with most of the debug features enabled (lockdep, mutex, etc) no BUG_ONs or lockdep complaints so far. I believe I addressed all of Dmitry's comments for original Linus' version. I changed both fair and rt balancer to mask out non-active cpus. And replaced cpu_is_offline() with !cpu_active() in the main scheduler code where it made sense (to me). Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Gregory Haskins <ghaskins@novell.com> Cc: dmitry.adamushko@gmail.com Cc: pj@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 13:22:25 +02:00
Pavel Emelyanov	b6fcbdb4f2	proc: consolidate per-net single-release callers They are symmetrical to single_open ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 04:07:44 -07:00
Pavel Emelyanov	de05c557b2	proc: consolidate per-net single_open callers There are already 7 of them - time to kill some duplicate code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 04:07:21 -07:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
David S. Miller	8387400092	pkt_sched: Kill netdev_queue lock. We can simply use the qdisc->q.lock for all of the qdisc tree synchronization. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:30 -07:00
David S. Miller	ead81cc5fc	netdevice: Move qdisc_list back into net_device proper. And give it its own lock. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:26 -07:00
David S. Miller	37437bb2e1	pkt_sched: Schedule qdiscs instead of netdev_queue. When we have shared qdiscs, packets come out of the qdiscs for multiple transmit queues. Therefore it doesn't make any sense to schedule the transmit queue when logically we cannot know ahead of time the TX queue of the SKB that the qdisc->dequeue() will give us. Just for sanity I added a BUG check to make sure we never get into a state where the noop_qdisc is scheduled. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:20 -07:00
David S. Miller	e2627c8c22	pkt_sched: Make QDISC_RUNNING a qdisc state. Currently it is associated with a netdev_queue, but when we have qdisc sharing that no longer makes any sense. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:18 -07:00
David S. Miller	d3b753db7c	pkt_sched: Move gso_skb into Qdisc. We liberate any dangling gso_skb during qdisc destruction. It really only matters for the root qdisc. But when qdiscs can be shared by multiple netdev_queue objects, we can't have the gso_skb in the netdev_queue any more. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:18 -07:00
David S. Miller	92831bc395	netdev: Kill plain netif_schedule() No more users. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:16 -07:00
David S. Miller	eae792b722	netdev: Add netdev->select_queue() method. Devices or device layers can set this to control the queue selection performed by dev_pick_tx(). This function runs under RCU protection, which allows overriding functions to have some way of synchronizing with things like dynamic ->real_num_tx_queues adjustments. This makes the spinlock prefetch in dev_queue_xmit() a little bit less effective, but that's the price right now for correctness. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:10 -07:00
David S. Miller	e3c50d5d25	netdev: netdev_priv() can now be sane again. The private area of a netdev is now at a fixed offset once more. Unfortunately, some assumptions that netdev_priv() == netdev->priv crept back into the tree. In particular this happened in the loopback driver. Make it use netdev->ml_priv. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:09 -07:00
David S. Miller	6b0fb1261a	netdev: Kill struct net_device_subqueue and netdev->egress_subqueue* No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:08 -07:00
David S. Miller	fd2ea0a79f	net: Use queue aware tests throughout. This effectively "flips the switch" by making the core networking and multiqueue-aware drivers use the new TX multiqueue structures. Non-multiqueue drivers need no changes. The interfaces they use such as netif_stop_queue() degenerate into an operation on TX queue zero. So everything "just works" for them. Code that really wants to do "X" to all TX queues now invokes a routine that does so, such as netif_tx_wake_all_queues(), netif_tx_stop_all_queues(), etc. pktgen and netpoll required a little bit more surgery than the others. In particular the pktgen changes, whilst functional, could be largely improved. The initial check in pktgen_xmit() will sometimes check the wrong queue, which is mostly harmless. The thing to do is probably to invoke fill_packet() earlier. The bulk of the netpoll changes is to make the code operate solely on the TX queue indicated by the SKB queue mapping. Setting of the SKB queue mapping is entirely confined inside of net/core/dev.c:dev_pick_tx(). If we end up needing any kind of special semantics (drops, for example) it will be implemented here. Finally, we now have a "real_num_tx_queues" which is where the driver indicates how many TX queues are actually active. With IGB changes from Jeff Kirsher. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:07 -07:00
David S. Miller	1d8ae3fdeb	pkt_sched: Remove RR scheduler. This actually fixes a bug added by the RR scheduler changes. The ->bands and ->prio2band parameters were being set outside of the sch_tree_lock() and thus could result in strange behavior and inconsistencies. It might be possible, in the new design (where there will be one qdisc per device TX queue) to allow similar functionality via a TX hash algorithm for RR but I really see no reason to export this aspect of how these multiqueue cards actually implement the scheduling of the individual DMA TX rings and the single physical MAC/PHY port. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:04 -07:00
David S. Miller	09e83b5d7d	netdev: Kill NETIF_F_MULTI_QUEUE. There is no need for a feature bit for something that can be tested by simply checking the TX queue count. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:03 -07:00
David S. Miller	e8a0464cc9	netdev: Allocate multiple queues for TX. alloc_netdev_mq() now allocates an array of netdev_queue structures for TX, based upon the queue_count argument. Furthermore, all accesses to the TX queues are now vectored through the netdev_get_tx_queue() and netdev_for_each_tx_queue() interfaces. This makes it easy to grep the tree for all things that want to get to a TX queue of a net device. Problem spots which are not really multiqueue aware yet, and only work with one queue, can easily be spotted by grepping for all netdev_get_tx_queue() calls that pass in a zero index. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:00 -07:00
Dan Williams	0839875e0c	async_tx: make async_tx_test_ack a boolean routine Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:56 -07:00
Dan Williams	3dce017137	async_tx: remove depend_tx from async_tx_sync_epilog All callers of async_tx_sync_epilog have called async_tx_quiesce on the depend_tx, so async_tx_sync_epilog need only call the callback to complete the operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:55 -07:00
Dan Williams	d2c52b7983	async_tx: export async_tx_quiesce Replace open coded "wait and acknowledge" instances with async_tx_quiesce. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:55 -07:00
Joel Becker	a6795e9ebb	configfs: Allow ->make_item() and ->make_group() to return detailed errors. The configfs operations ->make_item() and ->make_group() currently return a new item/group. A return of NULL signifies an error. Because of this, -ENOMEM is the only return code bubbled up the stack. Multiple folks have requested the ability to return specific error codes when these operations fail. This patch adds that ability by changing the ->make_item/group() ops to return ERR_PTR() values. These errors are bubbled up appropriately. NULL returns are changed to -ENOMEM for compatibility. Also updated are the in-kernel users of configfs. This is a rework of reverted commit `11c3b79218`. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 15:21:29 -07:00
Joel Becker	f89ab8619e	Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors." This reverts commit `11c3b79218`. The code will move to PTR_ERR(). Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 14:53:48 -07:00
Linus Torvalds	5b664cb235	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: [PATCH] ocfs2: fix oops in mmap_truncate testing configfs: call drop_link() to cleanup after create_link() failure configfs: Allow ->make_item() and ->make_group() to return detailed errors. configfs: Fix failing mkdir() making racing rmdir() fail configfs: Fix deadlock with racing rmdir() and rename() configfs: Make configfs_new_dirent() return error code instead of NULL configfs: Protect configfs_dirent s_links list mutations configfs: Introduce configfs_dirent_lock ocfs2: Don't snprintf() without a format. ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs ocfs2/net: Silence build warnings on sparc64 ocfs2: Handle error during journal load ocfs2: Silence an error message in ocfs2_file_aio_read() ocfs2: use simple_read_from_buffer() ocfs2: fix printk format warnings with OCFS2_FS_STATS=n [PATCH 2/2] ocfs2: Instrument fs cluster locks [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option	2008-07-17 10:55:51 -07:00
Roland McGrath	f470021adb	ptrace children revamp ptrace no longer fiddles with the children/sibling links, and the old ptrace_children list is gone. Now ptrace, whether of one's own children or another's via PTRACE_ATTACH, just uses the new ptraced list instead. There should be no user-visible difference that matters. The only change is the order in which do_wait() sees multiple stopped children and stopped ptrace attachees. Since wait_task_stopped() was changed earlier so it no longer reorders the children list, we already know this won't cause any new problems. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-16 18:02:33 -07:00
Linus Torvalds	dc7c65db28	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits) Revert "x86/PCI: ACPI based PCI gap calculation" PCI: remove unnecessary volatile in PCIe hotplug struct controller x86/PCI: ACPI based PCI gap calculation PCI: include linux/pm_wakeup.h for device_set_wakeup_capable PCI PM: Fix pci_prepare_to_sleep x86/PCI: Fix PCI config space for domains > 0 Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n PCI: Simplify PCI device PM code PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep PCI ACPI: Rework PCI handling of wake-up ACPI: Introduce new device wakeup flag 'prepared' ACPI: Introduce acpi_device_sleep_wake function PCI: rework pci_set_power_state function to call platform first PCI: Introduce platform_pci_power_manageable function ACPI: Introduce acpi_bus_power_manageable function PCI: make pci_name use dev_name PCI: handle pci_name() being const PCI: add stub for pci_set_consistent_dma_mask() PCI: remove unused arch pcibios_update_resource() functions PCI: fix pci_setup_device()'s sprinting into a const buffer ... Fixed up conflicts in various files (arch/x86/kernel/setup_64.c, arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c, drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86 and ACPI updates manually.	2008-07-16 17:25:46 -07:00
Kumar Gala	b219108cba	fs_enet: Remove !CONFIG_PPC_CPM_NEW_BINDING code Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING. Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h) Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:49 -05:00
Scott Wood	d87eb12785	gianfar: Add magic packet and suspend/resume support. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:47 -05:00
Scott Wood	d49747bdfb	powerpc/mpc83xx: Power Management support Basic PM support for 83xx. Standby is implemented as sleep. Suspend-to-RAM is implemented as "deep sleep" (with the processor turned off) on 831x. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:30 -05:00
Linus Torvalds	8a0ca91e1d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (68 commits) sdio_uart: Fix SDIO break control to now return success or an error mmc: host driver for Ricoh Bay1Controllers sdio: sdio_io.c Fix sparse warnings sdio: fix the use of hard coded timeout value. mmc: OLPC: update vdd/powerup quirk comment mmc: fix spares errors of sdhci.c mmc: remove multiwrite capability wbsd: fix bad dma_addr_t conversion atmel-mci: Driver for Atmel on-chip MMC controllers mmc: fix sdio_io sparse errors mmc: wbsd.c fix shadowing of 'dma' variable MMC: S3C24XX: Refuse incorrectly aligned transfers MMC: S3C24XX: Add maintainer entry MMC: S3C24XX: Update error debugging. MMC: S3C24XX: Add media presence test to request handling. MMC: S3C24XX: Fix use of msecs where jiffies are needed MMC: S3C24XX: Add MODULE_ALIAS() entries for the platform devices MMC: S3C24XX: Fix s3c2410_dma_request() return code check. MMC: S3C24XX: Allow card-detect on non-IRQ capable pin MMC: S3C24XX: Ensure host->mrq->data is valid ... Manually fixed up bogus executable bits on drivers/mmc/core/sdio_io.c and include/linux/mmc/sdio_func.h when merging.	2008-07-16 15:17:52 -07:00
Linus Torvalds	9c1be0c471	Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6 * 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: include to compilation UBIFS: add new flash file system UBIFS: add brief documentation MAINTAINERS: add UBIFS section do_mounts: allow UBI root device name VFS: export sync_sb_inodes VFS: move inode_lock into sync_sb_inodes	2008-07-16 15:02:57 -07:00
Linus Torvalds	42fdd144a4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) IDE: Report errors during drive reset back to user space Update documentation of HDIO_DRIVE_RESET ioctl IDE: Remove unused code IDE: Fix HDIO_DRIVE_RESET handling hd.c: remove the #include <linux/mc146818rtc.h> update the BLK_DEV_HD help text move ide/legacy/hd.c to drivers/block/ ide/legacy/hd.c: use late_initcall() remove BLK_DEV_HD_ONLY ide: endian annotations in ide-floppy.c ide-floppy: zero out the whole struct ide_atapi_pc on init ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path ide-cd: move request prep from cdrom_start_rw_cont to rq issue path ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path ide-cd: fold cdrom_start_seek into ide_cd_do_request ide-cd: simplify request issuing path ide-cd: mv ide_do_rw_cdrom ide_cd_do_request ide-cd: cdrom_start_seek: remove unused argument block ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block ...	2008-07-16 14:53:54 -07:00
Linus Torvalds	4314652bb4	Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6 * 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits) Fix FADT parsing Add the ability to reset the machine using the RESET_REG in ACPI's FADT table. ACPI: use dev_printk when possible PNPACPI: add support for HP vendor-specific CCSR descriptors PNP: avoid legacy IDE IRQs PNP: convert resource options to single linked list ISAPNP: handle independent options following dependent ones PNP: remove extra 0x100 bit from option priority PNP: support optional IRQ resources PNP: rename pnp_register__resource() local variables PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR PNP: centralize resource option allocations PNP: remove redundant pnp_can_configure() check PNP: make resource assignment functions return 0 (success) or -EBUSY (failure) PNP: in debug resource dump, make empty list obvious PNP: improve resource assignment debug PNP: increase I/O port & memory option address sizes PNP: introduce pnp_irq_mask_t typedef PNP: make resource option structures private to PNP subsystem PNP: define PNP-specific IORESOURCE_IO_ flags alongside IRQ, DMA, MEM ...	2008-07-16 14:52:12 -07:00
Martin K. Petersen	d442cc44c0	block: Trivial fix for blk_integrity_rq() Fail integrity check gracefully when request does not have a bio attached (BLOCK_PC). Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-16 14:51:41 -07:00
Linus Torvalds	8df1b049bc	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits) NFSv4: Remove BKL from the nfsv4 state recovery SUNRPC: Remove the BKL from the callback functions NFS: Remove BKL from the readdir code NFS: Remove BKL from the symlink code NFS: Remove BKL from the sillydelete operations NFS: Remove the BKL from the rename, rmdir and unlink operations NFS: Remove BKL from NFS lookup code NFS: Remove the BKL from nfs_link() NFS: Remove the BKL from the inode creation operations NFS: Remove BKL usage from open() NFS: Remove BKL usage from the write path NFS: Remove the BKL from the permission checking code NFS: Remove attribute update related BKL references NFS: Remove BKL requirement from attribute updates NFS: Protect inode->i_nlink updates using inode->i_lock nfs: set correct fl_len in nlmclnt_test() SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier SUNRPC: None of rpcb_create's callers wants a privileged source port SUNRPC: Introduce a specific rpcb_create for contacting localhost ...	2008-07-16 14:49:49 -07:00
Bjorn Helgaas	1f32ca31e7	PNP: convert resource options to single linked list ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of a device, i.e., the possibilities an OS bus driver has when it assigns I/O port, MMIO, and other resources to the device. PNP used to maintain this "possible resource setting" information in one independent option structure and a list of dependent option structures for each device. Each of these option structures had lists of I/O, memory, IRQ, and DMA resources, for example: dev independent options ind-io0 -> ind-io1 ... ind-mem0 -> ind-mem1 ... ... dependent option set 0 dep0-io0 -> dep0-io1 ... dep0-mem0 -> dep0-mem1 ... ... dependent option set 1 dep1-io0 -> dep1-io1 ... dep1-mem0 -> dep1-mem1 ... ... ... This data structure was designed for ISAPNP, where the OS configures device resource settings by writing directly to configuration registers. The OS can write the registers in arbitrary order much like it writes PCI BARs. However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces that perform device configuration, and it is important to pass the desired settings to those interfaces in the correct order. The OS learns the correct order by using firmware interfaces that return the "current resource settings" and "possible resource settings," but the option structures above don't store the ordering information. This patch replaces the independent and dependent lists with a single list of options. For example, a device might have possible resource settings like this: dev options ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ... All the possible settings are in the same list, in the order they come from the firmware "possible resource settings" list. Each entry is tagged with an independent/dependent flag. Dependent entries also have a "set number" and an optional priority value. All dependent entries must be assigned from the same set. 
For example, the OS can use all the entries from dependent set 0, or all the entries from dependent set 1, but it cannot mix entries from set 0 with entries from set 1. Prior to this patch PNP didn't keep track of the order of this list, and it assigned all independent options first, then all dependent ones. Using the example above, that resulted in a "desired configuration" list like this: ind->io0 -> ind->io1 -> depN-io0 ... instead of the list the firmware expects, which looks like this: ind->io0 -> depN-io0 -> ind-io1 ... Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:07 +02:00
Bjorn Helgaas	d5ebde6ef5	PNP: support optional IRQ resources This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when assigning resources to a device. If the flag is set and we are unable to assign an IRQ to the device, we can leave the IRQ disabled but allow the overall resource allocation to succeed. Some devices request an IRQ, but can run without an IRQ (possibly with degraded performance). This flag lets us run the device without the IRQ instead of just leaving the device disabled. This is a reimplementation of this previous change by Rene Herman <rene.herman@gmail.com>: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43 I reimplemented this for two reasons: - to prepare for converting all resource options into a single linked list, as opposed to the per-resource-type lists we have now, and - to preserve the order and number of resource options. In PNPBIOS and ACPI, we configure a device by giving firmware a list of resource assignments. It is important that this list has exactly the same number of resources, in the same order, as the "template" list we got from the firmware in the first place. The problem of a sound card MPU401 being left disabled for want of an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:07 +02:00
Bjorn Helgaas	a1802c4295	PNP: make resource option structures private to PNP subsystem Nothing outside the PNP subsystem should need access to a device's resource options, so this patch moves the option structure declarations to a private header file. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	08c9f262f2	PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED in a private header file, but put those flags in struct resource.flags fields. Better to make them IORESOURCE_IO_* flags like the existing IRQ, DMA, and MEM flags. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	57fd51a8be	PNP: add pnp_possible_config() -- can a device be configured this way? As part of a heuristic to identify modem devices, 8250_pnp.c checks to see whether a device can be configured at any of the legacy COM port addresses. This patch moves the code that traverses the PNP "possible resource options" from 8250_pnp.c to the PNP subsystem. This encapsulation is important because a future patch will change the implementation of those resource options. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	aee3ad815d	PNP: replace pnp_resource_table with dynamically allocated resources PNP used to have a fixed-size pnp_resource_table for tracking the resources used by a device. This table often overflowed, so we've had to increase the table size, which wastes memory because most devices have very few resources. This patch replaces the table with a linked list of resources where the entries are allocated on demand. This removes messages like these: pnpacpi: exceeded the max number of IO resources 00:01: too many I/O port resources References: http://bugzilla.kernel.org/show_bug.cgi?id=9535 http://bugzilla.kernel.org/show_bug.cgi?id=9740 http://lkml.org/lkml/2007/11/30/110 This patch also changes the way PNP uses the IORESOURCE_UNSET, IORESOURCE_AUTO, and IORESOURCE_DISABLED flags. Prior to this patch, the pnp_resource_table entries used the flags like this: IORESOURCE_UNSET This table entry is unused and available for use. When this flag is set, we shouldn't look at anything else in the resource structure. This flag is set when a resource table entry is initialized. IORESOURCE_AUTO This resource was assigned automatically by pnp_assign_{io,mem,etc}(). This flag is set when a resource table entry is initialized and cleared whenever we discover a resource setting by reading an ISAPNP config register, parsing a PNPBIOS resource data stream, parsing an ACPI _CRS list, or interpreting a sysfs "set" command. Resources marked IORESOURCE_AUTO are reinitialized and marked as IORESOURCE_UNSET by pnp_clean_resource_table() in these cases: - before we attempt to assign resources automatically, - if we fail to assign resources automatically, - after disabling a device IORESOURCE_DISABLED Set by pnp_assign_{io,mem,etc}() when automatic assignment fails. 
Also set by PNPBIOS and PNPACPI for: - invalid IRQs or GSI registration failures - invalid DMA channels - I/O ports above 0x10000 - mem ranges with negative length After this patch, there is no pnp_resource_table, and the resource list entries use the flags like this: IORESOURCE_UNSET This flag is no longer used in PNP. Instead of keeping IORESOURCE_UNSET entries in the resource list, we remove entries from the list and free them. IORESOURCE_AUTO No change in meaning: it still means the resource was assigned automatically by pnp_assign_{port,mem,etc}(), but these functions now set the bit explicitly. We still "clean" a device's resource list in the same places, but rather than reinitializing IORESOURCE_AUTO entries, we just remove them from the list. Note that IORESOURCE_AUTO entries are always at the end of the list, so removing them doesn't reorder other list entries. This is because non-IORESOURCE_AUTO entries are added by the ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the sysfs "set" command. In each of these cases, we completely free the resource list first. IORESOURCE_DISABLED In addition to the cases where we used to set this flag, ISAPNP now adds an IORESOURCE_DISABLED resource when it reads a configuration register with a "disabled" value. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>	2008-07-16 23:27:05 +02:00
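The move from a fixed-size table to a list with on-demand entries, and the new "remove instead of reinitialize" cleaning step, can be sketched roughly as follows. This is a simplified user-space illustration using a hand-rolled singly linked list; `RES_AUTO`, `struct res_node`, `res_add()`, `res_clean_auto()`, and `res_count()` are hypothetical names and do not correspond to the kernel's actual structures or functions.

```c
#include <assert.h>
#include <stdlib.h>

#define RES_AUTO 0x1u  /* hypothetical stand-in for IORESOURCE_AUTO */

struct res_node {
    unsigned long start;
    unsigned int flags;
    struct res_node *next;
};

/* Allocate an entry on demand and append it at the tail of the list.
 * Returns the new node, or NULL if allocation fails. */
static struct res_node *res_add(struct res_node **head,
                                unsigned long start, unsigned int flags)
{
    struct res_node *node = malloc(sizeof(*node));

    assert(head != NULL);
    if (node == NULL)
        return NULL;
    node->start = start;
    node->flags = flags;
    node->next = NULL;
    while (*head != NULL)
        head = &(*head)->next;
    *head = node;
    return node;
}

/* Unlink and free every automatically assigned entry; the remaining
 * entries keep their relative order. */
static void res_clean_auto(struct res_node **head)
{
    assert(head != NULL);
    while (*head != NULL) {
        struct res_node *node = *head;
        if (node->flags & RES_AUTO) {
            *head = node->next;
            free(node);
        } else {
            head = &node->next;
        }
    }
}

/* Count the entries currently on the list. */
static size_t res_count(const struct res_node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because nothing is preallocated, a device with two resources costs two nodes, and there is no table limit to overflow.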
Bjorn Helgaas	20bfdbba72	PNP: make pnp_{port,mem,etc}_start(), et al work for invalid resources Some callers use pnp_port_start() and similar functions without making sure the resource is valid. This patch makes us fall back to returning the initial values if the resource is not valid or not even present. This mostly preserves the previous behavior, where we would just return the initial values set by pnp_init_resource_table(). The original 2.6.25 code didn't range-check the "bar", so it would return garbage if the bar exceeded the table size. This code returns sensible values instead. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>	2008-07-16 23:27:05 +02:00
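A range-checked accessor with a fallback, in the spirit of this commit, might look like the sketch below. The names (`RES_VALID`, `struct port_res`, `struct device_res`, `port_start()`) are hypothetical, and the fallback value of 0 is an assumption made for the example; the commit only says that "the initial values" are returned.

```c
#include <assert.h>
#include <stddef.h>

#define RES_VALID 0x1u  /* hypothetical "this entry holds real data" flag */

struct port_res {
    unsigned long start;
    unsigned long end;
    unsigned int flags;
};

struct device_res {
    struct port_res ports[2];
    size_t nports;  /* number of entries actually present */
};

/* Return the start of I/O port resource 'bar', or an assumed initial
 * value of 0 when the resource is absent or not valid. */
static unsigned long port_start(const struct device_res *dev, size_t bar)
{
    assert(dev != NULL);
    if (bar < dev->nports && (dev->ports[bar].flags & RES_VALID))
        return dev->ports[bar].start;
    return 0;  /* assumption: the "initial value" for a port start */
}
```

The index check comes first, so an out-of-range `bar` never reads past the stored entries.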
Rafael J. Wysocki	ebb12db51f	Freezer: Introduce PF_FREEZER_NOSIG The freezer currently attempts to distinguish kernel threads from user space tasks by checking if their mm pointer is unset and it does not send fake signals to kernel threads. However, there are kernel threads, mostly related to networking, that behave like user space tasks and may want to be sent a fake signal to be frozen. Introduce the new process flag PF_FREEZER_NOSIG that will be set by default for all kernel threads and make the freezer only send fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide the set_freezable_with_signal() function to be called by the kernel threads that want to be sent a fake signal for freezing. This patch should not change the freezer's observable behavior. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:03 +02:00
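The opt-in rule this commit describes can be modeled in a few lines of user-space C. This is only a toy model: `TASK_KTHREAD`, `TASK_FREEZER_NOSIG`, `struct task`, `task_init()`, `allow_fake_signal()`, and `request_freeze()` are hypothetical names standing in for the kernel's task flags and freezer code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_KTHREAD       0x1u  /* hypothetical: task is a kernel thread */
#define TASK_FREEZER_NOSIG 0x2u  /* stand-in for PF_FREEZER_NOSIG */

struct task {
    unsigned int flags;
    bool freeze_requested;
    bool fake_signal_pending;
};

/* Kernel threads start with the "no fake signal" flag set by default. */
static void task_init(struct task *t, bool is_kthread)
{
    assert(t != NULL);
    t->flags = is_kthread ? (TASK_KTHREAD | TASK_FREEZER_NOSIG) : 0;
    t->freeze_requested = false;
    t->fake_signal_pending = false;
}

/* Models a kernel thread opting in to fake signals, which is the role
 * the commit assigns to set_freezable_with_signal(). */
static void allow_fake_signal(struct task *t)
{
    assert(t != NULL);
    t->flags &= ~TASK_FREEZER_NOSIG;
}

/* The freezer signals only tasks that do not have the flag set. */
static void request_freeze(struct task *t)
{
    assert(t != NULL);
    t->freeze_requested = true;
    if (!(t->flags & TASK_FREEZER_NOSIG))
        t->fake_signal_pending = true;
}
```

In this model the decision depends on the flag alone rather than on whether the task has an mm pointer, which is the change in mechanism the commit introduces.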
Elias Oltmanns	3ef5eb424e	IDE: Remove unused code Remove some code which has been made obsolete and hasn't worked properly before anyway. Part of the infrastructure may be reintroduced in a follow up patch to implement a working command aborting facility. Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk> Cc: "Randy Dunlap" <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:48 +02:00
Elias Oltmanns	79e36a9f54	IDE: Fix HDIO_DRIVE_RESET handling Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken in various ways. Most importantly, it is treated as an out of band request in an illegal way which may very likely lead to system lock ups. Use the drive's request queue to avoid this problem (and fix a locking issue for free along the way). Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk> Cc: "Randy Dunlap" <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:48 +02:00
Bartlomiej Zolnierkiewicz	e6d95bd149	ide: ->port_init_devs -> ->init_dev Change ->port_init_devs method to take 'ide_drive_t ' as an argument instead of 'ide_hwif_t ' and rename it to ->init_dev. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:42 +02:00
Bartlomiej Zolnierkiewicz	c56c5648a3	ide: set hwif->dev in ide_init_port_hw() (take 2) * Add 'parent' field to hw_regs_t for optional parent device pointer (needed by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw(). * Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly. v2: * Update scc_pata.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz	63b51c6d1d	ide: make ide_hwifs[] static Move ide_hwifs[] from ide.c to ide-probe.c and make it static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz	9ad5409375	ide: move PIO blacklist to ide-pio-blacklist.c Move PIO blacklist to ide-pio-blacklist.c. While at it: - fix comment - fix whitespace damage There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	3e153cfb5e	ide: remove no longer used ide_pio_timings[] Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	c9d6c1a237	ide: move ide_pio_cycle_time() to ide-timings.c All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS so move the function from ide-lib.c to ide-timings.c. While at it: - convert ide_pio_cycle_time() to use ide_timing_find_mode() - cleanup ide_pio_cycle_time() a bit There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	f06ab3402a	ide: convert ide-timing.h to ide-timings.c library (take 2) * Don't include ide-timing.h in cs5535 and sis5513 host drivers (they don't need it currently). * Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS config option to be selected by host drivers using the library. While at it: - fix ide_timing_find_mode() placement v2: * Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>) There should be no functional changes caused by this patch. Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:37 +02:00
Bartlomiej Zolnierkiewicz	3be53f3f21	ide: move some bits from ide-timing.h to <linux/ide.h> Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h> from drivers/ide/ide-timing.h. While at it: - use u8/u16 instead of short for struct ide_timing fields - use enum for IDE_TIMING_* There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:36 +02:00
Linus Torvalds	45158894d4	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (249 commits) powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES powerpc: Fix a build problem on ppc32 with new DMA_ATTRs ibm_newemac: Add MII mode support to the EMAC RGMII bridge. powerpc: Don't spin on sync instruction at boot time powerpc: Add VSX load/store alignment exception handler powerpc: fix giveup_vsx to save registers correctly powerpc: support for latencytop powerpc: Remove unnecessary condition when sanity-checking WIMG bits powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT powerpc: Add driver for Barrier Synchronization Register powerpc: mman.h export fixups powerpc/fsl: update crypto node definition and device tree instances powerpc/fsl: Refactor device bindings powerpc/85xx: Minor fixes for 85xxds and 8536ds board. powerpc: Add 82xx/83xx/86xx to 6xx Multiplatform powerpc/85xx: publish of device for cds platforms powerpc/booke: don't reinitialize time base powerpc/86xx: Refactor pic init powerpc/CPM: Add i2c pins to dts and board setup cpm_uart: Support uart_wait_until_sent() ...	2008-07-15 19:04:58 -07:00
Linus Torvalds	89a93f2f48	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits) [SCSI] scsi_dh: fix kconfig related build errors [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h [SCSI] make struct scsi_{host,target}_type static [SCSI] fix locking in host use of blk_plug_device() [SCSI] zfcp: Cleanup external header file [SCSI] zfcp: Cleanup code in zfcp_erp.c [SCSI] zfcp: zfcp_fsf cleanup. [SCSI] zfcp: consolidate sysfs things into one file. [SCSI] zfcp: Cleanup of code in zfcp_aux.c [SCSI] zfcp: Cleanup of code in zfcp_scsi.c [SCSI] zfcp: Move status accessors from zfcp to SCSI include file. [SCSI] zfcp: Small QDIO cleanups [SCSI] zfcp: Adapter reopen for large number of unsolicited status [SCSI] zfcp: Fix error checking for ELS ADISC requests [SCSI] zfcp: wait until adapter is finished with ERP during auto-port [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver [SCSI] sg: Add target reset support [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC [SCSI] sd: Move scsi_disk() accessor function to sd.h ...	2008-07-15 18:58:04 -07:00
Benjamin Herrenschmidt	84c3d4aaec	Merge commit 'origin/master' Manual merge of: arch/powerpc/Kconfig arch/powerpc/kernel/stacktrace.c arch/powerpc/mm/slice.c arch/ppc/kernel/smp.c	2008-07-16 11:07:59 +10:00
Trond Myklebust	e89e896d31	Merge branch 'devel' into next Conflicts: fs/nfs/file.c Fix up the conflict with Jon Corbet's bkl-removal tree	2008-07-15 18:34:16 -04:00
Ingo Molnar	82638844d9	Merge branch 'linus' into cpus4096 Conflicts: arch/x86/xen/smp.c kernel/sched_rt.c net/iucv/iucv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 00:29:07 +02:00
Chuck Lever	c2e1b09ff2	SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon Introduce a new API to register RPC services on IPv6 interfaces to allow the NFS server and lockd to advertise on IPv6 networks. Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind protocol version 4 to contact the local rpcbind daemon. The version 4 SET/UNSET procedures allow services to register address families besides AF_INET, register at specific network interfaces, and register transport protocols besides UDP and TCP. All of this functionality is exposed via the new rpcb_v4_register() kernel API. A user-space rpcbind daemon implementation that supports version 4 of the rpcbind protocol is required in order to make use of this new API. Note that rpcbind version 3 is sufficient to support the new rpcbind facilities listed above, but most extant implementations use version 4. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:55 -04:00
Ingo Molnar	1e09481365	Merge branch 'linus' into core/softlockup Conflicts: kernel/softlockup.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 23:12:58 +02:00
Linus Torvalds	59190f4213	Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) generic-ipi: more merge fallout generic-ipi: merge fix x86, visws: use mach-default/entry_arch.h x86, visws: fix generic-ipi build generic-ipi: fixlet generic-ipi: fix s390 build bug generic-ipi: fix linux-next tree build failure fix: "smp_call_function: get rid of the unused nonatomic/retry argument" fix: "smp_call_function: get rid of the unused nonatomic/retry argument" fix "smp_call_function: get rid of the unused nonatomic/retry argument" on_each_cpu(): kill unused 'retry' parameter smp_call_function: get rid of the unused nonatomic/retry argument sh: convert to generic helpers for IPI function calls parisc: convert to generic helpers for IPI function calls mips: convert to generic helpers for IPI function calls m32r: convert to generic helpers for IPI function calls arm: convert to generic helpers for IPI function calls alpha: convert to generic helpers for IPI function calls ia64: convert to generic helpers for IPI function calls powerpc: convert to generic helpers for IPI function calls ... Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually	2008-07-15 14:12:03 -07:00
Chuck Lever	367c8c7bd9	lockd: Pass "struct sockaddr *" to new failover-by-IP function Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to allow for future support of IPv6. Also provide additional sanity checking in failover_unlock_ip() when constructing the server's IP address. As an added bonus, provide clean kerneldoc comments on related NLM interfaces which were recently added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 16:11:29 -04:00
Ingo Molnar	1a781a777b	Merge branch 'generic-ipi' into generic-ipi-for-linus Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 21:55:59 +02:00
Ingo Molnar	6c9fcaf2ee	Merge branch 'core/rcu' into core/rcu-for-linus	2008-07-15 21:10:12 +02:00
Jeff Layton	6cde4de807	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this function, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_lock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. Since nlmsvc_testlock() now just uses the caller's reference, we no longer need to get or release it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 14:53:33 -04:00
Jeff Layton	8f920d5e29	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this function, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_testlock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. We take a reference to host in the place where nlmsvc_testlock() previously did a new lookup, so the reference counting is unchanged from before. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 14:26:52 -04:00
Linus Torvalds	b312bf359e	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: AHCI: Remove an unnecessary flush from ahci_qc_issue AHCI: speed up resume [libata] Add support for VPD page b1 ata: endianness annotations in pata drivers libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc [libata] sata_svw: update code comments relating to data corruption libata/ahci: enclosure management support libata: improve EH internal command timeout handling libata: use ULONG_MAX to terminate reset timeout table libata: improve EH retry delay handling libata: consistently use msecs for time durations	2008-07-15 11:18:10 -07:00
Linus Torvalds	dc221eae08	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (56 commits) i2c: Add detection capability to new-style drivers i2c: Call client_unregister for new-style devices too i2c: Clean up old chip drivers i2c-ibm_iic: Register child nodes i2c: New-style EEPROM driver using device IDs i2c: Export the i2c_bus_type symbol i2c-au1550: Fix PM support i2c-dev: Delete empty detach_client callback i2c: Drop stray references to lm_sensors i2c: Check for ACPI resource conflicts i2c-ocores: basic PM support i2c-sibyte: SWARM I2C board initialization i2c-i801: Fix handling of error conditions i2c-i801: Rename local variable temp to status i2c-i801: Properly report bus arbitration loss i2c-i801: Remove verbose debugging messages i2c-algo-pcf: Drop unused struct members i2c-algo-pcf: Multi-master lost-arbitration improvement i2c: Deprecate the legacy gpio drivers i2c-pxa: Initialize early ...	2008-07-15 11:16:05 -07:00
Linus Torvalds	98339cbd36	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits) ide-floppy: fix unfortunate function naming ide-tape: unify idetape_create_read/write_cmd ide: add ide_pc_intr() helper ide-{floppy,scsi}: read Status Register before stopping DMA engine ide-scsi: add more debugging to idescsi_pc_intr() ide-scsi: use pc->callback ide-floppy: add more debugging to idefloppy_pc_intr() ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled ide-tape: add ide_tape_io_buffers() helper ide-tape: factor out DSC handling from idetape_pc_intr() ide-{floppy,tape}: move checking of ->failed_pc to ->callback ide: add ide_issue_pc() helper ide: add PC_FLAG_DRQ_INTERRUPT pc flag ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc() ide: add ide_transfer_pc() helper ide-scsi: set drive->scsi flag for devices handled by the driver ide-{cd,floppy,tape}: remove checking for drive->scsi ide: add PC_FLAG_ZIP_DRIVE pc flag ide-tape: factor out waiting for good ireason from idetape_transfer_pc() ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc() ...	2008-07-15 11:15:36 -07:00
Bartlomiej Zolnierkiewicz	646c0cb6c4	ide: add ide_pc_intr() helper * ide-tape.c: add 'drive' argument to idetape_update_buffers(). * Add generic ide_pc_intr() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. * ide-tape.c: remove no longer needed DBG_PC_INTR. There should be no functional changes caused by this patch (unless the debugging is explicitly compiled in). Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:22:03 +02:00
Bartlomiej Zolnierkiewicz	6bf1641ca1	ide: add ide_issue_pc() helper Add generic ide_issue_pc() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:22:00 +02:00
Bartlomiej Zolnierkiewicz	28c7214bd8	ide: add PC_FLAG_DRQ_INTERRUPT pc flag Add PC_FLAG_DRQ_INTERRUPT pc flag, set it in ide_do_request() and check for it (instead of checking for IDE_FLAG_DRQ_INTERRUPT) in ide*_issue_pc(). This is a preparation for adding generic ide_issue_pc() helper. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:59 +02:00
Bartlomiej Zolnierkiewicz	594c16d8dd	ide: add ide_transfer_pc() helper * Add ide-atapi.c file for generic ATAPI support together with CONFIG_IDE_ATAPI config option. * Add generic ide_transfer_pc() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:58 +02:00
Bartlomiej Zolnierkiewicz	5d41893c0f	ide: add PC_FLAG_ZIP_DRIVE pc flag Add PC_FLAG_ZIP_DRIVE pc flag, set it in idefloppy_do_request() and check for it (instead of checking for IDEFLOPPY_FLAG_ZIP_DRIVE) in idefloppy_transfer_pc(). This is a preparation for adding generic ide_transfer_pc() helper. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:57 +02:00
Bartlomiej Zolnierkiewicz	5e33109582	ide-{floppy,tape}: PC_FLAG_DMA_RECOMMENDED -> PC_FLAG_DMA_OK * Use PC_FLAG_DMA_OK flag instead of PC_FLAG_DMA_RECOMMENDED one. * Remove no longer used PC_FLAG_DMA_RECOMMENDED flag. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz	1b06e92aa0	ide-{floppy,tape}: merge pc->idefloppy_callback and pc->idetape_callback Merge pc->idefloppy_callback and pc->idetape_callback into pc->callback. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz	92f5daff2b	ide-tape: make pc->idetape_callback void There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:55 +02:00
FUJITA Tomonori	63f5abb095	ide: remove action argument in ide_do_drive_cmd ide_do_drive_cmd is called only with ide_preempt action argument. So we can remove the action argument in ide_do_drive_cmd and ide_action_t typedef. This patch also includes two minor cleanups: 1) ide_do_drive_cmd always succeeds so we don't need the return value; 2) the callers use blk_rq_init before ide_do_drive_cmd so there is no need to initialize rq->errors. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:51 +02:00
Bartlomiej Zolnierkiewicz	ff07488346	ide: remove drive->ctl Remove drive->ctl (it is always equal to 0x08 after init time). While at it: * Use ATA_DEVCTL_OBS define. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz	0fd04dcc2e	ide: use ->OUTBSYNC in ide_set_irq() Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz	f8c4bd0ab2	ide: pass 'hwif ' instead of 'drive ' to ->OUTBSYNC method There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz	1357214461	ide: remove ->mmio flag from ide_hwif_t Since scc_pata host driver no longer uses IDE PCI layer / ide_dma_setup() and all other ->mmio users set also IDE_HFLAG_MMIO host flag we can safely remove ->mmio flag. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz	ed4af48fd6	ide: move IRQ unmasking out from ->tf_load method Move IRQ unmasking out from ->tf_load method to its users. There should be no functional changes caused by this patch (SELECT_MASK() is NOP except for hpt366, icside and sgiioc4). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz	9a410e79b5	ide: remove IDE_TFLAG_NO_SELECT_MASK taskfile flag Always call SELECT_MASK(..., 0) in ide_tf_load() (needs to be done to match ide_set_irq(..., 1)) and then remove IDE_TFLAG_NO_SELECT_MASK taskfile flag. This change should only affect hpt366 and icside host drivers since ->maskproc(..., 0) for sgiioc4 is equivalent to ide_set_irq(..., 1). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz	931ee0dc5c	ide: remove obsoleted "ide=" kernel parameters * Remove obsoleted "ide=" kernel parameters. * Remove no longer needed: - ide_setup() - parse_options() - __setup("", ...) - module_param(options, ...) * Use module_{init,exit}() for MODULE=y case and remove MODULE ifdef. * Make ide_acpi and ide_doubler variables static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:47 +02:00
Bartlomiej Zolnierkiewicz	30e5ee4d1a	ide: remove obsoleted "idebus=" kernel parameter * Remove obsoleted "idebus=" kernel parameter. * Remove no longer needed ide_system_bus_speed() and system_bus_clock() (together with idebus_parameter and system_bus_speed variables). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:46 +02:00
FUJITA Tomonori	681a561b7e	block: unexport blk_end_sync_rq All the users of blk_end_sync_rq are gone (they have been converted to use blk_execute_rq). This unexports blk_end_sync_rq. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:45 +02:00
FUJITA Tomonori	124cafc5eb	ide: remove ide_init_drive_cmd ide_init_drive_cmd just calls blk_rq_init. This converts the users of ide_init_drive_cmd to use blk_rq_init directly and removes ide_init_drive_cmd. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:44 +02:00
Linus Torvalds	61d97f4fcf	Merge branch 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: remove extraneous checks in manage.c genirq: Expose default irq affinity mask (take 3)	2008-07-15 10:39:22 -07:00
Linus Torvalds	38c46578ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: [GFS2] Fix GFS2's use of do_div() in its quota calculations [GFS2] Remove unused declaration [GFS2] Remove support for unused and pointless flag [GFS2] Replace rgrp "recent list" with mru list [GFS2] Allow local DF locks when holding a cached EX glock [GFS2] Fix delayed demote race [GFS2] don't call permission() [GFS2] Fix module building [GFS2] Glock documentation [GFS2] Remove all_list from lock_dlm [GFS2] Remove obsolete conversion deadlock avoidance code [GFS2] Remove remote lock dropping code [GFS2] kernel panic mounting volume [GFS2] Revise readpage locking [GFS2] Fix ordering of args for list_add [GFS2] trivial sparse lock annotations [GFS2] No lock_nolock [GFS2] Fix ordering bug in lock_dlm [GFS2] Clean up the glock core	2008-07-15 10:38:46 -07:00
Linus Torvalds	e7849f16c1	Merge branch 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: cputopology: always define CPU topology information, clean up cpu topology: always define CPU topology information	2008-07-15 10:32:39 -07:00
Linus Torvalds	8d2567a620	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits) ext4: Documention update for new ordered mode and delayed allocation ext4: do not set extents feature from the kernel ext4: Don't allow nonextenst mount option for large filesystem ext4: Enable delalloc by default. ext4: delayed allocation i_blocks fix for stat ext4: fix delalloc i_disksize early update issue ext4: Handle page without buffers in ext4_*_writepage() ext4: Add ordered mode support for delalloc ext4: Invert lock ordering of page_lock and transaction start in delalloc mm: Add range_cont mode for writeback ext4: delayed allocation ENOSPC handling percpu_counter: new function percpu_counter_sum_and_set ext4: Add delayed allocation support in data=writeback mode vfs: add hooks for ext4's delayed allocation support jbd2: Remove data=ordered mode support using jbd buffer heads ext4: Use new framework for data=ordered mode in JBD2 jbd2: Implement data=ordered mode handling via inodes vfs: export filemap_fdatawrite_range() ext4: Fix lock inversion in ext4_ext_truncate() ext4: Invert the locking order of page_lock and transaction start ...	2008-07-15 08:36:38 -07:00
Benzi Zbit	62a7573ee9	sdio: fix the use of hard coded timeout value. This adds reading and using of enable_timeout from the CIS Signed-off-by: Benzi Zbit <benzi.zbit@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 15:47:03 +02:00
Pierre Ossman	23af60398a	mmc: remove multiwrite capability Relax requirements on host controllers and only require that they do not report a transfer count that is larger than the actual one (i.e. a lower value is okay). This is how many other parts of the kernel behave so upper layers should already be prepared to handle that scenario. This gives us a performance boost on MMC cards. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:49 +02:00
Tomas Winkler	6d37333163	mmc: fix sdio_io sparse errors This patch fixes sdio_io sparse errors. This fix changes signature of API functions, changing unsigned char -> u8 unsigned short -> u16 unsigned long -> u32 - this was probably a bug in 64 bit platforms Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:48 +02:00
Pierre Ossman	ad3868b2ec	mmc,sdio: helper function for transfer padding There are a lot of crappy controllers out there that cannot handle all the request sizes that the MMC/SD/SDIO specifications require. In case the card driver can pad the data to overcome the problems, this commit adds a helper that calculates how much that padding should be. A corresponding helper is also added for SDIO, but it can also deal with all the complexities of splitting up a large transfer efficiently. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:44 +02:00
Anton Vorontsov	08f80bb519	mmc: change .get_ro() callback semantics Now get_ro() callback must return 0/1 values for its logical states, and negative errno values in case of error. If particular host instance doesn't support RO/WP switch, it should return -ENOSYS. This patch changes some hosts in two ways: 1. Now functions should be smart to not return negative values in "RO asserted" case (particularly gpio_ calls could return negative values for the outermost GPIOs). Also, board code usually passes get_ro() callbacks that directly return gpioreg & bit result, so at91_mci, imxmmc, pxamci and mmc_spi's get_ro() handlers need take special care when returning platform's values to the mmc core. 2. In case of host instance didn't implement get_ro() callback, it should really return -ENOSYS and let the mmc core decide what to do about it (mmc core thinks the same way as the hosts, so it isn't functional change). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Anton Vorontsov	619ef4b421	mmc_spi: add support for card-detection polling This patch adds new platform data variable "caps", so platforms could pass their capabilities into MMC core (for example, platforms without interrupt on the CD line will most probably want to pass MMC_CAP_NEEDS_POLL). New platform get_cd() callback provided to optimize polling. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Anton Vorontsov	28f52482b4	mmc: add support for card-detection polling Some hosts (and boards that use mmc_spi) do not use interrupts on the CD line, so they can't trigger mmc_detect_change. We want to poll the card and see if there was a change. 1 second poll interval seems reasonable. This patch also implements .get_cd() host operation, that could be used by the hosts that are able to report card-detect status without need to talk MMC. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Adrian Bunk	150a55683b	include/linux/mmc/mmc.h: remove CVS tags This patch removes a CVS tag that wasn't updated for a long time. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Pierre Ossman	4489428ab5	sdhci: support JMicron secondary interface JMicron chips sometimes have two interfaces to work around limitations in Microsoft's sdhci driver. This patch allows us to use either interface. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:40 +02:00
David S. Miller	e308a5d806	netdev: Add netdev->addr_list_lock protection. Add netif_addr_{lock,unlock}{,_bh}() helpers. Use them to protect operations that operate on or read the network device unicast and multicast address lists. Also use them in cases where the code simply wants to block calls into the driver's ->set_rx_mode() and ->set_multicast_list() methods. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:13:44 -07:00
David S. Miller	f1f28aa351	netdev: Add addr_list_lock to struct net_device. This will be used to protect the per-device unicast and multicast address lists, as well as the callbacks into the drivers which configure such state such as ->set_rx_mode() and ->set_multicast_list(). Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:08:33 -07:00
Ron Livne	521e575b9a	IB/mlx4: Add support for blocking multicast loopback packets Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Patrick McHardy	393e52e33c	packet: deliver VLAN TCI to userspace Store the VLAN tag in the auxiliary data/tpacket2_hdr so userspace can properly deal with hardware VLAN tagging/stripping. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:50:39 -07:00
Patrick McHardy	bbd6ef87c5	packet: support extensible, 64 bit clean mmaped ring structure The tpacket_hdr is not 64 bit clean due to use of an unsigned long and can't be extended because the following struct sockaddr_ll needs to be at a fixed offset. Add support for a version 2 tpacket protocol that removes these limitations. Userspace can query the header size through a new getsockopt option and change the protocol version through a setsockopt option. The changes needed to switch to the new protocol version are: 1. replace struct tpacket_hdr by struct tpacket2_hdr 2. query header len and save 3. set protocol version to 2 - set up ring as usual 4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen) instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr)) Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:50:15 -07:00
Patrick McHardy	bc1d0411b8	vlan: deliver packets received with VLAN acceleration to network taps When VLAN header stripping is used, packets currently bypass packet sockets (and other network taps) completely. For locally existing VLANs, they appear directly on the VLAN device, for unknown VLANs they are silently dropped. Add a new function netif_nit_deliver() to deliver incoming packets to all network interface taps and use it in __vlan_hwaccel_rx() to make VLAN packets visible on the underlying device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:30 -07:00
Patrick McHardy	6aa895b047	vlan: Don't store VLAN tag in cb Use a real skb member to store the VLAN tag to avoid clashes with qdiscs, which are allowed to use the cb area themselves. As currently only real devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit invalidation is necessary. The new member fills a hole on 64 bit, the skb layout changes from: __u32 mark; /* 172 4 */ sk_buff_data_t transport_header; /* 176 4 */ sk_buff_data_t network_header; /* 180 4 */ sk_buff_data_t mac_header; /* 184 4 */ sk_buff_data_t tail; /* 188 4 */ /* --- cacheline 3 boundary (192 bytes) --- */ sk_buff_data_t end; /* 192 4 */ /* XXX 4 bytes hole, try to pack */ to __u32 mark; /* 172 4 */ __u16 vlan_tci; /* 176 2 */ /* XXX 2 bytes hole, try to pack */ sk_buff_data_t transport_header; /* 180 4 */ sk_buff_data_t network_header; /* 184 4 */ Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:06 -07:00
Benjamin Herrenschmidt	43d2548bb2	Merge commit '85082fd7cbe3173198aac0eb5e85ab1edcc6352c' into test-build Manual fixup of: arch/powerpc/Kconfig	2008-07-15 15:44:51 +10:00
David S. Miller	925068dcdc	Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-07-14 22:30:17 -07:00
Max Krasnyansky	f271b2cc78	tun: Fix/rewrite packet filtering logic Please see the following thread to get some context on this http://marc.info/?l=linux-netdev&m=121564433018903&w=2 Basically the issue is that current multi-cast filtering stuff in the TUN/TAP driver is seriously broken. Original patch went in without proper review and ACK. It was broken and confusing to start with and subsequent patches broke it completely. To give you an idea of what's broken here are some of the issues: - Very confusing comments throughout the code that imply that the character device is a network interface in its own right, and that packets are passed between the two nics. Which is completely wrong. - Wrong set of ioctls is used for setting up filters. They look like shortcuts for manipulating state of the tun/tap network interface but in reality manipulate the state of the TX filter. - ioctls that were originally used for setting address of the TX filter got "fixed" and now set the address of the network interface itself. Which made filter totally useless. - Filtering is done too late. Instead of filtering early on, to avoid unnecessary wakeups, filtering is done in the read() call. The list goes on and on :) So the patch cleans all that up. It introduces simple and clean interface for setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering before enqueuing the packets. TX filtering is useful in the scenarios where TAP is part of a bridge, in which case it gets all broadcast, multicast and potentially other packets when the bridge is learning. So for example Ethernet tunnelling app may want to setup TX filters to avoid tunnelling multicast traffic. QEMU and other hypervisors can push RX filtering that is currently done in the guest into the host context therefore saving wakeups and unnecessary data transfer. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:18:19 -07:00
David S. Miller	fc943b12e4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2008-07-14 20:40:34 -07:00
Patrick McHardy	72d9794f44	net-sched: cls_flow: add perturbation support Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 20:36:32 -07:00
David S. Miller	0c4c8cae44	Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6	2008-07-14 20:32:07 -07:00
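
Commit bbd6ef87c5 above lists four steps for moving a packet-socket ring to the version 2 tpacket protocol. A minimal C sketch of steps 2 to 4 follows. It assumes the PACKET_HDRLEN and PACKET_VERSION socket options and the TPACKET_V2 constant as they appear in current <linux/if_packet.h>; the commit message itself does not name them. The helper names tpacket_v2_setup and frame_sockaddr_ll are hypothetical, and the ring setup itself ("set up ring as usual") is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Steps 2 and 3: query the v2 header length, then select protocol version 2.
 * Returns the header length, or -1 if either socket call fails
 * (errno is left as set by the failing call). */
static int tpacket_v2_setup(int fd)
{
    int val = TPACKET_V2;      /* in: version to ask about; out: header length */
    socklen_t len = sizeof(val);
    int version = TPACKET_V2;

    if (getsockopt(fd, SOL_PACKET, PACKET_HDRLEN, &val, &len) < 0)
        return -1;
    if (setsockopt(fd, SOL_PACKET, PACKET_VERSION, &version, sizeof(version)) < 0)
        return -1;
    return val;
}

/* Step 4: the sockaddr_ll sits at the aligned end of the frame header,
 * using the queried length rather than sizeof(struct tpacket_hdr). */
static struct sockaddr_ll *frame_sockaddr_ll(void *hdr, int hdrlen)
{
    return (struct sockaddr_ll *)((uint8_t *)hdr + TPACKET_ALIGN(hdrlen));
}
```

A caller would save the value returned by tpacket_v2_setup once per socket and pass it to frame_sockaddr_ll for each frame in the ring. As the commit notes, both the query and the offset calculation can be skipped when the sockaddr_ll is not needed.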

12658 Commits