linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-10-31 18:28:19 +00:00

Author	SHA1	Message	Date
Adrian McMenamin	1795cf48b3	sh/maple: clean maple bus code This patch cleans up the handling of the maple bus queue to remove the risk of races when adding packets. It also removes references to the redundant connect and disconnect functions. Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-07-29 22:10:56 +09:00
FUJITA Tomonori	8978b74253	generic, x86: fix add iommu_num_pages helper function This IOMMU helper function doesn't work for some architectures: http://marc.info/?l=linux-kernel&m=121699304403202&w=2 It also breaks POWER and SPARC builds: http://marc.info/?l=linux-kernel&m=121730388001890&w=2 Currently, only x86 IOMMUs use this so let's move it to x86 for now. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-29 12:12:48 +02:00
Avi Kivity	ed84862433	KVM: Advertise synchronized mmu support to userspace Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:34:02 +03:00
Andrea Arcangeli	e930bffe95	KVM: Synchronize guest physical memory map to host virtual memory map Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:33:53 +03:00
Linus Torvalds	5dfb66ba8c	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: accept pure device as a parent, not only platform_device mfd: add platform_data to mfd_cell mfd: Coding style fixes mfd: Use to_platform_device instead of container_of	2008-07-28 18:15:41 -07:00
Linus Torvalds	1d9b9f6a53	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits) x86/PCI: use dev_printk when possible PCI: add D3 power state avoidance quirk PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot PCI: add an option to allow ASPM enabled forcibly PCI: disable ASPM on pre-1.1 PCIe devices PCI: disable ASPM per ACPI FADT setting PCI MSI: Don't disable MSIs if the mask bit isn't supported PCI: handle 64-bit resources better on 32-bit machines PCI: rewrite PCI BAR reading code PCI: document pci_target_state PCI hotplug: fix typo in pcie hotplug output x86 gart: replace to_pages macro with iommu_num_pages x86, AMD IOMMU: replace to_pages macro with iommu_num_pages iommu: add iommu_num_pages helper function dma-coherent: add documentation to new interfaces Cris: convert to using generic dma-coherent mem allocator Sh: use generic per-device coherent dma allocator ARM: support generic per-device coherent dma mem Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE x86: use generic per-device dma coherent allocator ...	2008-07-28 18:14:24 -07:00
Dmitry Baryshkov	424f525a12	mfd: accept pure device as a parent, not only platform_device Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-29 01:30:26 +02:00
Hisashi Hifumi	8ab22b9abb	vfs: pagecache usage optimization for pagesize!=blocksize When we read some part of a file through pagecache, if there is a pagecache of corresponding index but this page is not uptodate, read IO is issued and this page will be uptodate. I think this is good for pagesize == blocksize environment but there is room for improvement on pagesize != blocksize environment. Because in this case a page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate. So I suggest that when all buffers which correspond to a part of a file that we want to read are uptodate, use this pagecache and copy data from this pagecache to user buffer even if a page is not uptodate. This can reduce read IO and improve system throughput. I wrote a benchmark program and got result number with this program. This benchmark do: 1: mount and open a test file. 2: create a 512MB file. 3: close a file and umount. 4: mount and again open a test file. 5: pwrite randomly 300000 times on a test file. offset is aligned by IO size(1024bytes). 6: measure time of preading randomly 100000 times on a test file. The result was: 2.6.26 330 sec 2.6.26-patched 226 sec Arch:i386 Filesystem:ext3 Blocksize:1024 bytes Memory: 1GB On ext3/4, a file is written through buffer/block. So random read/write mixed workloads or random read after random write workloads are optimized with this patch under pagesize != blocksize environment. This test result showed this. The benchmark program is as follows: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #include <time.h> #include <stdlib.h> #include <string.h> #include <sys/mount.h> #define LEN 1024 #define LOOP 1024512 / 512MB / main(void) { unsigned long i, offset, filesize; int fd; char buf[LEN]; time_t t1, t2; if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } memset(buf, 0, LEN); fd = open("/root/test1/testfile", O_CREAT\|O_RDWR\|O_TRUNC); if (fd < 0) { perror("cannot open file\n"); exit(1); } for (i = 0; i < LOOP; i++) write(fd, buf, LEN); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } fd = open("/root/test1/testfile", O_RDWR); if (fd < 0) { perror("cannot open file\n"); exit(1); } filesize = LEN LOOP; for (i = 0; i < 300000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pwrite(fd, buf, LEN, offset); } printf("start test\n"); time(&t1); for (i = 0; i < 100000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pread(fd, buf, LEN, offset); } time(&t2); printf("%ld sec\n", t2-t1); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } } Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@ucw.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Andrea Arcangeli	cddb8a5c14	mmu-notifiers: core With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages. There are secondary MMUs (with secondary sptes and secondary tlbs) too. sptes in the kvm case are shadow pagetables, but when I say spte in mmu-notifier context, I mean "secondary pte". In GRU case there's no actual secondary pte and there's only a secondary tlb because the GRU secondary MMU has no knowledge about sptes and every secondary tlb miss event in the MMU always generates a page fault that has to be resolved by the CPU (this is not the case of KVM where the a secondary tlb miss will walk sptes in hardware and it will refill the secondary tlb transparently to software if the corresponding spte is present). The same way zap_page_range has to invalidate the pte before freeing the page, the spte (and secondary tlb) must also be invalidated before any page is freed and reused. Currently we take a page_count pin on every page mapped by sptes, but that means the pages can't be swapped whenever they're mapped by any spte because they're part of the guest working set. Furthermore a spte unmap event can immediately lead to a page to be freed when the pin is released (so requiring the same complex and relatively slow tlb_gather smp safe logic we have in zap_page_range and that can be avoided completely if the spte unmap event doesn't require an unpin of the page previously mapped in the secondary MMU). The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know when the VM is swapping or freeing or doing anything on the primary MMU so that the secondary MMU code can drop sptes before the pages are freed, avoiding all page pinning and allowing 100% reliable swapping of guest physical address space. Furthermore it avoids the code that teardown the mappings of the secondary MMU, to implement a logic like tlb_gather in zap_page_range that would require many IPI to flush other cpu tlbs, for each fixed number of spte unmapped. To make an example: if what happens on the primary MMU is a protection downgrade (from writeable to wrprotect) the secondary MMU mappings will be invalidated, and the next secondary-mmu-page-fault will call get_user_pages and trigger a do_wp_page through get_user_pages if it called get_user_pages with write=1, and it'll re-establishing an updated spte or secondary-tlb-mapping on the copied page. Or it will setup a readonly spte or readonly tlb mapping if it's a guest-read, if it calls get_user_pages with write=0. This is just an example. This allows to map any page pointed by any pte (and in turn visible in the primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an full MMU with both sptes and secondary-tlb like the shadow-pagetable layer with kvm), or a remote DMA in software like XPMEM (hence needing of schedule in XPMEM code to send the invalidate to the remote node, while no need to schedule in kvm/gru as it's an immediate event like invalidating primary-mmu pte). At least for KVM without this patch it's impossible to swap guests reliably. And having this feature and removing the page pin allows several other optimizations that simplify life considerably. Dependencies: 1) mm_take_all_locks() to register the mmu notifier when the whole VM isn't doing anything with "mm". This allows mmu notifier users to keep track if the VM is in the middle of the invalidate_range_begin/end critical section with an atomic counter incraese in range_begin and decreased in range_end. No secondary MMU page fault is allowed to map any spte or secondary tlb reference, while the VM is in the middle of range_begin/end as any page returned by get_user_pages in that critical section could later immediately be freed without any further ->invalidate_page notification (invalidate_range_begin/end works on ranges and ->invalidate_page isn't called immediately before freeing the page). To stop all page freeing and pagetable overwrites the mmap_sem must be taken in write mode and all other anon_vma/i_mmap locks must be taken too. 2) It'd be a waste to add branches in the VM if nobody could possibly run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only enabled if CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of mmu notifiers, but this already allows to compile a KVM external module against a kernel with mmu notifiers enabled and from the next pull from kvm.git we'll start using them. And GRU/XPMEM will also be able to continue the development by enabling KVM=m in their config, until they submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n). This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n. The mmu_notifier_register call can fail because mm_take_all_locks may be interrupted by a signal and return -EINTR. Because mmu_notifier_reigster is used when a driver startup, a failure can be gracefully handled. Here an example of the change applied to kvm to register the mmu notifiers. Usually when a driver startups other allocations are required anyway and -ENOMEM failure paths exists already. struct kvm kvm_arch_create_vm(void) { struct kvm kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL); + int err; if (!kvm) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); + kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops; + err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm); + if (err) { + kfree(kvm); + return ERR_PTR(err); + } + return kvm; } mmu_notifier_unregister returns void and it's reliable. The patch also adds a few needed but missing includes that would prevent kernel to compile after these changes on non-x86 archs (x86 didn't need them by luck). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix mm/filemap_xip.c build] [akpm@linux-foundation.org: fix mm/mmu_notifier.c build] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Andrea Arcangeli	7906d00cd1	mmu-notifiers: add mm_take_all_locks() operation mm_take_all_locks holds off reclaim from an entire mm_struct. This allows mmu notifiers to register into the mm at any time with the guarantee that no mmu operation is in progress on the mm. This operation locks against the VM for all pte/vma/mm related operations that could ever happen on a certain mm. This includes vmtruncate, try_to_unmap, and all page faults. The caller must take the mmap_sem in write mode before calling mm_take_all_locks(). The caller isn't allowed to release the mmap_sem until mm_drop_all_locks() returns. mmap_sem in write mode is required in order to block all operations that could modify pagetables and free pages without need of altering the vma layout (for example populate_range() with nonlinear vmas). It's also needed in write mode to avoid new anon_vmas to be associated with existing vmas. A single task can't take more than one mm_take_all_locks() in a row or it would deadlock. mm_take_all_locks() and mm_drop_all_locks are expensive operations that may have to take thousand of locks. mm_take_all_locks() can fail if it's interrupted by signals. When mmu_notifier_register returns, we must be sure that the driver is notified if some task is in the middle of a vmtruncate for the 'mm' where the mmu notifier was registered (mmu_notifier_invalidate_range_start/end is run around the vmtruncation but mmu_notifier_register can run after mmu_notifier_invalidate_range_start and before mmu_notifier_invalidate_range_end). Same problem for rmap paths. And we've to remove page pinning to avoid replicating the tlb_gather logic inside KVM (and GRU doesn't work well with page pinning regardless of needing tlb_gather), so without mm_take_all_locks when vmtruncate frees the page, kvm would have no way to notice that it mapped into sptes a page that is going into the freelist without a chance of any further mmu_notifier notification. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Andrea Arcangeli	6beeac76f5	mmu-notifiers: add list_del_init_rcu() Introduce list_del_init_rcu() and document it. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:20 -07:00
Mike Rapoport	56edb58be1	mfd: add platform_data to mfd_cell Adding platform_data to mfd_cell allows passing of platform data directly to the platform_device created for each cell and thus reuse of existing drivers. On the other side it can be used as a hook to mfd_cell itself removing the need in mfd_get_cell method. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-29 01:23:32 +02:00
Alan Cox	979b1791e5	PCI: add D3 power state avoidance quirk Libata has some hacks to deal with certain controllers going silly in D3 state. The right way to handle this is to keep a PCI device flag for such devices. That can then be generalised for no ATA devices with power problems. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 15:12:11 -07:00
Shaohua Li	149e16372a	PCI: disable ASPM on pre-1.1 PCIe devices Disable ASPM on pre-1.1 PCIe devices, as many of them don't implement it correctly. Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 14:56:57 -07:00
Shaohua Li	5fde244d39	PCI: disable ASPM per ACPI FADT setting The ACPI FADT table includes an ASPM control bit. If the bit is set, do not enable ASPM since it may indicate that the platform doesn't actually support the feature. Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 14:56:09 -07:00
Ingo Molnar	9e3ee1c39c	Merge branch 'linus' into cpus4096 Conflicts: kernel/stop_machine.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 23:32:00 +02:00
Jesse Barnes	29111f579f	Merge branch 'x86/iommu' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus	2008-07-28 14:31:10 -07:00
Linus Torvalds	e56b3bc794	cpu masks: optimize and clean up cpumask_of_cpu() Clean up and optimize cpumask_of_cpu(), by sharing all the zero words. Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns creating a huge array of constant bitmasks, realize that the zero words can be shared. In other words, on a 64-bit architecture, we only ever need 64 of these arrays - with a different bit set in one single world (with enough zero words around it so that we can create any bitmask by just offsetting in that big array). And then we just put enough zeroes around it that we can point every single cpumask to be one of those things. So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each, with one bit set in each array - 2MB memory total), we have exactly 64 arrays instead, each 8k bits in size (64kB total). And then we just point cpumask(n) to the right position (which we can calculate dynamically). Once we have the right arrays, getting "cpumask(n)" ends up being: static inline const cpumask_t get_cpu_mask(unsigned int cpu) { const unsigned long p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG]; p -= cpu / BITS_PER_LONG; return (const cpumask_t *)p; } This brings other advantages and simplifications as well: - we are not wasting memory that is just filled with a single bit in various different places - we don't need all those games to re-create the arrays in some dense format, because they're already going to be dense enough. if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory is a non-issue (especially since by doing this "overlapping" trick we probably get better cache behaviour anyway). [ mingo@elte.hu: Converted Linus's mails into a commit. See: http://lkml.org/lkml/2008/7/27/156 http://lkml.org/lkml/2008/7/28/320 Also applied a family filter - which also has the side-effect of leaving out the bits where Linus calls me an idio... Oh, never mind ;-) ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 22:20:41 +02:00
Ingo Molnar	414f746d23	Merge branch 'linus' into cpus4096	2008-07-28 21:14:43 +02:00
Linus Torvalds	f934fb19ef	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: add driver for Atmel integrated touchscreen controller Input: ads7846 - optimize order of calculating Rt in ads7846_rx() Input: ads7846 - fix sparse endian warnings Input: uinput - remove duplicate include Input: serio - offload resume to kseriod Input: serio - mark serio_register_driver() __must_check	2008-07-28 09:59:26 -07:00
Ben Dooks	7f71ac9374	mfd: Coding style fixes Fix some coding style fixes in the mfd core driver. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-28 18:29:09 +02:00
Linus Torvalds	d9089c296b	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Disable 64K hugetlb support when doing 64K SPU mappings powerpc/powermac: Fixup default serial port device for pmac_zilog powerpc/powermac: Use sane default baudrate for SCC debugging powerpc/mm: Implement _PAGE_SPECIAL & pte_special() for 64-bit powerpc: Show processor cache information in sysfs powerpc: Make core id information available to userspace powerpc: Make core sibling information available to userspace powerpc/vio: More fallout from dma_mapping_error API change ibmveth: Fix multiple errors with dma_mapping_error conversion powerpc/pseries: Fix CMO sysdev attribute API change fallout powerpc: Enable tracehook for the architecture powerpc: Add TIF_NOTIFY_RESUME support for tracehook powerpc: Add asm/syscall.h with the tracehook entry points powerpc: Make syscall tracing use tracehook.h helpers powerpc: Call tracehook_signal_handler() when setting up signal frames powerpc: Update cpu_sibling_maps dynamically powerpc: register_cpu_online should be __cpuinit powerpc: kill useless SMT code in prom_hold_cpus powerpc: Fix 8xx build failure powerpc: Fix vio build warnings ...	2008-07-28 09:05:35 -07:00
Linus Torvalds	b10a8b7238	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (72 commits) sh: SuperH Mobile CEU and camera platform data for AP325RXA sh: Update smc911x platform data for AP325RXA sh: SuperH Mobile LCDC platform data for AP325RXA sh: Add SuperH Mobile CEU platform data for Migo-R sh: Add SuperH Mobile LCDC platform data for Migo-R sh: Move asid_cache() out of ifdef to fix SH-3/4 nommu build. sh: Workaround for __put_user_asm() bug with gcc 4.x on big-endian. sh: Wire up new syscalls. sh: fix uImage Entry Point sh_keysc: remove request_mem_region() and release_mem_region() sh: Don't miss pending signals returning to user mode after signal processing sh: Use clk_always_enable() on sh7366 sh: Use clk_always_enable() on sh7343 / SE77343 sh: Use clk_always_enable() on sh7722 / Migo-R / SE7722 sh: Use clk_always_enable() on sh7723 / ap325rxa sh: Introduce clk_always_enable() function sh: Show all clocks and their state in /proc/clocks sh: Merge sh7343 and sh7722 clock code sh: Add SuperH Mobile MSTPCR bits to clock framework sh: Use arch_flags to simplify sh7722 siu clock code ...	2008-07-28 08:41:13 -07:00
Linus Torvalds	37eaf8c746	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: stop_machine: fix up ftrace.c stop_machine: Wean existing callers off stop_machine_run() stop_machine(): stop_machine_run() changed to use cpu mask Hotplug CPU: don't check cpu_online after take_cpu_down Simplify stop_machine stop_machine: add ALL_CPUS option module: fix build warning with !CONFIG_KALLSYMS	2008-07-28 08:37:46 -07:00
Jeremy Fitzhardinge	d974ae379a	generic, memparse(): constify argument memparse()'s first argument can be const, so it should be. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 15:05:23 +02:00
Adrian McMenamin	306cfd630a	maple: tidy maple_driver code by removing redundant connect/disconnect The connect and disconnect functions are unnecessary - everything they do can be accomplished in the initial probe - so remove them. Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-07-28 18:10:30 +09:00
Barry Naujok	9403540c06	dcache: Add case-insensitive support d_ci_add() routine This add a dcache entry to the dcache for lookup, but changing the name that is associated with the entry rather than the one passed in to the lookup routine. First, it sees if the case-exact match already exists in the dcache and uses it if one exists. Otherwise, it allocates a new node with the new name and splices it into the dcache. Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov. Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:58:39 +10:00
Benjamin Herrenschmidt	d65d830ca0	Merge commit 'gcl/gcl-next'	2008-07-28 16:30:40 +10:00
Rusty Russell	eeec4fad96	stop_machine(): stop_machine_run() changed to use cpu mask Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all cpus), pass a cpumask_t. Allow NULL for the common case (where we don't care which CPU the function is run on): temporary cpumask_t's are usually considered bad for stack space. This deprecates stop_machine_run, to be removed soon when all the callers are dead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-28 12:16:30 +10:00
Rusty Russell	ffdb5976c4	Simplify stop_machine stop_machine creates a kthread which creates kernel threads. We can create those threads directly and simplify things a little. Some care must be taken with CPU hotunplug, which has special needs, but that code seems more robust than it was in the past. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2008-07-28 12:16:29 +10:00
Jason Baron	5c2aed6225	stop_machine: add ALL_CPUS option -allow stop_mahcine_run() to call a function on all cpus. Calling stop_machine_run() with a 'ALL_CPUS' invokes this new behavior. stop_machine_run() proceeds as normal until the calling cpu has invoked 'fn'. Then, we tell all the other cpus to call 'fn'. Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> CC: Adrian Bunk <bunk@stusta.de> CC: Andi Kleen <andi@firstfloor.org> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Christoph Hellwig <hch@infradead.org> CC: mingo@elte.hu CC: akpm@osdl.org	2008-07-28 12:16:28 +10:00
Mauro Carvalho Chehab	c2f90e9536	Merge ../linux-2.6	2008-07-27 22:23:18 -03:00
Andrea Righi	940389b8af	task IO accounting: move all IO statistics in struct task_io_accounting Simplify the code of include/linux/task_io_accounting.h. It is also more reasonable to have all the task i/o-related statistics in a single struct (task_io_accounting). Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 16:12:28 -07:00
Mauro Carvalho Chehab	eb703027ac	Merge ../linux-2.6	2008-07-27 18:11:53 -03:00
Linus Torvalds	837b41b5de	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: state userland requirements in Kconfig help firewire: avoid memleak after phy config transmit failure firewire: fw-ohci: TSB43AB22/A dualbuffer workaround firewire: queue the right number of data firewire: warn on unfinished transactions during card removal firewire: small fw_fill_request cleanup firewire: fully initialize fw_transaction before marking it pending firewire: fix race of bus reset with request transmission	2008-07-27 10:24:06 -07:00
Andrea Righi	5995477ab7	task IO accounting: improve code readability Put all i/o statistics in struct proc_io_accounting and use inline functions to initialize and increment statistics, removing a lot of single variable assignments. This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and CONFIG_TASK_IO_ACCOUNTING=y). text data bss dec hex filename 11651 0 0 11651 2d83 kernel/exit.o.before 11619 0 0 11619 2d63 kernel/exit.o.after 10886 132 136 11154 2b92 kernel/fork.o.before 10758 132 136 11026 2b12 kernel/fork.o.after 3082029 807968 4818600 8708597 84e1f5 vmlinux.o.before 3081869 807968 4818600 8708437 84e155 vmlinux.o.after Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-27 09:58:20 -07:00
Mauro Carvalho Chehab	50cb993ea6	Merge ../linux-2.6	2008-07-27 12:25:57 -03:00
Mauro Carvalho Chehab	9fa0f6db3a	V4L/DVB (8522): videodev2: Fix merge conflict Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 12:24:02 -03:00
Hans Verkuil	c1d7f4f164	V4L/DVB (8524): videodev: copy the VID_TYPE defines to videodev.h The VID_TYPE defines are V4L1 specific, so copy them back to videodev.h. In videodev2.h ensure that they are not used in the kernel (you need to include videodev.h instead) and mark them are deprecated. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:07:12 -03:00
Jean-Francois Moine	1250ac6d4a	V4L/DVB (8518): gspca: Remove the remaining frame decoding functions from the subdrivers. SPCA505 and SPCA508 added in the pixel formats. Decode functions and associated resources removed in spca505, 506 and 508. The decode routines are now found in the V4L library. Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:06:42 -03:00
Mauro Carvalho Chehab	9993e51c0c	V4L/DVB (8502): videodev2.h: CodingStyle cleanups Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-27 11:06:20 -03:00
Linus Torvalds	8be1a6d6c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files mlx4_core: Add VLAN tag field to WQE control segment struct RDMA/nes: CM connection setup/teardown rework IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG IPoIB/cm: Connected mode is no longer EXPERIMENTAL RDMA/ucm: BKL is not needed for ib_ucm_open() RDMA/ucma: BKL is not needed for ucma_open()	2008-07-26 20:40:36 -07:00
Linus Torvalds	9ee08c2df4	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (57 commits) [MTD] [NAND] subpage read feature as a way to increase performance. CPUFREQ: S3C24XX NAND driver frequency scaling support. [MTD][NAND] au1550nd: remove unused variable [MTD] jedec_probe: Fix SST 16-bit chip detection [MTD][MTDPART] Fix a division by zero bug [MTD][MTDPART] Cleanup and document the erase region handling [MTD][MTDPART] Handle most checkpatch findings [MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition [MTD] physmap: resume already suspended chips on failure to suspend [MTD] physmap: Fix suspend/resume/shutdown bugs. [MTD] [NOR] Fix -ETIMEO errors in CFI driver [MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y [JFFS2] Use .unlocked_ioctl [MTD] Fix const assignment in the MTD command line partitioning driver [MTD] [NOR] gen_probe: No debug message when debugging is disabled [MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers [MTD] [MAPS] Remove the bast-flash driver. [MTD] [NAND] fsl_elbc_nand: ecclayout cleanups [MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT [MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips ...	2008-07-26 20:30:56 -07:00
Linus Torvalds	eaf0ba5ef6	Merge branch 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace * 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace: tracehook: comment fixes	2008-07-26 20:29:39 -07:00
Linus Torvalds	bdee6ac7d1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: atmel-mci: debugfs support mmc: Add per-card debugfs support mmc: Export internal host state through debugfs imxmmc: fix crash when no platform data is provided imxmmc: fix platform resources imxmmc: remove DEBUG definition mmc_spi: put signals to low power off fix	2008-07-26 20:27:31 -07:00
Linus Torvalds	4836e30078	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits) [PATCH] fix RLIM_NOFILE handling [PATCH] get rid of corner case in dup3() entirely [PATCH] remove remaining namei_{32,64}.h crap [PATCH] get rid of indirect users of namei.h [PATCH] get rid of __user_path_lookup_open [PATCH] f_count may wrap around [PATCH] dup3 fix [PATCH] don't pass nameidata to __ncp_lookup_validate() [PATCH] don't pass nameidata to gfs2_lookupi() [PATCH] new (local) helper: user_path_parent() [PATCH] sanitize __user_walk_fd() et.al. [PATCH] preparation to __user_walk_fd cleanup [PATCH] kill nameidata passing to permission(), rename to inode_permission() [PATCH] take noexec checks to very few callers that care Re: [PATCH 3/6] vfs: open_exec cleanup [patch 4/4] vfs: immutable inode checking cleanup [patch 3/4] fat: dont call notify_change [patch 2/4] vfs: utimes cleanup [patch 1/4] vfs: utimes: move owner check into inode_change_ok() [PATCH] vfs: use kstrdup() and check failing allocation ...	2008-07-26 20:23:44 -07:00
Linus Torvalds	5c7c204aec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: Add layer1 over IP support Add mISDN HFC multiport driver Add mISDN HFC PCI driver Add mISDN DSP Add mISDN core files Define AF_ISDN and PF_ISDN Add mISDN driver	2008-07-26 20:19:41 -07:00
Linus Torvalds	2284284281	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix ip_rt_frag_needed rt_is_expired netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences netfilter: fix double-free and use-after free netfilter: arptables in netns for real netfilter: ip{,6}tables_security: fix future section mismatch selinux: use nf_register_hooks() netfilter: ebtables: use nf_register_hooks() Revert "pkt_sched: sch_sfq: dump a real number of flows" qeth: use dev->ml_priv instead of dev->priv syncookies: Make sure ECN is disabled net: drop unused BUG_TRAP() net: convert BUG_TRAP to generic WARN_ON drivers/net: convert BUG_TRAP to generic WARN_ON	2008-07-26 20:17:56 -07:00
Andrea Righi	510a35d4a4	hugetlb: remove unused variable warning Remove the following warning when CONFIG_HUGETLB_PAGE is not set: ipc/shm.c: In function `shm_get_stat': ipc/shm.c:565: warning: unused variable `h' [akpm@linux-foundation.org: use tabs, not spaces] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 20:16:47 -07:00
Al Viro	3f8206d496	[PATCH] get rid of indirect users of namei.h fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all. Several places in the tree needed direct include. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:42 -04:00
Al Viro	964bd18362	[PATCH] get rid of __user_path_lookup_open Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:41 -04:00
Al Viro	516e0cc564	[PATCH] f_count may wrap around make it atomic_long_t; while we are at it, get rid of useless checks in affs, hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:40 -04:00
Al Viro	2d8f30380a	[PATCH] sanitize __user_walk_fd() et.al. * do not pass nameidata; struct path is all the callers want. * switch to new helpers: user_path_at(dfd, pathname, flags, &path) user_path(pathname, &path) user_lpath(pathname, &path) user_path_dir(pathname, &path) (fail if not a directory) The last 3 are trivial macro wrappers for the first one. * remove nameidata in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:34 -04:00
Al Viro	f419a2e3b6	[PATCH] kill nameidata passing to permission(), rename to inode_permission() Incidentally, the name that gives hundreds of false positives on grep is not a good idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:31 -04:00
Miklos Szeredi	9767d74957	[patch 1/4] vfs: utimes: move owner check into inode_change_ok() Add a new ia_valid flag: ATTR_TIMES_SET, to handle the UTIMES_OMIT/UTIMES_NOW and UTIMES_NOW/UTIMES_OMIT cases. In these cases neither ATTR_MTIME_SET nor ATTR_ATIME_SET is in the flags, yet the POSIX draft specifies that permission checking is performed the same way as if one or both of the times was explicitly set to a timestamp. See the path "vfs: utimensat(): fix error checking for {UTIME_NOW,UTIME_OMIT} case" by Michael Kerrisk for the patch introducing this behavior. This is a cleanup, as well as allowing filesystems (NFS/fuse/...) to perform their own permission checking instead of the default. CC: Ulrich Drepper <drepper@redhat.com> CC: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:25 -04:00
Li Zefan	88b387824f	[PATCH] vfs: use kstrdup() and check failing allocation - use kstrdup() instead of kmalloc() + memcpy() - return NULL if allocating ->mnt_devname failed - mnt_devname should be const Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:24 -04:00
Al Viro	b77b0646ef	[PATCH] pass MAY_OPEN to vfs_permission() explicitly ... and get rid of the last "let's deduce mask from nameidata->flags" bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:22 -04:00
Al Viro	a110343f0d	[PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess * MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS * MAY_ACCESS on fuse should affect only the last step of pathname resolution * fchdir() and chroot() should pass MAY_ACCESS, for the same reason why chdir() needs that. * now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be removed; it has no business being in nameidata. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:21 -04:00
Al Viro	7f2da1e7d0	[PATCH] kill altroot long overdue... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:20 -04:00
Al Viro	8bb79224b8	[PATCH] permission checks for chdir need special treatment only on the last step ... so we ought to pass MAY_CHDIR to vfs_permission() instead of having it triggered on every step of preceding pathname resolution. LOOKUP_CHDIR is killed by that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:19 -04:00
Miklos Szeredi	db2e747b14	[patch 5/5] vfs: remove mode parameter from vfs_symlink() Remove the unused mode parameter from vfs_symlink and callers. Thanks to Tetsuo Handa for noticing. CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:18 -04:00
Miklos Szeredi	2f1936b877	[patch 3/5] vfs: change remove_suid() to file_remove_suid() All calls to remove_suid() are made with a file pointer, because (similarly to file_update_time) it is called when the file is written. Clean up callers by passing in a file instead of a dentry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:16 -04:00
Al Viro	e6305c43ed	[PATCH] sanitize ->permission() prototype * kill nameidata * argument; map the 3 bits in ->flags anybody cares about to new MAY_... ones and pass with the mask. * kill redundant gfs2_iop_permission() * sanitize ecryptfs_permission() * fix remaining places where ->permission() instances might barf on new MAY_... found in mask. The obvious next target in that direction is permission(9) folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:14 -04:00
Al Viro	9043476f72	[PATCH] sanitize proc_sysctl * keep references to ctl_table_head and ctl_table in /proc/sys inodes * grab the former during operations, use the latter for access to entry if that succeeds * have ->d_compare() check if table should be seen for one who does lookup; that allows us to avoid flipping inodes - if we have the same name resolve to different things, we'll just keep several dentries and ->d_compare() will reject the wrong ones. * have ->lookup() and ->readdir() scan the table of our inode first, then walk all ctl_table_header and scan ->attached_by for those that are attached to our directory. * implement ->getattr(). * get rid of insane amounts of tree-walking * get rid of the need to know dentry in ->permission() and of the contortions induced by that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:12 -04:00
Al Viro	ae7edecc9b	[PATCH] sysctl: keep track of tree relationships In a sense, that's the heart of the series. It's based on the following property of the trees we are actually asked to add: they can be split into stem that is already covered by registered trees and crown that is entirely new. IOW, if a/b and a/c/d are introduced by our tree, then a/c is also introduced by it. That allows to associate tree and table entry with each node in the union; while directory nodes might be covered by many trees, only one will cover the node by its crown. And that will allow much saner logics for /proc/sys in the next patches. This patch introduces the data structures needed to keep track of that. When adding a sysctl table, we find a "parent" one. Which is to say, find the deepest node on its stem that already is present in one of the tables from our table set or its ancestor sets. That table will be our parent and that node in it - attachment point. Add our table to list anchored in parent, have it refer the parent and contents of attachment point. Also remember where its crown lives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:11 -04:00
Al Viro	f7e6ced406	[PATCH] allow delayed freeing of ctl_table_header Refcount the sucker; instead of freeing it by the end of unregistration just drop the refcount and free only when it hits zero. Make sure that we _always_ make ->unregistering non-NULL in start_unregistering(). That allows anybody to get a reference to such puppy, preventing its freeing and reuse. It does not block unregistration. Anybody who holds such a reference can * try to grab a "use" reference (ctl_head_grab()); that will succeeds if and only if it hadn't entered unregistration yet. If it succeeds, we can use it in all normal ways until we release the "use" reference (with ctl_head_finish()). Note that this relies on having ->unregistering become non-NULL in all cases when one starts to unregister the sucker. * keep pointers to ctl_table entries; they can be freed if the entire thing is unregistered. However, if ctl_head_grab() succeeds, we know that unregistration had not happened (and will not happen until ctl_head_finish()) and such pointers can be used safely. IOW, now we can have inodes under /proc/sys keep references to ctl_table entries, protecting them with references to ctl_table_header and grabbing the latter for the duration of operations that require access to ctl_table. That won't cause deadlocks, since unregistration will not be stopped by mere keeping a reference to ctl_table_header. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:09 -04:00
Al Viro	734550921e	[PATCH] beginning of sysctl cleanup - ctl_table_set New object: set of sysctls [currently - root and per-net-ns]. Contains: pointer to parent set, list of tables and "should I see this set?" method (->is_seen(set)). Current lists of tables are subsumed by that; net-ns contains such a beast. ->lookup() for ctl_table_root returns pointer to ctl_table_set instead of that to ->list of that ctl_table_set. [folded compile fixes by rdd for configs without sysctl] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:08 -04:00
Denys Vlasenko	d2d9648ec6	[PATCH] reuse xxx_fifo_fops for xxx_pipe_fops Merge fifo and pipe file_operations. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:06 -04:00
Pekka Enberg	93bc4e89c2	netfilter: fix double-free and use-after free As suggested by Patrick McHardy, introduce a __krealloc() that doesn't free the original buffer to fix a double-free and use-after-free bug introduced by me in netfilter that uses RCU. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Tested-by: Dieter Ries <clip2@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-26 17:49:33 -07:00
Karsten Keil	af69fb3a8f	Add mISDN HFC multiport driver Enable support for cards with Cologne Chip AG's HFC multiport chip. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 02:00:43 +02:00
Karsten Keil	960366cf8d	Add mISDN DSP Enable support for digital audio processing capability. This module may be used for special applications that require cross connecting of bchannels, conferencing, dtmf decoding echo cancelation, tone generation, and Blowfish encryption and decryption. It may use hardware features if available. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:56:38 +02:00
Karsten Keil	1b2b03f8e5	Add mISDN core files Add mISDN core files Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:54:58 +02:00
Karsten Keil	04578dd330	Define AF_ISDN and PF_ISDN Define the address and protocol family value for mISDN. Signed-off-by: Karsten Keil <kkeil@suse.de>	2008-07-27 01:47:00 +02:00
Haavard Skinnemoen	f4b7f927b5	mmc: Add per-card debugfs support For each card successfully added to the bus, create a subdirectory under the host's debugfs root with information about the card. At the moment, only a single file is added to the card directory for all cards: "state". It reflects the "state" field in struct mmc_card, indicating whether the card is present, readonly, etc. For MMC and SD cards (not SDIO), another file is added: "status". Reading this file will ask the card about its current status and return it. This can be useful if the card just refuses to respond to any commands, which might indicate that the card state is not what the MMC core thinks it is (due to a missing stop command, for example.) Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-27 01:26:17 +02:00
Haavard Skinnemoen	6edd8ee60a	mmc: Export internal host state through debugfs When CONFIG_DEBUG_FS is set, create a few files under /sys/kernel/debug containing information about an mmc host's internal state. Currently, just a single file is created, "ios", which contains information about the current operating parameters for the bus (clock speed, bus width, etc.) Host drivers can add additional files and directories under the host's root directory by passing the debugfs_root field in struct mmc_host as the 'parent' parameter to debugfs_create_*. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-27 01:26:16 +02:00
Roland McGrath	a9906a1919	tracehook: comment fixes This fixes some typos and errors in <linux/tracehook.h> comments. No code changes. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:41:26 -07:00
Linus Torvalds	fb3b806144	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels x86, RDC321x: remove gpio.h complications x86, RDC321x: add to mach-default crashdump: fix undefined reference to `elfcorehdr_addr' flag parameters: fix compile error of sys_epoll_create1	2008-07-26 13:25:05 -07:00
Adrian Bunk	9580d85f9c	drivers/char/rtc.c: make 2 functions static The following functions can now become static: - rtc_interrupt() - rtc_get_rtc_time() Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Bernhard Walle <bwalle@suse.de> Acked-by: Paul Gortmaker <p_gortmaker@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	7c363b8c65	mm/swapfile.c: make code static This patch makes the following needlessly global code static: - swap_lock - nr_swapfiles - struct swap_list Signed-off-by: Adrian Bunk <bunk@kernel.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	15f59adae0	make mm/memory.c:print_bad_pte() static This patch makes the needlessly global print_bad_pte() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Adrian Bunk	9d8fddfb17	mm/allocpercpu.c: make 4 functions static This patch makes the following needlessly global functions static: - percpu_depopulate() - __percpu_depopulate_mask() - percpu_populate() - __percpu_populate_mask() Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:12 -07:00
Roland McGrath	bbc698636e	task_current_syscall This adds the new function task_current_syscall() on machines where the asm/syscall.h interface is supported (CONFIG_HAVE_ARCH_TRACEHOOK). It's exported for modules to use in the future. This function safely samples the state of a blocked thread to collect what system call it is blocked in, and the six system call argument registers. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:10 -07:00
Roland McGrath	85ba2d862e	tracehook: wait_task_inactive This extends wait_task_inactive() with a new argument so it can be used in a "soft" mode where it will check for the task changing state unexpectedly and back off. There is no change to existing callers. This lays the groundwork to allow robust, noninvasive tracing that can try to sample a blocked thread but back off safely if it wakes up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	828c365cc8	tracehook: asm/syscall.h This adds asm-generic/syscall.h, which documents what a real asm-ARCH/syscall.h file should define. This is not used yet, but will provide all the machine-dependent details of examining a user system call about to begin, in progress, or just ended. Each arch should add an asm-ARCH/syscall.h that defines all the entry points documented in asm-generic/syscall.h, as short inlines if possible. This lets us write new tracing code that understands user system call registers, without any new arch-specific work. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	64b1208d5b	tracehook: TIF_NOTIFY_RESUME This adds tracehook.h inlines to enable a new arch feature in support of user debugging/tracing. This is not used yet, but it lays the groundwork for a debugger to be able to wrangle a task that's possibly running, without interrupting its syscalls in progress. Each arch should define TIF_NOTIFY_RESUME, and in their entry.S code treat it much like TIF_SIGPENDING. That is, it causes you to take the slow path when returning to user mode, where you get the full user-mode state accessible as for signal handling or ptrace. The arch code should check TIF_NOTIFY_RESUME after handling TIF_SIGPENDING. When it's set, clear it and then call tracehook_notify_resume(). In future, tracing code will call set_notify_resume() when it wants to get a callback in tracehook_notify_resume(). Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	b787f7ba67	tracehook: force signal_pending() This defines a new hook tracehook_force_sigpending() that lets tracing code decide to force TIF_SIGPENDING on in recalc_sigpending(). This is not used yet, so it compiles away to nothing for now. It lays the groundwork for new tracing code that can interrupt a task synthetically without actually sending a signal. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	2b2a1ff64a	tracehook: death This moves the ptrace logic in task death (exit_notify) into tracehook.h inlines. Some code is rearranged slightly to make things nicer. There is no change, only cleanup. There is one hook called with the tasklist_lock write-locked, as ptrace needs. There is also a new hook called after exit_state changes and without locks. This is a better place for tracing work to be in the future, since it doesn't delay the whole system with locking. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	fa00b80b3c	tracehook: job control This defines the tracehook_notify_jctl() hook to formalize the ptrace effects on the job control notifications. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	7bcf6a2ca5	tracehook: get_signal_to_deliver This defines the tracehook_get_signal() hook to allow tracing code to slip in before normal signal dequeuing. This lays the groundwork for new tracing features that can inject synthetic signals outside the normal queue or control the disposition of delivered signals. The calling convention lets tracehook_get_signal() decide both exactly what will happen and what signal number to report in the handler/exit. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	283d7559e7	tracehook: syscall This adds standard tracehook.h inlines for arch code to call when TIF_SYSCALL_TRACE has been set. This replaces having each arch implement the ptrace guts for its syscall tracing support. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	445a91d2fe	tracehook: tracehook_consider_fatal_signal This defines tracehook_consider_fatal_signal() has a fine-grained hook for deciding to skip the special cases for a fatal signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	35de254dc6	tracehook: tracehook_consider_ignored_signal This defines tracehook_consider_ignored_signal() has a fine-grained hook for deciding to prevent the normal short-circuit of sending an ignored signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	c45aea2761	tracehook: tracehook_signal_handler This defines tracehook_signal_handler() as a hook for the arch signal handling code to call. It gives ptrace the opportunity to stop for a pseudo-single-step trap immediately after signal handler setup is done. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	fa8e26ccd4	tracehook: tracehook_expect_breakpoints This adds tracehook_expect_breakpoints() as a formal hook for the nommu code to use for its, "Is text-poking likely?" check at mmap time. This names the actual semantics the code means to test, and documents it. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:09 -07:00
Roland McGrath	0d094efeb1	tracehook: tracehook_tracer_task This adds the tracehook_tracer_task() hook to consolidate all forms of "Who is using ptrace on me?" logic. This is used for "TracerPid:" in /proc and for permission checks. We also clean up the selinux code the called an identical accessor. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	dae33574dc	tracehook: release_task This moves the ptrace-related logic from release_task into tracehook.h and ptrace.h inlines. It provides clean hooks both before and after locking tasklist_lock, for future tracing logic to do more cleanup without the lock. This also changes release_task() itself in the rare "zap_leader" case to set the leader to EXIT_DEAD before iterating. This maintains the invariant that release_task() only ever handles a task in EXIT_DEAD. This is a common-sense invariant that is already always true except in this one arcane case of zombie leader whose parent ignores SIGCHLD. This change is harmless and only costs one store in this one rare case. It keeps the expected state more consisently sane, which is nicer when debugging weirdness in release_task(). It also lets some future code in the tracehook entry points rely on this invariant for bookkeeping. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	daded34be9	tracehook: vfork-done This moves the PTRACE_EVENT_VFORK_DONE tracing into a tracehook.h inline, tracehook_report_vfork_done(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	09a05394fe	tracehook: clone This moves all the ptrace initialization and tracing logic for task creation into tracehook.h and ptrace.h inlines. It reorganizes the code slightly, but should not change any behavior. There are four tracehook entry points, at each important stage of task creation. This keeps the interface from the core fork.c code fairly clean, while supporting the complex setup required for ptrace or something like it. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	30199f5a46	tracehook: exit This moves the PTRACE_EVENT_EXIT tracing into a tracehook.h inline, tracehook_report_exec(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	6341c393fc	tracehook: exec This moves all the ptrace hooks related to exec into tracehook.h inlines. This also lifts the calls for tracing out of the binfmt load_binary hooks into search_binary_handler() after it calls into the binfmt module. This change has no effect, since all the binfmt modules' load_binary functions did the call at the end on success, and now search_binary_handler() does it immediately after return if successful. We consolidate the repeated code, and binfmt modules no longer need to import ptrace_notify(). Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Roland McGrath	88ac2921a7	tracehook: add linux/tracehook.h This patch series introduces the "tracehook" interface layer of inlines in <linux/tracehook.h>. There are more details in the log entry for patch 01/23 and in the header file comments inside that patch. Most of these changes move code around with little or no change, and they should not break anything or change any behavior. This sets a new standard for uniform arch support to enable clean arch-independent implementations of new debugging and tracing stuff, denoted by CONFIG_HAVE_ARCH_TRACEHOOK. Patch 20/23 adds that symbol to arch/Kconfig, with comments listing everything an arch has to do before setting "select HAVE_ARCH_TRACEHOOK". These are elaborted a bit at: http://sourceware.org/systemtap/wiki/utrace/arch/HowTo The new inlines that arch code must define or call have detailed kerneldoc comments in the generic header files that say what is required. No arch is obligated to do any work, and no arch's build should be broken by these changes. There are several steps that each arch should take so it can set HAVE_ARCH_TRACEHOOK. Most of these are simple. Providing this support will let new things people add for doing debugging and tracing of user-level threads "just work" for your arch in the future. For an arch that does not provide HAVE_ARCH_TRACEHOOK, some new options for such features will not be available for config. I have done some arch work and will submit this to the arch maintainers after the generic tracehook series settles in. For now, that work is available in my GIT repositories, and in patch and mbox-of-patches form at http://people.redhat.com/roland/utrace/2.6-current/ This paves the way for my "utrace" work, to be submitted later. But it is not innately tied to that. I hope that the tracehook series can go in soon regardless of what eventually does or doesn't go on top of it. For anyone implementing any kind of new tracing/debugging plan, or just understanding all the context of the existing ptrace implementation, having tracehook.h makes things much easier to find and understand. This patch: This adds the new kernel-internal header file <linux/tracehook.h>. This is not yet used at all. The comments in the header introduce what the following series of patches is about. The aim is to formalize and consolidate all the places that the core kernel code and the arch code now ties into the ptrace implementation. These patches mostly don't cause any functional change. They just move the details of ptrace logic out of core code into tracehook.h inlines, where they are mostly compiled away to the same as before. All that changes is that everything is thoroughly documented and any future reworking of ptrace, or addition of something new, would not have to touch core code all over, just change the tracehook.h inlines. The new linux/ptrace.h inlines are used by the following patches in the new tracehook_*() inlines. Using these helpers for the ptrace event stops makes it simple to change or disable the old ptrace implementation of these stops conditionally later. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Alexey Dobriyan	51cc50685a	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:07 -07:00
Nick Piggin	19fd623127	mm: spinlock tree_lock mapping->tree_lock has no read lockers. convert the lock from an rwlock to a spinlock. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	e286781d5f	mm: speculative page references If we can be sure that elevating the page_count on a pagecache page will pin it, we can speculatively run this operation, and subsequently check to see if we hit the right page rather than relying on holding a lock or otherwise pinning a reference to the page. This can be done if get_page/put_page behaves consistently throughout the whole tree (ie. if we "get" the page after it has been used for something else, we must be able to free it with a put_page). Actually, there is a period where the count behaves differently: when the page is free or if it is a constituent page of a compound page. We need an atomic_inc_not_zero operation to ensure we don't try to grab the page in either case. This patch introduces the core locking protocol to the pagecache (ie. adds page_cache_get_speculative, and tweaks some update-side code to make it work). Thanks to Hugh for pointing out an improvement to the algorithm setting page_count to zero when we have control of all references, in order to hold off speculative getters. [kamezawa.hiroyu@jp.fujitsu.com: fix migration_entry_wait()] [hugh@veritas.com: fix add_to_page_cache] [akpm@linux-foundation.org: repair a comment] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	47feff2c8e	radix-tree: add gang_lookup_slot, gang_lookup_slot_tag Introduce gang_lookup_slot() and gang_lookup_slot_tag() functions, which are used by lockless pagecache. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	21cc199baa	mm: introduce get_user_pages_fast Introduce a new get_user_pages_fast mm API, which is basically a get_user_pages with a less general API (but still tends to be suited to the common case): - task and mm are always current and current->mm - force is always 0 - pages is always non-NULL - don't pass back vmas This restricted API can be implemented in a much more scalable way on many architectures when the ptes are present, by walking the page tables locklessly (no mmap_sem or page table locks). When the ptes are not populated, get_user_pages_fast() could be slower. This is implemented locklessly on x86, and used in some key direct IO call sites, in later patches, which provides nearly 10% performance improvement on a threaded database workload. Lots of other code could use this too, depending on use cases (eg. grep drivers/). And it might inspire some new and clever ways to use it. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:05 -07:00
Huang Weiyi	080ccd4573	include/linux/aio.h: removed duplicated include Removed duplicated include <linux/uio.h> in include/linux/aio.h Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	20d8b67c06	relay: add buffer-only channels; useful for early logging Allows one to create and use a channel with no associated files. Files can be initialized later. This is useful in scenarios such as logging in early code, before VFS is up. Therefore, such channels can be created and used as soon as kmem_cache_init() completed. This is needed by kmemtrace to do tracing in early kernel code. [kosaki.motohiro@jp.fujitsu.com: build fix] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	7babe8db99	Full conversion to early_initcall() interface, remove old interface A previous patch added the early_initcall(), to allow a cleaner hooking of pre-SMP initcalls. Now we remove the older interface, converting all existing users to the new one. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: build fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu	c2147a5092	Better interface for hooking early initcalls Added early initcall (pre-SMP) support, using an identical interface to that of regular initcalls. Functions called from do_pre_smp_initcalls() could be converted to use this cleaner interface. This is required by CPU hotplug, because early users have to register notifiers before going SMP. One such CPU hotplug user is the relay interface with buffer-only channels, which needs to register such a notifier, to be usable in early code. This in turn is used by kmemtrace. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Huang Ying	89081d17f7	kexec jump: save/restore device state This patch implements devices state save/restore before after kexec. This patch together with features in kexec_jump patch can be used for following: - A simple hibernation implementation without ACPI support. You can kexec a hibernating kernel, save the memory image of original system and shutdown the system. When resuming, you restore the memory image of original system via ordinary kexec load then jump back. - Kernel/system debug through making system snapshot. You can make system snapshot, jump back, do some thing and make another system snapshot. - Cooperative multi-kernel/system. With kexec jump, you can switch between several kernels/systems quickly without boot process except the first time. This appears like swap a whole kernel/system out/in. - A general method to call program in physical mode (paging turning off). This can be used to invoke BIOS code under Linux. The following user-space tools can be used with kexec jump: - kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 - makedumpfile with patches are used as memory image saving tool, it can exclude free pages from original kernel memory image file. The patches and the precompiled makedumpfile can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10 - An initramfs image can be used as the root file system of kexeced kernel. An initramfs image built with "BuildRoot" can be downloaded from the following URL: initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz All user space tools above are included in the initramfs image. Usage example of simple hibernation: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_RELOCATABLE=y CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_PM=y CONFIG_HIBERNATION=y CONFIG_KEXEC_JUMP=y 2. Build an initramfs image contains kexec-tool and makedumpfile, or download the pre-built initramfs image, called rootfs.gz in following text. 3. Prepare a partition to save memory image of original kernel, called hibernating partition in following text. 4. Boot kernel compiled in step 1 (kernel A). 5. In the kernel A, load kernel compiled in step 1 (kernel B) with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000 --mem-max=0xffffff --initrd=rootfs.gz 6. Boot the kernel B with following shell command line: /sbin/kexec -e 7. The kernel B will boot as normal kexec. In kernel B the memory image of kernel A can be saved into hibernating partition as follow: jump_back_entry=`cat /proc/cmdline \| tr ' ' '\n' \| grep kexec_jump_back_entry \| cut -d '='` echo $jump_back_entry > kexec_jump_back_entry cp /proc/vmcore dump.elf Then you can shutdown the machine as normal. 8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as root file system. 9. In kernel C, load the memory image of kernel A as follow: /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf 10. Jump back to the kernel A as follow: /sbin/kexec -e Then, kernel A is resumed. Implementation point: To support jumping between two kernels, before jumping to (executing) the new kernel and jumping back to the original kernel, the devices are put into quiescent state, and the state of devices and CPU is saved. After jumping back from kexeced kernel and jumping to the new kernel, the state of devices and CPU are restored accordingly. The devices/CPU state save/restore code of software suspend is called to implement corresponding function. Known issues: - Because the segment number supported by sys_kexec_load is limited, hibernation image with many segments may not be load. This is planned to be eliminated by adding a new flag to sys_kexec_load to make a image can be loaded with multiple sys_kexec_load invoking. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Huang Ying	3ab8352137	kexec jump This patch provides an enhancement to kexec/kdump. It implements the following features: - Backup/restore memory used by the original kernel before/after kexec. - Save/restore CPU state before/after kexec. The features of this patch can be used as a general method to call program in physical mode (paging turning off). This can be used to call BIOS code under Linux. kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 Usage example of calling some physical mode code and return: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_KEXEC=y CONFIG_PM=y CONFIG_KEXEC_JUMP=y 2. Build patched kexec-tool or download the pre-built one. 3. Build some physical mode executable named such as "phy_mode" 4. Boot kernel compiled in step 1. 5. Load physical mode executable with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context --args-none phy_mode 6. Call physical mode executable with following shell command line: /sbin/kexec -e Implementation point: To support jumping without reserving memory. One shadow backup page (source page) is allocated for each page used by kexeced code image (destination page). When do kexec_load, the image of kexeced code is loaded into source pages, and before executing, the destination pages and the source pages are swapped, so the contents of destination pages are backupped. Before jumping to the kexeced code image and after jumping back to the original kernel, the destination pages and the source pages are swapped too. C ABI (calling convention) is used as communication protocol between kernel and called code. A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to indicate that the loaded kernel image is used for jumping back. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Alex Dubov	17017d8d2c	memstick: add "start" and "stop" methods to memstick device In some cases it may be desirable to ensure that associated driver is not going to access the media in some period of time. "start" and "stop" methods are provided therefore to allow it. Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Alex Dubov	b77899985b	memstick: allow "set_param" method to return an error code Some controllers (Jmicron, for instance) can report temporal failure condition during power-on. It is desirable to account for this using a return value of "set_param" device method. The return value can also be handy to distinguish between supported and unsupported device parameters in run time. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Adrian Bunk	929dfb24fb	parport/share.c: proper externs This patch adds proper externs for parport_default_timeslice and parport_default_spintime in include/linux/parport.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Andrew Morton	16d69265b9	uninline arch_pick_mmap_layout() Fix this, on avr32: include/linux/utsname.h:35, from init/main.c:20: include/linux/sched.h: In function 'arch_pick_mmap_layout': include/linux/sched.h:2149: error: implicit declaration of function 'PAGE_ALIGN' Reported-by: Adrian Bunk <bunk@kernel.org> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:01 -07:00
Mauro Carvalho Chehab	fdd2a7e2da	V4L/DVB (8500a): videotext.h: whitespace cleanup Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-26 13:25:25 -03:00
Mike Travis	b8d317d10c	cpumask: make cpumask_of_cpu_map generic If an arch doesn't define cpumask_of_cpu_map, create a generic statically-initialized one for them. This allows removal of the buggy cpumask_of_cpu() macro (&cpumask_of_cpu() gives address of out-of-scope var). An arch with NR_CPUS of 4096 probably wants to allocate this itself based on the actual number of CPUs, since otherwise they're using 2MB of rodata (1024 cpus means 128k). That's what CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the moment). In future as we support more CPUs, we'll need to resort to a get_cpu_map()/put_cpu_map() allocation scheme. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:40:32 +02:00
Ingo Molnar	10d3285d0b	Merge branch 'x86/urgent' into x86/core Conflicts: include/asm-x86/gpio.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:30:19 +02:00
Ingo Molnar	6dec3a10a7	Merge branch 'x86/x2apic' into x86/core Conflicts: include/asm-x86/i8259.h include/asm-x86/msidef.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:29:23 +02:00
Yinghai Lu	0af36739af	usb: move ehci reg def prepare x86: usb debug port early console move ehci struct def to linux/usrb/ehci_def.h from host/ehci.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: "Arjan van de Ven" <arjan@infradead.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Greg KH" <greg@kroah.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:17:01 +02:00
Joerg Roedel	3bc9f79ee1	iommu: add iommu_num_pages helper function Calculating the number of pages from given address and length numbers is a task required in multiple IOMMU implementations. So implement this as a generic function into the IOMMU helper code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: iommu@lists.linux-foundation.org Cc: bhavna.sarathy@amd.com Cc: robert.richter@amd.com Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:43:05 +02:00
Robert Richter	ee648bc77f	OProfile: add IBS code macros Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:48:04 +02:00
Robert Richter	021f8b75e7	x86: add PCI IDs for AMD Barcelona PCI devices Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:47:59 +02:00
Ingo Molnar	36ac26171a	crashdump: fix undefined reference to `elfcorehdr_addr' fix build bug introduced by `95b68dec0d` "calgary iommu: use the first kernels TCE tables in kdump": arch/x86/kernel/built-in.o: In function `calgary_iommu_init': (.init.text+0x8399): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `calgary_iommu_init': (.init.text+0x856c): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `detect_calgary': (.init.text+0x8c68): undefined reference to `elfcorehdr_addr' arch/x86/kernel/built-in.o: In function `detect_calgary': (.init.text+0x8d0c): undefined reference to `elfcorehdr_addr' make elfcorehdr_addr a generally available symbol. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 11:26:23 +02:00
Ilpo Järvinen	ec34c702ca	net: drop unused BUG_TRAP() Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 21:45:49 -07:00
Grant Likely	284b018973	spi: Add OF binding support for SPI busses This patch adds support for populating an SPI bus based on data in the OF device tree. This is useful for powerpc platforms which use the device tree instead of discrete code for describing platform layout. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2008-07-25 22:34:40 -04:00
Grant Likely	dc87c98e8f	spi: split up spi_new_device() to allow two stage registration. spi_new_device() allocates and registers an spi device all in one swoop. If the driver needs to add extra data to the spi_device before it is registered, then this causes problems. This is needed for OF device tree support so that the SPI device tree helper can add a pointer to the device node after the device is allocated, but before the device is registered. OF aware SPI devices can then retrieve data out of the device node to populate a platform data structure. This patch splits the allocation and registration portions of code out of spi_new_device() and creates two new functions; spi_alloc_device() and spi_register_device(). spi_new_device() is modified to use the new functions for allocation and registration. None of the existing users of spi_new_device() should be affected by this change. Drivers using the new API can forego the use of spi_board_info structure to describe the device layout and populate data into the spi_device structure directly. This change is in preparation for adding an OF device tree parser to generate spi_devices based on data in the device tree. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David Brownell <dbrownell@users.sourceforge.net>	2008-07-25 22:34:29 -04:00
Grant Likely	3f07af494d	of: adapt of_find_i2c_driver() to be usable by SPI also SPI has a similar problem as I2C in that it needs to determine an appropriate modalias value for each device node. This patch adapts the of_i2c of_find_i2c_driver() function to be usable by of_spi also. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2008-07-25 22:25:13 -04:00
Harvey Harrison	b4615e69b6	sys_paccept definition missing __user annotation Introduced by commit `aaca0bdca5` ("flag parameters: paccept"): net/socket.c:1515:17: error: symbol 'sys_paccept' redeclared with different type (originally declared at include/linux/syscalls.h:413) - incompatible argument 4 (different address spaces) Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 17:28:49 -07:00
Linus Torvalds	ff5d48a6d1	Merge git://git.infradead.org/embedded-2.6 * git://git.infradead.org/embedded-2.6: Make console charset translation optional	2008-07-25 12:02:08 -07:00
Linus Torvalds	762b8291be	Merge git://git.infradead.org/~dwmw2/random-2.6 * git://git.infradead.org/~dwmw2/random-2.6: remove dummy asm/kvm.h files firmware: create firmware binaries during 'make modules'.	2008-07-25 12:01:37 -07:00
Johannes Weiner	c6af5e9f8a	bootmem: Move node allocation macros back to !HAVE_ARCH_BOOTMEM_NODE These got unintentionally moved, put them back as x86 provides its own versions. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 11:36:44 -07:00
Adrian Bunk	7dcf2a9fce	remove dummy asm/kvm.h files This patch removes the dummy asm/kvm.h files on architectures not (yet) supporting KVM and uses the same conditional headers installation as already used for a.out.h . Also removed are superfluous install rules in the s390 and x86 Kbuild files (they are already in Kbuild.asm). Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-25 14:35:50 -04:00
Linus Torvalds	5047887caf	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc: Wireup new syscalls Move update_mmu_cache() declaration from tlbflush.h to pgtable.h powerpc/pseries: Remove kmalloc call in handling writes to lparcfg powerpc/pseries: Update arch vector to indicate support for CMO ibmvfc: Add support for collaborative memory overcommit ibmvscsi: driver enablement for CMO ibmveth: enable driver for CMO ibmveth: Automatically enable larger rx buffer pools for larger mtu powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O powerpc/pseries: vio bus support for CMO powerpc/pseries: iommu enablement for CMO powerpc/pseries: Add CMO paging statistics powerpc/pseries: Add collaborative memory manager powerpc/pseries: Utilities to set firmware page state powerpc/pseries: Enable CMO feature during platform setup powerpc/pseries: Split retrieval of processor entitlement data into a helper routine powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg powerpc: Fix compile error with binutils 2.15 ... Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually.	2008-07-25 11:08:17 -07:00
Linus Torvalds	996abf053e	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6: (22 commits) UBI: always start the background thread UBI: fix gcc warning UBI: remove pre-sqnum images support UBI: fix kernel-doc errors and warnings UBI: fix checkpatch.pl errors and warnings UBI: bugfix - do not torture PEB needlessly UBI: rework scrubbing messages UBI: implement multiple volumes rename UBI: fix and re-work debugging stuff UBI: amend commentaries UBI: fix error message UBI: improve mkvol request validation UBI: add ubi_sync() interface UBI: fix 64-bit calculations UBI: fix LEB locking UBI: fix memory leak on error path UBI: do not forget to free internal volumes UBI: fix memory leak UBI: avoid unnecessary division operations UBI: fix buffer padding ...	2008-07-25 11:02:17 -07:00
Arthur Jones	8f421c595a	edac: i5100 new intel chipset driver Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters. Reasons why this is preliminary: 1) This chip has 2 independent memory controllers which, for best perforance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers. 2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, esp since it doesn't work yet... 3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of. 4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode. 5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	33670fa296	fuse: nfs export special lookups Implement the get_parent export operation by sending a LOOKUP request with ".." as the name. Implement looking up an inode by node ID after it has been evicted from the cache. This is done by seding a LOOKUP request with "." as the name (for all file types, not just directories). The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups. Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	bde74e4bc6	locks: add special return value for asynchronous locks Use a special error value FILE_LOCK_DEFERRED to mean that a locking operation returned asynchronously. This is returned by posix_lock_file() for sleeping locks to mean that the lock has been queued on the block list, and will be woken up when it might become available and needs to be retried (either fl_lmops->fl_notify() is called or fl_wait is woken up). f_op->lock() to mean either the above, or that the filesystem will call back with fl_lmops->fl_grant() when the result of the locking operation is known. The filesystem can do this for sleeping as well as non-sleeping locks. This is to make sure, that return values of -EAGAIN and -EINPROGRESS by filesystems are not mistaken to mean an asynchronous locking. This also makes error handling in fs/locks.c and lockd/svclock.c slightly cleaner. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Keika Kobayashi	016ae219b9	per-task-delay-accounting: update taskstats for memory reclaim delay Add members for memory reclaim delay to taskstats, and accumulate them in __delayacct_add_tsk() . Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp> Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Keika Kobayashi	873b477177	per-task-delay-accounting: add memory reclaim delay Sometimes, application responses become bad under heavy memory load. Applications take a bit time to reclaim memory. The statistics, how long memory reclaim takes, will be useful to measure memory usage. This patch adds accounting memory reclaim to per-task-delay-accounting for accounting the time of do_try_to_free_pages(). <i.e> - When System is under low memory load, memory reclaim may not occur. $ free total used free shared buffers cached Mem: 8197800 1577300 6620500 0 4808 1516724 -/+ buffers/cache: 55768 `8142032` Swap: 16386292 0 16386292 $ vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 5069748 10612 3014060 0 0 0 0 3 26 0 0 100 0 0 0 0 5069748 10612 3014060 0 0 0 0 4 22 0 0 100 0 0 0 0 5069748 10612 3014060 0 0 0 0 3 18 0 0 100 0 Measure the time of tar command. $ ls -s test.dat 1501472 test.dat $ time tar cvf test.tar test.dat real 0m13.388s user 0m0.116s sys 0m5.304s $ ./delayget -d -p <pid> CPU count real total virtual total delay total 428 5528345500 5477116080 62749891 IO count delay total 338 8078977189 SWAP count delay total 0 0 RECLAIM count delay total 0 0 - When system is under heavy memory load memory reclaim may occur. $ vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 7159032 49724 1812 3012 0 0 0 0 3 24 0 0 100 0 0 0 7159032 49724 1812 3012 0 0 0 0 4 24 0 0 100 0 0 0 7159032 49848 1812 3012 0 0 0 0 3 22 0 0 100 0 In this case, one process uses more 8G memory by execution of malloc() and memset(). $ time tar cvf test.tar test.dat real 1m38.563s <- increased by 85 sec user 0m0.140s sys 0m7.060s $ ./delayget -d -p <pid> CPU count real total virtual total delay total 9021 7140446250 7315277975 923201824 IO count delay total 8965 90466349669 SWAP count delay total 3 21036367 RECLAIM count delay total 740 61011951153 In the later case, the value of RECLAIM is increasing. So, taskstats can show how much memory reclaim influences TAT. Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujistu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Andrea Righi	297c5d9263	task IO accounting: provide distinct tgid/tid I/O statistics Report per-thread I/O statistics in /proc/pid/task/tid/io and aggregate parent I/O statistics in /proc/pid/io. This approach follows the same model used to account per-process and per-thread CPU times. As a practial application, this allows for example to quickly find the top I/O consumer when a process spawns many child threads that perform the actual I/O work, because the aggregated I/O statistics can always be found in /proc/pid/io. [ Oleg Nesterov points out that we should check that the task is still alive before we iterate over the threads, but also says that we can do that fixup on top of this later. - Linus ] Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: Matt Heaton <matt@hostmonster.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Acked-by-with-comments: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Pavel Emelyanov	0b6b030fc3	bsdacct: switch from global bsd_acct_struct instance to per-pidns one Allocate the structure on the first call to sys_acct(). After this each namespace, that ordered the accounting, will live with this structure till its own death. Two notes - routines, that close the accounting on fs umount time use the init_pid_ns's acct by now; - accounting routine accounts to dying task's namespace (also by now). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:47 -07:00
Pavel Emelyanov	20fad13ac6	pidns: add the struct bsd_acct_struct pointer on pid_namespace struct All the bsdacct-related info will be stored in the area, pointer by this one. It will be NULL automatically for all new namespaces. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:46 -07:00
Jonathan Lim	49b5cf3472	accounting: account for user time when updating memory integrals Adapt acct_update_integrals() to include user time when calculating the time difference. The units of acct_rss_mem1 and acct_vm_mem1 are also changed from pages-jiffies to pages-usecs to avoid calling jiffies_to_usecs() in xacct_add_tsk() which might overflow. Signed-off-by: Jonathan Lim <jlim@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:46 -07:00
Pavel Emelyanov	dbda0de526	pidns: remove find_task_by_pid, unused for a long time It seems to me that it was a mistake marking this function as deprecated and scheduling it for removal, rather than resolutely removing it after the last caller's death. Anyway - better late, then never. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Pavel Emelyanov	e49859e71e	pidns: remove now unused find_pid function. This one had the only users so far - the kill_proc, which is removed, so drop this (invalid in namespaced world) call too. And of course - erase all references on it from comments. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Pavel Emelyanov	19b0cfcca4	pidns: remove now unused kill_proc function This function operated on a pid_t to kill a task, which is no longer valid in a containerized system. It has finally lost all its users and we can safely remove it from the tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Richard Kennedy	33166b1ffc	shrink struct pid by removing padding on 64 bit builds When struct pid is built on a 64 bit platform gcc has to insert padding to maintain the correct alignment, by simply reordering its members the memory usage shrinks from 88 bytes to 80. I've successfully run with this patch on my desktop AMD64 machine. There are no significant kernel size changes to a default config.X86_64 on the latest git v2.6.26-rc1 text data bss dec hex filename 5404828 976760 734280 7115868 6c945c vmlinux 5404811 976760 734280 7115851 6c944b vmlinux.pid-patch Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Adrian Bunk	3ae4eed34b	proper pid{hash,map}_init() prototypes This patch adds proper prototypes for pid{hash,map}_init() in include/linux/pid_namespace.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:45 -07:00
Alexey Dobriyan	881adb8535	proc: always do ->release Current two-stage scheme of removing PDE emphasizes one bug in proc: open rmmod remove_proc_entry close ->release won't be called because ->proc_fops were cleared. In simple cases it's small memory leak. For every ->open, ->release has to be done. List of openers is introduced which is traversed at remove_proc_entry() if neeeded. Discussions with Al long ago (sigh). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Adrian Bunk	6e644c3126	move proc_kmsg_operations to fs/proc/internal.h This patch moves the extern of struct proc_kmsg_operations to fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c so that the latter sees the former. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Abdel Benamrouche	d805dda412	fs/partition/check.c: fix return value warning fs/partitions/check.c:381: warning: ignoring return value of ___device_add___, declared with attribute warn_unused_result [akpm@linux-foundation.org: multiple-return-statements-per-function are evil] Signed-off-by: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:44 -07:00
Nadia Derbey	9eefe520c8	ipc: do not use a negative value to re-enable msgmni automatic recomputing This patch proposes an alternative to the "magical positive-versus-negative number trick" Andrew complained about last week in http://lkml.org/lkml/2008/6/24/418. This had been introduced with the patches that scale msgmni to the amount of lowmem. With these patches, msgmni has a registered notification routine that recomputes msgmni value upon memory add/remove or ipc namespace creation/ removal. When msgmni is changed from user space (i.e. value written to the proc file), that notification routine is unregistered, and the way to make it registered back is to write a negative value into the proc file. This is the "magical positive-versus-negative number trick". To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni. This file acts as ON/OFF for msgmni automatic recomputing. With this patch, the process is the following: 1) kernel boots in "automatic recomputing mode" /proc/sys/kernel/msgmni contains the value that has been computed (depends on lowmem) /proc/sys/kernel/automatic_msgmni contains "1" 2) echo <val> > /proc/sys/kernel/msgmni . sets msg_ctlmni to <val> . de-activates automatic recomputing (i.e. if, say, some memory is added msgmni won't be recomputed anymore) . /proc/sys/kernel/automatic_msgmni now contains "0" 3) echo "0" > /proc/sys/kernel/automatic_msgmni . de-activates msgmni automatic recomputing this has the same effect as 2) except that msg_ctlmni's value stays blocked at its current value) 3) echo "1" > /proc/sys/kernel/automatic_msgmni . recomputes msgmni's value based on the current available memory size and number of ipc namespaces . re-activates automatic recomputing for msgmni. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	380af1b33b	ipc/sem.c: rewrite undo list locking The attached patch: - reverses the locking order of ulp->lock and sem_lock: Previously, it was first ulp->lock, then inside sem_lock. Now it's the other way around. - converts the undo structure to rcu. Benefits: - With the old locking order, IPC_RMID could not kfree the undo structures. The stale entries remained in the linked lists and were released later. - The patch fixes a a race in semtimedop(): if both IPC_RMID and a semget() that recreates exactly the same id happen between find_alloc_undo() and sem_lock, then semtimedop() would access already kfree'd memory. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	a1193f8ec0	ipc/sem.c: convert sem_array.sem_pending to struct list_head sem_array.sem_pending is a double linked list, the attached patch converts it to struct list_head. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	2c0c29d414	ipc/sem.c: remove unused entries from struct sem_queue sem_queue.sma and sem_queue.id were never used, the attached patch removes them. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Manfred Spraul	4daa28f6d8	ipc/sem.c: convert undo structures to struct list_head The undo structures contain two linked lists, the attached patch replaces them with generic struct list_head lists. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Nadia Derbey	f9c46d6ea5	idr: make idr_find rcu-safe Make idr_find rcu-safe: it can now be called inside an rcu_read critical section. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:42 -07:00
Nadia Derbey	944ca05c7b	idr: error checking factorization Do some code factorization in the return code analysis. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Nadia Derbey	2027d1abc2	idr: change the idr structure After scalability problems have been detected when using the sysV ipcs, I have proposed to use an RCU based implementation of the IDR api instead (see threads http://lkml.org/lkml/2008/4/11/212 and http://lkml.org/lkml/2008/4/29/295). This resulted in many people asking to convert the idr API and make it rcu safe (because most of the code was duplicated and thus unmaintanable and unreviewable). So here is a first attempt. The important change wrt to the idr API itself is during idr removes: idr layers are freed after a grace period, instead of being moved to the free list. The important change wrt to ipcs, is that idr_find() can now be called locklessly inside a rcu read critical section. Here are the results I've got for the pmsg test sent by Manfred: 2.6.25-rc3-mm1 2.6.25-rc3-mm1+ 2.6.25-mm1 Patched 2.6.25-mm1 1 1168441 1064021 876000 947488 2 1094264 921059 1549592 1730685 3 2082520 1738165 1694370 2324880 4 2079929 1695521 404553 2400408 5 2898758 406566 391283 3246580 6 2921417 261275 263249 3752148 7 3308761 126056 191742 4243142 8 `3329456` 100129 141722 4275780 1st column: stock 2.6.25-rc3-mm1 2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs) 3nd column: stock 2.6.25-mm1 4th column: 2.6.25-mm1 + this pacth series. This patch: Add an rcu_head to the idr_layer structure in order to free it after a grace period. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Chandru	95b68dec0d	calgary iommu: use the first kernels TCE tables in kdump kdump kernel fails to boot with calgary iommu and aacraid driver on a x366 box. The ongoing dma's of aacraid from the first kernel continue to exist until the driver is loaded in the kdump kernel. Calgary is initialized prior to aacraid and creation of new tce tables causes wrong dma's to occur. Here we try to get the tce tables of the first kernel in kdump kernel and use them. While in the kdump kernel we do not allocate new tce tables but instead read the base address register contents of calgary iommu and use the tables that the registers point to. With these changes the kdump kernel and hence aacraid now boots normally. Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:41 -07:00
Oleg Nesterov	3da1c84c00	workqueues: make get_online_cpus() useable for work->func() workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under cpu_maps_update_begin(). This means that the multithreaded workqueues can't use get_online_cpus() due to the possible deadlock, very bad and very old problem. Introduce the new state, CPU_POST_DEAD, which is called after cpu_hotplug_done() but before cpu_maps_update_done(). Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD. This means that create/destroy functions can't rely on get_online_cpus() any longer and should take cpu_add_remove_lock instead. [akpm@linux-foundation.org: fix CONFIG_SMP=n] Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Paul Menage <menage@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	db70089722	workqueues: implement flush_work() Most of users of flush_workqueue() can be changed to use cancel_work_sync(), but sometimes we really need to wait for the completion and cancelling is not an option. schedule_on_each_cpu() is good example. Add the new helper, flush_work(work), which waits for the completion of the specific work_struct. More precisely, it "flushes" the result of of the last queue_work() which is visible to the caller. For example, this code queue_work(wq, work); /* WINDOW / queue_work(wq, work); flush_work(work); doesn't necessary work "as expected". What can happen in the WINDOW above is - wq starts the execution of work->func() - the caller migrates to another CPU now, after the 2nd queue_work() this work is active on the previous CPU, and at the same time it is queued on another. In this case flush_work(work) may return before the first work->func() completes. It is trivial to add another helper int flush_work_sync(struct work_struct work) { return flush_work(work) \|\| wait_on_work(work); } which works "more correctly", but it has to iterate over all CPUs and thus it much slower than flush_work(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Acked-by: Jarek Poplawski <jarkao2@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	a94e2d408e	coredump: kill mm->core_done Now that we have core_state->dumper list we can use it to wake up the sub-threads waiting for the coredump completion. This uglifies the code and .text grows by 47 bytes, but otoh mm_struct lessens by sizeof(struct completion). Also, with this change we can decouple exit_mm() from the coredumping code. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	b564daf806	coredump: construct the list of coredumping threads at startup time binfmt->core_dump() has to iterate over the all threads in system in order to find the coredumping threads and construct the list using the GFP_ATOMIC allocations. With this patch each thread allocates the list node on exit_mm()'s stack and adds itself to the list. This allows us to do further changes: - simplify ->core_dump() - change exit_mm() to clear ->mm first, then wait for ->core_done. this makes the coredumping process visible to oom_kill - kill mm->core_done Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:40 -07:00
Oleg Nesterov	c5f1cc8c18	coredump: turn core_state->nr_threads into atomic_t Turn core_state->nr_threads into atomic_t and kill now unneeded down_write(&mm->mmap_sem) in exit_mm(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	999d9fc167	coredump: move mm->core_waiters into struct core_state Move mm->core_waiters into "struct core_state" allocated on stack. This shrinks mm_struct a little bit and allows further changes. This patch mostly does s/core_waiters/core_state. The only essential change is that coredump_wait() must clear mm->core_state before return. The coredump_wait()'s path is uglified and .text grows by 30 bytes, this is fixed by the next patch. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	32ecb1f26d	coredump: turn mm->core_startup_done into the pointer to struct core_state mm->core_startup_done points to "struct completion startup_done" allocated on the coredump_wait()'s stack. Introduce the new structure, core_state, which holds this "struct completion". This way we can add more info visible to the threads participating in coredump without enlarging mm_struct. No changes in affected .o files. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	246bb0b1de	kill PF_BORROWED_MM in favour of PF_KTHREAD Kill PF_BORROWED_MM. Change use_mm/unuse_mm to not play with ->flags, and do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users. No functional changes yet. But this allows us to do further fixes/cleanups. oom_kill/ptrace/etc often check "p->mm != NULL" to filter out the kthreads, this is wrong because of use_mm(). The problem with PF_BORROWED_MM is that we need task_lock() to avoid races. With this patch we can check PF_KTHREAD directly, or use a simple lockless helper: /* The result must not be dereferenced !!! / struct mm_struct __get_task_mm(struct task_struct *tsk) { if (tsk->flags & PF_KTHREAD) return NULL; return tsk->mm; } Note also ecard_task(). It runs with ->mm != NULL, but it's the kernel thread without PF_BORROWED_MM. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	7b34e4283c	introduce PF_KTHREAD flag Introduce the new PF_KTHREAD flag to mark the kernel threads. It is set by INIT_TASK() and copied to the forked childs (we could set it in kthreadd() along with PF_NOFREEZE instead). daemonize() was changed as well. In that case testing of PF_KTHREAD is racy, but daemonize() is hopeless anyway. This flag is cleared in do_execve(), before search_binary_handler(). Probably not the best place, we can do this in exec_mmap() or in start_thread(), or clear it along with PF_FORKNOEXEC. But I think this doesn't matter in practice, and if do_execve() fails kthread should die soon. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
Oleg Nesterov	364d3c13c1	ptrace: give more respect to SIGKILL ptrace_stop() has some complicated checks to prevent the scheduling in the TASK_TRACED state with the pending SIGKILL, but these checks are racy, and they depend on arch_ptrace_stop_needed(). This patch assumes that the traced task should die asap if it was killed by SIGKILL, in that case schedule()->signal_pending_state() has no reason to ignore the TASK_WAKEKILL part of TASK_TRACED, and we can kill this nasty special case. Note: do_exit()->ptrace_notify() is special, the killed task can already dequeue SIGKILL at this point. Another indication that fatal_signal_pending() is not exactly right. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:39 -07:00
KAMEZAWA Hiroyuki	12b9804419	res_counter: limit change support ebusy Add an interface to set limit. This is necessary to memory resource controller because it shrinks usage at set limit. Other controllers may not need this interface to shrink usage because shrinking is not necessary or impossible. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	c9b0ed5148	memcg: helper function for relcaim from shmem. A new call, mem_cgroup_shrink_usage() is added for shmem handling and relacing non-standard usage of mem_cgroup_charge/uncharge. Now, shmem calls mem_cgroup_charge() just for reclaim some pages from mem_cgroup. In general, shmem is used by some process group and not for global resource (like file caches). So, it's reasonable to reclaim pages from mem_cgroup where shmem is mainly used. [hugh@veritas.com: shmem_getpage release page sooner] [hugh@veritas.com: mem_cgroup_shrink_usage css_put] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	69029cd550	memcg: remove refcnt from page_cgroup memcg: performance improvements Patch Description 1/5 ... remove refcnt fron page_cgroup patch (shmem handling is fixed) 2/5 ... swapcache handling patch 3/5 ... add helper function for shmem's memory reclaim patch 4/5 ... optimize by likely/unlikely ppatch 5/5 ... remove redundunt check patch (shmem handling is fixed.) Unix bench result. == 2.6.26-rc2-mm1 + memory resource controller Execl Throughput 2915.4 lps (29.6 secs, 3 samples) C Compiler Throughput 1019.3 lpm (60.0 secs, 3 samples) Shell Scripts (1 concurrent) 5796.0 lpm (60.0 secs, 3 samples) Shell Scripts (8 concurrent) 1097.7 lpm (60.0 secs, 3 samples) Shell Scripts (16 concurrent) 565.3 lpm (60.0 secs, 3 samples) File Read 1024 bufsize 2000 maxblocks 1022128.0 KBps (30.0 secs, 3 samples) File Write 1024 bufsize 2000 maxblocks 544057.0 KBps (30.0 secs, 3 samples) File Copy 1024 bufsize 2000 maxblocks 346481.0 KBps (30.0 secs, 3 samples) File Read 256 bufsize 500 maxblocks 319325.0 KBps (30.0 secs, 3 samples) File Write 256 bufsize 500 maxblocks 148788.0 KBps (30.0 secs, 3 samples) File Copy 256 bufsize 500 maxblocks 99051.0 KBps (30.0 secs, 3 samples) File Read 4096 bufsize 8000 maxblocks 2058917.0 KBps (30.0 secs, 3 samples) File Write 4096 bufsize 8000 maxblocks 1606109.0 KBps (30.0 secs, 3 samples) File Copy 4096 bufsize 8000 maxblocks 854789.0 KBps (30.0 secs, 3 samples) Dc: sqrt(2) to 99 decimal places 126145.2 lpm (30.0 secs, 3 samples) INDEX VALUES TEST BASELINE RESULT INDEX Execl Throughput 43.0 2915.4 678.0 File Copy 1024 bufsize 2000 maxblocks 3960.0 346481.0 875.0 File Copy 256 bufsize 500 maxblocks 1655.0 99051.0 598.5 File Copy 4096 bufsize 8000 maxblocks 5800.0 854789.0 1473.8 Shell Scripts (8 concurrent) 6.0 1097.7 1829.5 ========= FINAL SCORE 991.3 == 2.6.26-rc2-mm1 + this set == Execl Throughput 3012.9 lps (29.9 secs, 3 samples) C Compiler Throughput 981.0 lpm (60.0 secs, 3 samples) Shell Scripts (1 concurrent) 5872.0 lpm (60.0 secs, 3 samples) Shell Scripts (8 concurrent) 1120.3 lpm (60.0 secs, 3 samples) Shell Scripts (16 concurrent) 578.0 lpm (60.0 secs, 3 samples) File Read 1024 bufsize 2000 maxblocks 1003993.0 KBps (30.0 secs, 3 samples) File Write 1024 bufsize 2000 maxblocks 550452.0 KBps (30.0 secs, 3 samples) File Copy 1024 bufsize 2000 maxblocks 347159.0 KBps (30.0 secs, 3 samples) File Read 256 bufsize 500 maxblocks 314644.0 KBps (30.0 secs, 3 samples) File Write 256 bufsize 500 maxblocks 151852.0 KBps (30.0 secs, 3 samples) File Copy 256 bufsize 500 maxblocks 101000.0 KBps (30.0 secs, 3 samples) File Read 4096 bufsize 8000 maxblocks 2033256.0 KBps (30.0 secs, 3 samples) File Write 4096 bufsize 8000 maxblocks 1611814.0 KBps (30.0 secs, 3 samples) File Copy 4096 bufsize 8000 maxblocks 847979.0 KBps (30.0 secs, 3 samples) Dc: sqrt(2) to 99 decimal places 128148.7 lpm (30.0 secs, 3 samples) INDEX VALUES TEST BASELINE RESULT INDEX Execl Throughput 43.0 3012.9 700.7 File Copy 1024 bufsize 2000 maxblocks 3960.0 347159.0 876.7 File Copy 256 bufsize 500 maxblocks 1655.0 101000.0 610.3 File Copy 4096 bufsize 8000 maxblocks 5800.0 847979.0 1462.0 Shell Scripts (8 concurrent) 6.0 1120.3 1867.2 ========= FINAL SCORE 1004.6 This patch: Remove refcnt from page_cgroup(). After this, * A page is charged only when !page_mapped() && no page_cgroup is assigned. * Anon page is newly mapped. * File page is added to mapping->tree. * A page is uncharged only when * Anon page is fully unmapped. * File page is removed from LRU. There is no change in behavior from user's view. This patch also removes unnecessary calls in rmap.c which was used only for refcnt mangement. [akpm@linux-foundation.org: fix warning] [hugh@veritas.com: fix shmem_unuse_inode charging] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki	e8589cc189	memcg: better migration handling This patch changes page migration under memory controller to use a different algorithm. (thanks to Christoph for new idea.) Before: - page_cgroup is migrated from an old page to a new page. After: - a new page is accounted , no reuse of page_cgroup. Pros: - We can avoid compliated lock depndencies and races in migration. Cons: - new param to mem_cgroup_charge_common(). - mem_cgroup_getref() is added for handling ref_cnt ping-pong. This version simplifies complicated lock dependency in page migraiton under memory resource controller. new refcnt sequence is following. a mapped page: prepage_migration() ..... +1 to NEW page try_to_unmap() ..... all refs to OLD page is gone. move_pages() ..... +1 to NEW page if page cache. remap... ..... all refs from map is added to NEW one. end_migration() ..... -1 to New page. page's mapcount + (page_is_cache) refs are added to NEW one. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
Serge E. Hallyn	e885dcde75	cgroup_clone: use pid of newly created task for new cgroup cgroup_clone creates a new cgroup with the pid of the task. This works correctly for unshare, but for clone cgroup_clone is called from copy_namespaces inside copy_process, which happens before the new pid is created. As a result, the new cgroup was created with current's pid. This patch: 1. Moves the call inside copy_process to after the new pid is created 2. Passes the struct pid into ns_cgroup_clone (as it is not yet attached to the task) 3. Passes a name from ns_cgroup_clone() into cgroup_clone() so as to keep cgroup_clone() itself simpler 4. Uses pid_vnr() to get the process id value, so that the pid used to name the new cgroup is always the pid as it would be known to the task which did the cloning or unsharing. I think that is the most intuitive thing to do. This way, task t1 does clone(CLONE_NEWPID) to get t2, which does clone(CLONE_NEWPID) to get t3, then the cgroup for t3 will be named for the pid by which t2 knows t3. (Thanks to Dan Smith for finding the main bug) Changelog: June 11: Incorporate Paul Menage's feedback: don't pass NULL to ns_cgroup_clone from unshare, and reduce patch size by using 'nodename' in cgroup_clone. June 10: Original version [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge Hallyn <serge@us.ibm.com> Acked-by: Paul Menage <menage@google.com> Tested-by: Dan Smith <danms@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:37 -07:00
Paul Menage	856c13aa1f	cgroup files: convert res_counter_write() to be a cgroups write_string() handler Currently res_counter_write() is a raw file handler even though it's ultimately taking a number, since in some cases it wants to pre-process the string when converting it to a number. This patch converts res_counter_write() from a raw file handler to a write_string() handler; this allows some of the boilerplate copying/locking/checking to be removed, and simplies the cleanup path, since these functions are now performed by the cgroups framework. [lizf@cn.fujitsu.com: build fix] Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:36 -07:00
Paul Menage	84eea84288	cgroups: misc cleanups to write_string patchset This patch contains cleanups suggested by reviewers for the recent write_string() patchset: - pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for clarity, rather than directly unlocking cgroup_mutex. - make the return type of cgroup_lock_live_group() a bool - use a #define'd constant for the local buffer size in read/write functions Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	e788e066c6	cgroup files: move the release_agent file to use typed handlers Adds cgroup_release_agent_write() and cgroup_release_agent_show() methods to handle writing/reading the path to a cgroup hierarchy's release agent. As a result, cgroup_common_file_read() is now unnecessary. As part of the change, a previously-tolerated race in cgroup_release_agent() is avoided by copying the current release_agent_path prior to calling call_usermode_helper(). Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	db3b14978a	cgroup files: add write_string cgroup control file method This patch adds a write_string() method for cgroups control files. The semantics are that a buffer is copied from userspace to kernelspace and the handler function invoked on that buffer. The buffer is guaranteed to be nul-terminated, and no longer than max_write_len (defaulting to 64 bytes if unspecified). Later patches will convert existing raw file write handlers in control group subsystems to use this method. Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Balbir Singh <balbir@in.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Paul Menage	ce16b49d37	cgroup files: clean up whitespace in struct cftype This patch removes some extraneous spaces from method declarations in struct cftype, to fit in with conventional kernel style. Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Pavel Emelyanov	f2992db2a4	Mark res_counter_charge(_locked) with __must_check Ignoring their return values may result in counter underflow in the future - when the value charged will be uncharged (or in "leaks" - when the value is not uncharged). This also prevents from using charging routines to decrement the counter value (i.e. uncharge it) ;) (Current code works OK with res_counter, however :) ) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	657d3bfa98	quota: implement sending information via netlink about user below quota Sometimes it may be useful for userspace to know (e.g. for some hosting guys) that some user stopped exceeding his hardlimit or softlimit in quotas. Implement sending of such events to userspace via quota netlink protocol so that they don't have to poll for such events. Based on idea and initial implementation by Vladislav Bogdanov. Cc: Vladislav Bogdanov <slava@nsys.by> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	03b063436c	quota: convert macros to inline functions Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	74abb9890d	quota: move function-macros from quota.h to quotaops.h Move declarations of some macros, which should be in fact functions to quotaops.h. This way they can be later converted to inline functions because we can now use declarations from quota.h. Also add necessary includes of quotaops.h to a few files. [akpm@linux-foundation.org: fix JFS build] [akpm@linux-foundation.org: fix UFS build] [vegard.nossum@gmail.com: fix QUOTA=n build] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Arjen Pool <arjenpool@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	02a55ca871	quota: cleanup loop in sync_dquots() Make loop in sync_dquots() checking whether there's something to write more readable, remove useless variable and macro info_any_dirty() which is used only in this place. Signed-off-by: Jan Kara <jack@suse.cz> Cc: "Vegard Nossum" <vegard.nossum@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Jan Kara	b85f4b87a5	quota: rename quota functions from upper case, make bigger ones non-inline Cleanup quotaops.h: Rename functions from uppercase to lowercase (and define backward compatibility macros), move larger functions to dquot.c and make them non-inline. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:35 -07:00
Joe Peterson	b271e067c8	fatfs: add UTC timestamp option Provide a new mount option ("tz=UTC") for DOS (vfat/msdos) filesystems, allowing timestamps to be in coordinated universal time (UTC) rather than local time in applications where doing this is advantageous. In particular, portable devices that use fat/vfat (such as digital cameras) can benefit from using UTC in their internal clocks, thus avoiding daylight saving time errors and general time ambiguity issues. The user of the device does not have to worry about changing the time when moving from place or when daylight saving changes. The new mount option, when set, disables the counter-adjustment that Linux currently makes to FAT timestamp info in anticipation of the normal userspace time zone correction. When used in this new mode, all daylight saving time and time zone handling is done in userspace as is normal for many other filesystems (like ext3). The default mode, which remains unchanged, is still appropriate when mounting volumes written in Windows (because of its use of local time). I originally based this patch on one submitted last year by Paul Collins, but I updated it to work with current source and changed variable/option naming. Ogawa Hirofumi (who maintains these filesystems) and I discussed this patch at length on lkml, and he suggested using the option name in the attached version of the patch. Barry Bouwsma pointed out a good addition to the patch as well. Signed-off-by: Joe Peterson <joe@skyrush.com> Signed-off-by: Paul Collins <paul@ondioline.org> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Barry Bouwsma <free_beer_for_all@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Adrian Bunk	e8938a62a8	remove unused #include <linux/dirent.h>'s Remove some unused #include <linux/dirent.h>'s. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Adrian Bunk	cf6ae8b50e	remove the in-kernel struct dirent{,64} The kernel struct dirent{,64} were different from the ones in userspace. Even worse, we exported the kernel ones to userspace. But after the fat usages are fixed we can remove the conflicting kernel versions. Reviewed-by: H. Peter Anvin <hpa@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Rene Scharfe	7557bc66be	msdos fs: remove unsettable atari option It has been impossible to set the option 'atari' of the MSDOS filesystem for several years. Since nobody seems to have missed it, let's remove its remains. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
OGAWA Hirofumi	4596c8aaf9	fat: fix VFAT_IOCTL_READDIR_xxx and cleanup for userland "struct dirent" is a kernel type here, but is a different type in userspace! This means both the structure and the IOCTL number is wrong! So, this adds new "struct __fat_dirent" to generate correct IOCTL number. And kernel stuff moves to under __KERNEL__. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:34 -07:00
Jeff Mahoney	90415deac7	reiserfs: convert j_commit_lock to mutex j_commit_lock is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Jeff Mahoney	afe7025907	reiserfs: convert j_flush_sem to mutex j_flush_sem is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. [akpm@linux-foundation.org: fix mutex_trylock retval treatment] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Jeff Mahoney	f68215c464	reiserfs: convert j_lock to mutex j_lock is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chris Mason <chris.mason@oracle.com> Cc: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Adrian Bunk	de0ca06a99	coda: remove CODA_FS_OLD_API While fixing CONFIG_ leakages to the userspace kernel headers I ran into CODA_FS_OLD_API. After five years, are there still people using the old API left? Especially considering that you have to choose at compile time which API to support in the kernel (and distributions tend to offer the new API for some time). Jan: "The old API can definitely go. Around the time the new interface went in there were some non-Coda userspace file system implementations that took a while longer to convert to the new API, but by now they all switched to the new interface or in some cases to a FUSE-based solution." Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:33 -07:00
Duane Griffin	ae76dd9a6b	ext3: handle corrupted orphan list at mount If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext3_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext3_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: printk warning fix] Signed-off-by: Duane Griffin <duaneg@dghda.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:32 -07:00
Samuel Thibault	50c33a84db	ext2: fix typo in Hurd part of include/linux/ext2_fs.h Fix typo in Hurd part of include/linux/ext2_fs.h The ';' here is redundant or can even pose problem. This is actually not used by the Linux kernel, but it is exposed in GNU/Hurd. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:31 -07:00
Eric Miao	bbcd6d543d	gpio: max732x driver This adds a driver supporting a family of I2C port expanders from Maxim, which includes the MAX7319 and MAX7320-7327 chips. [dbrownell@users.sourceforge.net: minor fixes] Signed-off-by: Jack Ren <jack.ren@marvell.com> Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
David Brownell	8f1cc3b10e	gpio: mcp23s08 handles multiple chips per chipselect Teach the mcp23s08 driver about a curious feature of these chips: up to four of them can share the same chipselect, with the SPI signals wired in parallel, by matching two bits in the first protocol byte against two address lines on the chip. This is handled by three software changes: * Platform data now holds an array of per-chip structs, not just one chip's address and pullup configuration. * Probe() and remove() now use another level of structure, wrapping an instance of the original structure for each mcp23s08 chip sharing that chipselect. * The HAEN bit is set, so that the hardware address bits can no longer be ignored (boot firmware may not have enabled them). The "one struct per chip" preserves the guts of the current code, but platform_data will need minor changes. OLD: /* incorrect "slave" ID may not have mattered / .slave = 3, .pullups = BIT(3) \| BIT(1) \| BIT(0), NEW: / slave address _must_ match chip's wiring */ .chip[3] = { .is_present = true, .pullups = BIT(3) \| BIT(1) \| BIT(0), }, There's no change in how things _behave_ for spi_device nodes with a single mcp23s08 chip. New multi-chip configurations assign GPIOs in sequence, without holes. The spi_device just resembles a bigger controller, but internally it has multiple gpio_chip instances. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
David Brownell	d8f388d8dc	gpio: sysfs interface This adds a simple sysfs interface for GPIOs. /sys/class/gpio /export ... asks the kernel to export a GPIO to userspace /unexport ... to return a GPIO to the kernel /gpioN ... for each exported GPIO #N /value ... always readable, writes fail for input GPIOs /direction ... r/w as: in, out (default low); write high, low /gpiochipN ... for each gpiochip; #N is its first GPIO /base ... (r/o) same as N /label ... (r/o) descriptive, not necessarily unique /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1) GPIOs claimed by kernel code may be exported by its owner using a new gpio_export() call, which should be most useful for driver debugging. Such exports may optionally be done without a "direction" attribute. Userspace may ask to take over a GPIO by writing to a sysfs control file, helping to cope with incomplete board support or other "one-off" requirements that don't merit full kernel support: echo 23 > /sys/class/gpio/export ... will gpio_request(23, "sysfs") and gpio_export(23); use /sys/class/gpio/gpio-23/direction to (re)configure it, when that GPIO can be used as both input and output. echo 23 > /sys/class/gpio/unexport ... will gpio_free(23), when it was exported as above The extra D-space footprint is a few hundred bytes, except for the sysfs resources associated with each exported GPIO. The additional I-space footprint is about two thirds of the current size of gpiolib (!). Since no /dev node creation is involved, no "udev" support is needed. Related changes: * This adds a device pointer to "struct gpio_chip". When GPIO providers initialize that, sysfs gpio class devices become children of that device instead of being "virtual" devices. * The (few) gpio_chip providers which have such a device node have been updated. * Some gpio_chip drivers also needed to update their module "owner" field ... for which missing kerneldoc was added. * Some gpio_chips don't support input GPIOs. Those GPIOs are now flagged appropriately when the chip is registered. Based on previous patches, and discussion both on and off LKML. A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this merges to mainline. [akpm@linux-foundation.org: a few maintenance build fixes] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Srinivasa D S	ef53d9c5e4	kprobes: improve kretprobe scalability with hashed locking Currently list of kretprobe instances are stored in kretprobe object (as used_instances,free_instances) and in kretprobe hash table. We have one global kretprobe lock to serialise the access to these lists. This causes only one kretprobe handler to execute at a time. Hence affects system performance, particularly on SMP systems and when return probe is set on lot of functions (like on all systemcalls). Solution proposed here gives fine-grain locks that performs better on SMP system compared to present kretprobe implementation. Solution: 1) Instead of having one global lock to protect kretprobe instances present in kretprobe object and kretprobe hash table. We will have two locks, one lock for protecting kretprobe hash table and another lock for kretporbe object. 2) We hold lock present in kretprobe object while we modify kretprobe instance in kretprobe object and we hold per-hash-list lock while modifying kretprobe instances present in that hash list. To prevent deadlock, we never grab a per-hash-list lock while holding a kretprobe lock. 3) We can remove used_instances from struct kretprobe, as we can track used instances of kretprobe instances using kretprobe hash table. Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system with return probes set on all systemcalls looks like this. cacheline non-cacheline Un-patched kernel aligned patch aligned patch =============================================================================== real 9m46.784s 9m54.412s 10m2.450s user 40m5.715s 40m7.142s 40m4.273s sys 2m57.754s 2m58.583s 3m17.430s =========================================================== Time duration for kernel compilation ("make -j 8) on the same system, when kernel is not probed. ========================= real 9m26.389s user 40m8.775s sys 2m7.283s ========================= Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Ben Dooks	42cd2366fb	sm501: gpio I2C support Add support for adding the GPIO based I2C resources. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Arnaud Patard	60e540d617	sm501: gpio dynamic registration for PCI devices The SM501 PCI card requires a dyanmic gpio allocation as the number of cards is not known at compile time. Fixup the platform data and registration to deal with this. Acked-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:30 -07:00
Ben Dooks	f61be273d3	sm501: add gpiolib support Add support for exporting the GPIOs on the SM501 via gpiolib. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Ben Dooks	472dba7d11	sm501: add power control callback Add callback to get or set the power control if the device has the sleep connected to some form of GPIO. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Arnaud Patard <apatard@mandriva.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Dave Young	717115e1a5	printk ratelimiting rewrite All ratelimit user use same jiffies and burst params, so some messages (callbacks) will be lost. For example: a call printk_ratelimit(5 * HZ, 1) b call printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b will will be supressed. - rewrite __ratelimit, and use a ratelimit_state as parameter. Thanks for hints from andrew. - Add WARN_ON_RATELIMIT, update rcupreempt.h - remove __printk_ratelimit - use __ratelimit in net_ratelimit Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Vegard Nossum	2711b793eb	kallsyms: unify 32- and 64-bit code Use the %p format string which already accounts for the padding you need with a pointer type on a particular architecture. Also replace the macro with a static inline function to match the rest of the file. Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Arjan van de Ven	b6c6393700	Rename WARN() to WARNING() to clear the namespace We want to use WARN() as a variant of WARN_ON(), however a few drivers are using WARN() internally. This patch renames these to WARNING() to avoid the namespace clash. A few cases were defining but not using the thing, for those cases I just deleted the definition. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Greg KH <greg@kroah.com> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:29 -07:00
Robert P. J. Day	4500d067ee	init.h: remove obsolete content Remove apparently obsolete content from init.h referring to gcc 2.9x and to "no_module_init". Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:28 -07:00
KOSAKI Motohiro	ac331d158e	call_usermodehelper(): increase reliability Presently call_usermodehelper_setup() uses GFP_ATOMIC. but it can return NULL _very_ easily. GFP_ATOMIC is needed only when we can't sleep. and, GFP_KERNEL is robust and better. thus, I add gfp_mask argument to call_usermodehelper_setup(). So, its callers pass the gfp_t as below: call_usermodehelper() and call_usermodehelper_keys(): depend on 'wait' argument. call_usermodehelper_pipe(): always GFP_KERNEL because always run under process context. orderly_poweroff(): pass to GFP_ATOMIC because may run under interrupt context. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: "Paul Menage" <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:28 -07:00
Andrew Morton	cebbd3fb80	build-kernel-profileo-only-when-requested-cleanups Cc: Adrian Bunk <bunk@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Adrian Bunk	b03f6489f9	build kernel/profile.o only when requested Build kernel/profile.o only if CONFIG_PROFILING is enabled. This makes CONFIG_PROFILING=n kernels smaller. As a bonus, some profile_tick() calls and one branch from schedule() are now eliminated with CONFIG_PROFILING=n (but I doubt these are measurable effects). This patch changes the effects of CONFIG_PROFILING=n, but I don't think having more than two choices would be the better choice. This patch also adds the name of the first parameter to the prototypes of profile_{hits,tick}() since I anyway had to add them for the dummy functions. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Robert P. J. Day	e0ce0da9fe	lists: remove a redundant conditional definition of list_add() Remove the conditional surrounding the definition of list_add() from list.h since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently pick up from lib/list_debug.c will be absolutely identical, at which point you can remove that redundant definition from list_debug.c as well. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Robert P. J. Day	b39c08cb69	Remove apparently unused fd1772.h header file. This header file has been unused for quite some time, and the corresponding source files appear to have been removed back in commit `99eb8a550d` ("Remove the arm26 port") Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: Adrian Bunk <bunk@stusta.de> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:27 -07:00
Harvey Harrison	8b5ac31e27	include: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Steven Rostedt	3f307891ce	locking: add typecheck on irqsave and friends for correct flags There haave been several areas in the kernel where an int has been used for flags in local_irq_save() and friends instead of a long. This can cause some hard to debug problems on some architectures. This patch adds a typecheck inside the irqsave and restore functions to flag these cases. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Andrew Morton	e0deaff470	split the typecheck macros out of include/linux/kernel.h Needed to fix up a recursive include snafu in locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:26 -07:00
Yevgeny Petrilin	25c94d010a	mlx4_core: Add VLAN tag field to WQE control segment struct Add fields for VLAN tag and insert VLAN tag flag to the control section struct. These fields will be used for sending ethernet packets. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-25 10:30:06 -07:00
David Miller	3d6f4a20cc	endian: Always evaluate arguments. Changeset `7fa897b91a` ("ide: trivial sparse annotations") created an IDE bootup regression on big-endian systems. In drivers/ide/ide-iops.c, function ide_fixstring() we now have the loop: for (p = end ; p != s;) be16_to_cpus((u16 *)(p -= 2)); which will never terminate on big-endian because in such a configuration be16_to_cpus() evaluates to "do { } while (0)" Therefore, always evaluate the arguments to nop endian transformation operations. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 09:28:09 -07:00
Alexey Korolev	3d45955962	[MTD] [NAND] subpage read feature as a way to increase performance. This patch enables NAND subpage read functionality. If upper layer drivers are requesting to read non page aligned data NAND subpage-read functionality reads the only whose ECC regions which include requested data when original code reads whole page. This significantly improves performance in many cases. Here are some digits : UBI volume mount time No subpage reads: 5.75 seconds Subpage read patch: 2.42 seconds Open/stat time for files on JFFS2 volume: No subpage read 0m 5.36s Subpage read 0m 2.88s Signed-off-by Alexey Korolev <akorolev@infradead.org> Acked-by: Artem Bityutskiy <dedekind@infradead.org> Acked-by: Jörn Engel <joern@logfs.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-07-25 10:49:50 -04:00
David Woodhouse	ff877ea80e	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6	2008-07-25 10:40:14 -04:00
Stefan Richter	95984f62c9	firewire: fw-ohci: TSB43AB22/A dualbuffer workaround Isochronous reception in dualbuffer mode is reportedly broken with TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been determined as the trigger: https://bugzilla.redhat.com/show_bug.cgi?id=435550 Two fixes are possible: - pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK); at least when IR descriptors are allocated, or - simply don't use dualbuffer. This fix implements the latter workaround. But we keep using dualbuffer on x86-32 which won't give us highmen (and thus physical addresses outside the 31bit range) in coherent DMA memory allocations. Right now we could for example also whitelist PPC32, but DMA mapping implementation details are expected to change there. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>	2008-07-25 15:41:23 +02:00
Ingo Molnar	10a010f695	Merge branch 'linus' into x86/x2apic Conflicts: drivers/pci/dmar.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-25 13:08:16 +02:00
Nathan Lynch	483fad1c3f	ELF loader support for auxvec base platform string Some IBM POWER-based platforms have the ability to run in a mode which mostly appears to the OS as a different processor from the actual hardware. For example, a Power6 system may appear to be a Power5+, which makes the AT_PLATFORM value "power5+". This means that programs are restricted to the ISA supported by Power5+; Power6-specific instructions are treated as illegal. However, some applications (virtual machines, optimized libraries) can benefit from knowledge of the underlying CPU model. A new aux vector entry, AT_BASE_PLATFORM, will denote the actual hardware. For example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is that AT_PLATFORM indicates the instruction set supported, while AT_BASE_PLATFORM indicates the underlying microarchitecture. If the architecture has defined ELF_BASE_PLATFORM, copy that value to the user stack in the same manner as ELF_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Linus Torvalds	832fe9c222	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: Add transport feature handling stub for virtio_ring. virtio: Rename set_features to finalize_features virtio: Formally reserve bits 28-31 to be 'transport' features. s390: use virtio_console for KVM on s390 virtio: console as a config option virtio_console: use virtqueue notification for hvc_console hvc_console: rework setup to replace irq functions with callbacks virtio_blk: check for hardsector size from host virtio: Use bus_type probe and remove methods virtio: don't always force a notification when ring is full virtio: clarify that ABI is usable by any implementations virtio: Recycle unused recv buffer pages for large skbs in net driver virtio net: Allow receiving SG packets virtio net: Add ethtool ops for SG/GSO virtio: fix virtio_net xmit of freed skb bug	2008-07-24 19:11:49 -07:00
Rusty Russell	ed9559d38a	Label kthread_create() with printf attribute tag. Obvious misc patch been in my queue (& linux-next) for over a cycle. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 19:11:15 -07:00
Rusty Russell	e34f872567	virtio: Add transport feature handling stub for virtio_ring. To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:14 +10:00
Rusty Russell	c624896e48	virtio: Rename set_features to finalize_features Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This make it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:12 +10:00
Rusty Russell	dd7c7bc462	virtio: Formally reserve bits 28-31 to be 'transport' features. We assign feature bits as required, but it makes sense to reserve some for the particular transport, rather than the particular device. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:07 +10:00
Christian Borntraeger	066f4d82a6	virtio_blk: check for hardsector size from host Currently virtio_blk assumes a 512 byte hard sector size. This can cause trouble / performance issues if the backing has a different block size (like a file on an ext3 file system formatted with 4k block size or a dasd). Lets add a feature flag that tells the guest to use a different hard sector size than 512 byte. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:05 +10:00
Rusty Russell	674bfc23c5	virtio: clarify that ABI is usable by any implementations We want others to implement and use virtio, so it makes sense to BSD license the non-__KERNEL__ parts of the headers to make this crystal clear. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Ryan Harper <ryanh@us.ibm.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com> Acked-by: Anthony Liguori <aliguori@us.ibm.com>	2008-07-25 12:06:04 +10:00
Linus Torvalds	b5684b83b1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) ide: use proper printk() KERN_* levels in ide-probe.c ide: fix for EATA SCSI HBA in ATA emulating mode ide: remove stale comments from drivers/ide/Makefile ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase ide-scsi: remove kmalloced struct request ht6560b: remove old history ht6560b: update email address ide-cd: fix oops when using growisofs gayle: release resources on ide_host_add() failure palm_bk3710: add UltraDMA/100 support ide: trivial sparse annotations ide: ide-tape.c sparse annotations and unaligned access removal ide: drop 'name' parameter from ->init_chipset method ide: prefix messages from IDE PCI host drivers by driver name it821x: remove DECLARE_ITE_DEV() macro it8213: remove DECLARE_ITE_DEV() macro ide: include PCI device name in messages from IDE PCI host drivers ide: remove <asm/ide.h> for some archs ide-generic: remove ide_default_{io_base,irq}() inlines (take 3) ide-generic: is no longer needed on ppc32 ...	2008-07-24 14:55:09 -07:00
Linus Torvalds	5042d99795	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fixup sparse endianness warnings in proc.c PCI PM: make more PCI PM core functionality available to drivers PCI/DMAR: don't assume presence of RMRRs PCI hotplug: fix error path in pci_slot's register_slot	2008-07-24 13:57:13 -07:00
Bartlomiej Zolnierkiewicz	a326b02b0c	ide: drop 'name' parameter from ->init_chipset method There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:33 +02:00
Bartlomiej Zolnierkiewicz	2a8f7450f8	ide: remove <asm/ide.h> for some archs * Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes <linux/interrupt.h> which is enough). * Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz	d83b8b85cd	ide: define MAX_HWIFS in <linux/ide.h> * Now that ide_hwif_t instances are allocated dynamically the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10 is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs except these ones that use MAX_HWIFS == 1. * Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>. [ Please note that avr32/cris/v850 have no <asm/ide.h> and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:30 +02:00
Bartlomiej Zolnierkiewicz	ef0b04276d	ide: add ide_pci_remove() helper * Add 'unsigned long host_flags' field to struct ide_host. * Set ->host_flags in ide_host_alloc_all(). * Always set PCI dev's ->driver_data in ide_pci_init_{one,two}(). * Add ide_pci_remove() helper (the default implementation for struct pci_driver's ->remove method). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:19 +02:00
Bartlomiej Zolnierkiewicz	08da591e14	ide: add ide_device_{get,put}() helpers * Add 'struct ide_host host' field to ide_hwif_t and set it in ide_host_alloc_all(). Add ide_device_{get,put}() helpers loosely based on SCSI's scsi_device_{get,put}() ones. * Convert IDE device drivers to use ide_device_{get,put}(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:15 +02:00
Bartlomiej Zolnierkiewicz	6cdf6eb357	ide: add ->dev and ->host_priv fields to struct ide_host * Add 'struct device dev[2]' and 'void host_priv' fields to struct ide_host. * Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s](). * Pass 'void priv' argument to ide_setup_pci_device[s]() and use it to set ->host_priv. Set PCI dev's ->driver_data to point to the struct ide_host instance if PCI host driver wants to use ->host_priv. * Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-24 22:53:14 +02:00
Linus Torvalds	5c402355ad	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event	2008-07-24 12:56:07 -07:00
Linus Torvalds	ecc8b655b3	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well nohz: prevent tick stop outside of the idle loop	2008-07-24 12:55:01 -07:00
Linus Torvalds	7540081c6b	Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc * 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: Remove __DECLARE_SEMAPHORE_GENERIC Remove asm/semaphore.h Remove use of asm/semaphore.h Add missing semaphore.h includes Remove mention of semaphores from kernel-locking	2008-07-24 12:24:40 -07:00
Linus Torvalds	c54554d388	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Ensure led->trigger is set earlier leds: Add support for Philips PCA955x I2C LED drivers leds: Fix sparse warnings in leds-h1940 driver leds: mark led_classdev.default_trigger as const leds: fix unsigned value overflow in atmel pwm driver leds: Add pca9532 platform data for Thecus N2100 leds: Add pca9532 led driver	2008-07-24 12:16:02 -07:00
Steven Whitehouse	f9247273cb	UFS: add const to parser token table This patch adds a "const" to the parser token table. I've done an allmodconfig build to see if this produces any warnings/failures and the patch includes a fix for the only warning that was produced. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Alexander Viro <aviro@redhat.com> Acked-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 11:50:15 -07:00
Philippe De Muyter	5bb49fcd50	video/fb: cleanup FB_MAJOR usage Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses, and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c needs. Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from fb.h to major.h, and fix fbmem.c to use major.h's definition. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:41 -07:00
Hans-Christian Egtvedt	3e074058d7	fbdev: LCD backlight driver using Atmel PWM driver This patch adds a platform driver using the ATMEL PWM driver to control a backlight which requires a PWM signal and optional GPIO signal for discrete on/off signal. It has been tested on Favr-32 board from EarthLCD. The driver is configurable by supplying a struct with the platform data. See the include/linux/atmel-pwm-bl.h for details. The board code for Favr-32 will be submitted to the AVR32 kernel list. Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:41 -07:00
Ben Dooks	0c531360ed	lcd: add lcd_device to check_fb() entry in lcd_ops Add the lcd_device being checked to the check_fb entry of lcd_ops. This ensures that any driver using this to check against it's own state can do so, and also makes all the calls in lcd_ops more orthogonal in their arguments. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:40 -07:00
Ben Dooks	206c5d69d0	sm501: add inversion controls for VBIASEN and FPEN Add flags to allow the driver to invert the sense of both VBIASEN and FPEN signals comming from the SM501. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:40 -07:00
Krzysztof Helt	01a2d9ed85	tridentfb: acceleration constants change This patch replaces deprecated constant FB_ACCELF_TEXT with FBINFO_HWACCEL_DISABLED and adds constants for Trident families of accelerators. The FBINFO_HWACCEL_DISABLED is correctly used so noaccel parameter works now. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:36 -07:00
David Brownell	d3de851a44	rtc: BCD codeshrink This updates <linux/bcd.h> to define the key routines as constant functions, which the macros will then call. Newer code can now call bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS. This lets each driver shrink their codespace by using N function calls to a single (global) copy of those routines, instead of N inlined copies of these functions per driver. These routines aren't used in speed-critical code. Almost all callers are in the RTC framework. Typical per-driver savings is near 300 bytes. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Adrian Bunk <bunk@kernel.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
David Brownell	53e84b672c	rtc: ds1305/ds1306 driver Support the Dallas/Maxim DS1305 and DS1306 RTC chips. These use SPI, and support alarms, NVRAM, and a trickle charger for use when their backup power supply is a supercap or rechargeable cell. This basic driver doesn't yet support suspend/resume or wakealarms. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
David Brownell	5ad31a5751	rtc: remove BKL for ioctl() Remove implicit use of BKL in ioctl() from the RTC framework. Instead, the rtc->ops_lock is used. That's the same lock that already protects the RTC operations when they're issued through the exported rtc_*() calls in drivers/rtc/interface.c ... making this a bugfix, not just a cleanup, since both ioctl calls and set_alarm() need to update IRQ enable flags and that implies a common lock (which RTC drivers as a rule do not provide on their own). A new comment at the declaration of "struct rtc_class_ops" summarizes current locking rules. It's not clear to me that the exceptions listed there should exist ... if not, those are pre-existing problems which can be fixed in a patch that doesn't relate to BKL removal. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
Ian Kent	aa55ddf340	autofs4: remove unused ioctls The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added several years ago but what they were intended for has never been implemented (as far as I'm aware noone uses them) so remove them. Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:33 -07:00
Grant Likely	102eb97564	spi: make spi_board_info.modalias a char array Currently, 'modalias' in the spi_device structure is a 'const char '. The spi_new_device() function fills in the modalias value from a passed in spi_board_info data block. Since it is a pointer copy, the new spi_device remains dependent on the spi_board_info structure after the new spi_device is registered (no other fields in spi_device directly depend on the spi_board_info structure; all of the other data is copied). This causes a problem when dynamically propulating the list of attached SPI devices. For example, in arch/powerpc, the list of SPI devices can be populated from data in the device tree. With the current code, the device tree adapter must kmalloc() a new spi_board_info structure for each new SPI device it finds in the device tree, and there is no simple mechanism in place for keeping track of these allocations. This patch changes modalias from a 'const char ' to a fixed char array. By copying the modalias string instead of referencing it, the dependency on the spi_board_info structure is eliminated and an outside caller does not need to maintain a separate spi_board_info allocation for each device. If searched through the code to the best of my ability for any references to modalias which may be affected by this change and haven't found anything. It has been tested with the lite5200b platform in arch/powerpc. [dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc] Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:30 -07:00
Ulrich Drepper	9fe5ad9c8c	flag parameters add-on: remove epoll_create size param Remove the size parameter from the new epoll_create syscall and renames the syscall itself. The updated test program follows. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	510df2dd48	flag parameters: NONBLOCK in inotify_init This patch adds non-blocking support for inotify_init1. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("inotify_init1(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_NONBLOCK); if (fd == -1) { puts ("inotify_init1(IN_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("inotify_init1(IN_NONBLOCK) set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	be61a86d72	flag parameters: NONBLOCK in pipe This patch adds O_NONBLOCK support to pipe2. It is minimally more involved than the patches for eventfd et.al but still trivial. The interfaces of the create_write_pipe and create_read_pipe helper functions were changed and the one other caller as well. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fds[2]; if (syscall (__NR_pipe2, fds, 0) == -1) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1) { puts ("pipe2(O_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	6b1ef0e60d	flag parameters: NONBLOCK in timerfd_create This patch adds support for the TFD_NONBLOCK flag to timerfd_create. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_timerfd_create # ifdef __x86_64__ # define __NR_timerfd_create 283 # elif defined __i386__ # define __NR_timerfd_create 322 # else # error "need __NR_timerfd_create" # endif #endif #define TFD_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0); if (fd == -1) { puts ("timerfd_create(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("timerfd_create(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_NONBLOCK); if (fd == -1) { puts ("timerfd_create(TFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("timerfd_create(TFD_NONBLOCK) set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	e7d476dfdf	flag parameters: NONBLOCK in eventfd This patch adds support for the EFD_NONBLOCK flag to eventfd2. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("eventfd2(0) sets non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_NONBLOCK); if (fd == -1) { puts ("eventfd2(EFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("eventfd2(EFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	5fb5e04926	flag parameters: NONBLOCK in signalfd This patch adds support for the SFD_NONBLOCK flag to signalfd4. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_NONBLOCK O_NONBLOCK int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("signalfd4(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_NONBLOCK); if (fd == -1) { puts ("signalfd4(SFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("signalfd4(SFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	77d2720059	flag parameters: NONBLOCK in socket and socketpair This patch introduces support for the SOCK_NONBLOCK flag in socket, socketpair, and paccept. To do this the internal function sock_attach_fd gets an additional parameter which it uses to set the appropriate flag for the file descriptor. Given that in modern, scalable programs almost all socket connections are non-blocking and the minimal additional cost for the new functionality I see no reason not to add this code. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <pthread.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #include <sys/syscall.h> #ifndef __NR_paccept # ifdef __x86_64__ # define __NR_paccept 288 # elif defined __i386__ # define SYS_PACCEPT 18 # define USE_SOCKETCALL 1 # else # error "need __NR_paccept" # endif #endif #ifdef USE_SOCKETCALL # define paccept(fd, addr, addrlen, mask, flags) \ ({ long args[6] = { \ (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \ syscall (__NR_socketcall, SYS_PACCEPT, args); }) #else # define paccept(fd, addr, addrlen, mask, flags) \ syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags) #endif #define PORT 57392 #define SOCK_NONBLOCK O_NONBLOCK static pthread_barrier_t b; static void * tf (void arg) { pthread_barrier_wait (&b); int s = socket (AF_INET, SOCK_STREAM, 0); struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr ) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr ) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); return NULL; } int main (void) { int fd; fd = socket (PF_INET, SOCK_STREAM, 0); if (fd == -1) { puts ("socket(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("socket(0) set non-blocking mode"); return 1; } close (fd); fd = socket (PF_INET, SOCK_STREAM\|SOCK_NONBLOCK, 0); if (fd == -1) { puts ("socket(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("socket(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); int fds[2]; if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1) { puts ("socketpair(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("socketpair(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (socketpair (PF_UNIX, SOCK_STREAM\|SOCK_NONBLOCK, 0, fds) == -1) { puts ("socketpair(SOCK_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("socketpair(SOCK_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } pthread_barrier_init (&b, NULL, 2); struct sockaddr_in sin; pthread_t th; if (pthread_create (&th, NULL, tf, NULL) != 0) { puts ("pthread_create failed"); return 1; } int s = socket (AF_INET, SOCK_STREAM, 0); int reuse = 1; setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); bind (s, (struct sockaddr ) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); int s2 = paccept (s, NULL, 0, NULL, 0); if (s2 < 0) { puts ("paccept(0) failed"); return 1; } fl = fcntl (s2, F_GETFL); if (fl & O_NONBLOCK) { puts ("paccept(0) set non-blocking mode"); return 1; } close (s2); close (s); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); s2 = paccept (s, NULL, 0, NULL, SOCK_NONBLOCK); if (s2 < 0) { puts ("paccept(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (s2, F_GETFL); if ((fl & O_NONBLOCK) == 0) { puts ("paccept(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (s2); close (s); pthread_barrier_wait (&b); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:29 -07:00
Ulrich Drepper	4006553b06	flag parameters: inotify_init This patch introduces the new syscall inotify_init1 (note: the 1 stands for the one parameter the syscall takes, as opposed to no parameter before). The values accepted for this parameter are function-specific and defined in the inotify.h header. Here the values must match the O_* flags, though. In this patch CLOEXEC support is introduced. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("inotify_init1(0) set close-on-exit"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_CLOEXEC); if (fd == -1) { puts ("inotify_init1(IN_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	ed8cae8ba0	flag parameters: pipe This patch introduces the new syscall pipe2 which is like pipe but it also takes an additional parameter which takes a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	336dd1f70f	flag parameters: dup2 This patch adds the new dup3 syscall. It extends the old dup2 syscall by one parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag is added in this patch. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_dup3 # ifdef __x86_64__ # define __NR_dup3 292 # elif defined __i386__ # define __NR_dup3 330 # else # error "need __NR_dup3" # endif #endif int main (void) { int fd = syscall (__NR_dup3, 1, 4, 0); if (fd == -1) { puts ("dup3(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("dup3(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC); if (fd == -1) { puts ("dup3(O_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("dup3(O_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	a0998b50c3	flag parameters: epoll_create This patch adds the new epoll_create2 syscall. It extends the old epoll_create syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name EPOLL_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_epoll_create2 # ifdef __x86_64__ # define __NR_epoll_create2 291 # elif defined __i386__ # define __NR_epoll_create2 329 # else # error "need __NR_epoll_create2" # endif #endif #define EPOLL_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_epoll_create2, 1, 0); if (fd == -1) { puts ("epoll_create2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("epoll_create2(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC); if (fd == -1) { puts ("epoll_create2(EPOLL_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:28 -07:00
Ulrich Drepper	11fcb6c146	flag parameters: timerfd_create The timerfd_create syscall already has a flags parameter. It just is unused so far. This patch changes this by introducing the TFD_CLOEXEC flag to set the close-on-exec flag for the returned file descriptor. A new name TFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <time.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_timerfd_create # ifdef __x86_64__ # define __NR_timerfd_create 283 # elif defined __i386__ # define __NR_timerfd_create 322 # else # error "need __NR_timerfd_create" # endif #endif #define TFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0); if (fd == -1) { puts ("timerfd_create(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("timerfd_create(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC); if (fd == -1) { puts ("timerfd_create(TFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("timerfd_create(TFD_CLOEXEC) set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	b087498eb5	flag parameters: eventfd This patch adds the new eventfd2 syscall. It extends the old eventfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is EFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name EFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("eventfd2(0) sets close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC); if (fd == -1) { puts ("eventfd2(EFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	9deb27baed	flag parameters: signalfd This patch adds the new signalfd4 syscall. It extends the old signalfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is SFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name SFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_CLOEXEC O_CLOEXEC int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("signalfd4(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC); if (fd == -1) { puts ("signalfd4(SFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	7d9dbca342	flag parameters: anon_inode_getfd extension This patch just extends the anon_inode_getfd interface to take an additional parameter with a flag value. The flag value is passed on to get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag. No actual semantic changes here, the changed callers all pass 0 for now. [akpm@linux-foundation.org: KVM fix] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	c019bbc612	flag parameters: paccept w/out set_restore_sigmask Some platforms do not have support to restore the signal mask in the return path from a syscall. For those platforms syscalls like pselect are not defined at all. This is, I think, not a good choice for paccept() since paccept() adds more value on top of accept() than just the signal mask handling. Therefore this patch defines a scaled down version of the sys_paccept function for those platforms. It returns -EINVAL in case the signal mask is non-NULL but behaves the same otherwise. Note that I explicitly included <linux/thread_info.h>. I saw that it is currently included but indirectly two levels down. There is too much risk in relying on this. The header might change and then suddenly the function definition would change without anyone immediately noticing. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	aaca0bdca5	flag parameters: paccept This patch is by far the most complex in the series. It adds a new syscall paccept. This syscall differs from accept in that it adds (at the userlevel) two additional parameters: - a signal mask - a flags value The flags parameter can be used to set flag like SOCK_CLOEXEC. This is imlpemented here as well. Some people argued that this is a property which should be inherited from the file desriptor for the server but this is against POSIX. Additionally, we really want the signal mask parameter as well (similar to pselect, ppoll, etc). So an interface change in inevitable. The flag value is the same as for socket and socketpair. I think diverging here will only create confusion. Similar to the filesystem interfaces where the use of the O_* constants differs, it is acceptable here. The signal mask is handled as for pselect etc. The mask is temporarily installed for the thread and removed before the call returns. I modeled the code after pselect. If there is a problem it's likely also in pselect. For architectures which use socketcall I maintained this interface instead of adding a system call. The symmetry shouldn't be broken. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <errno.h> #include <fcntl.h> #include <pthread.h> #include <signal.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #include <sys/syscall.h> #ifndef __NR_paccept # ifdef __x86_64__ # define __NR_paccept 288 # elif defined __i386__ # define SYS_PACCEPT 18 # define USE_SOCKETCALL 1 # else # error "need __NR_paccept" # endif #endif #ifdef USE_SOCKETCALL # define paccept(fd, addr, addrlen, mask, flags) \ ({ long args[6] = { \ (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \ syscall (__NR_socketcall, SYS_PACCEPT, args); }) #else # define paccept(fd, addr, addrlen, mask, flags) \ syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags) #endif #define PORT 57392 #define SOCK_CLOEXEC O_CLOEXEC static pthread_barrier_t b; static void * tf (void arg) { pthread_barrier_wait (&b); int s = socket (AF_INET, SOCK_STREAM, 0); struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr ) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr ) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); pthread_barrier_wait (&b); sleep (2); pthread_kill ((pthread_t) arg, SIGUSR1); return NULL; } static void handler (int s) { } int main (void) { pthread_barrier_init (&b, NULL, 2); struct sockaddr_in sin; pthread_t th; if (pthread_create (&th, NULL, tf, (void ) pthread_self ()) != 0) { puts ("pthread_create failed"); return 1; } int s = socket (AF_INET, SOCK_STREAM, 0); int reuse = 1; setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); int s2 = paccept (s, NULL, 0, NULL, 0); if (s2 < 0) { puts ("paccept(0) failed"); return 1; } int coe = fcntl (s2, F_GETFD); if (coe & FD_CLOEXEC) { puts ("paccept(0) set close-on-exec-flag"); return 1; } close (s2); pthread_barrier_wait (&b); s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC); if (s2 < 0) { puts ("paccept(SOCK_CLOEXEC) failed"); return 1; } coe = fcntl (s2, F_GETFD); if ((coe & FD_CLOEXEC) == 0) { puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag"); return 1; } close (s2); pthread_barrier_wait (&b); struct sigaction sa; sa.sa_handler = handler; sa.sa_flags = 0; sigemptyset (&sa.sa_mask); sigaction (SIGUSR1, &sa, NULL); sigset_t ss; pthread_sigmask (SIG_SETMASK, NULL, &ss); sigaddset (&ss, SIGUSR1); pthread_sigmask (SIG_SETMASK, &ss, NULL); sigdelset (&ss, SIGUSR1); alarm (4); pthread_barrier_wait (&b); errno = 0 ; s2 = paccept (s, NULL, 0, &ss, 0); if (s2 != -1 \|\| errno != EINTR) { puts ("paccept did not fail with EINTR"); return 1; } close (s); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: make it compile] [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Roland McGrath <roland@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Ulrich Drepper	a677a039be	flag parameters: socket and socketpair This patch adds support for flag values which are ORed to the type passwd to socket and socketpair. The additional code is minimal. The flag values in this implementation can and must match the O_* flags. This avoids overhead in the conversion. The internal functions sock_alloc_fd and sock_map_fd get a new parameters and all callers are changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <netinet/in.h> #include <sys/socket.h> #define PORT 57392 /* For Linux these must be the same. */ #define SOCK_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = socket (PF_INET, SOCK_STREAM, 0); if (fd == -1) { puts ("socket(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("socket(0) set close-on-exec flag"); return 1; } close (fd); fd = socket (PF_INET, SOCK_STREAM\|SOCK_CLOEXEC, 0); if (fd == -1) { puts ("socket(SOCK_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); int fds[2]; if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1) { puts ("socketpair(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { coe = fcntl (fds[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i); return 1; } close (fds[i]); } if (socketpair (PF_UNIX, SOCK_STREAM\|SOCK_CLOEXEC, 0, fds) == -1) { puts ("socketpair(SOCK_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { coe = fcntl (fds[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i); return 1; } close (fds[i]); } puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:27 -07:00
Adrian Bunk	f606ddf42f	remove the v850 port Trying to compile the v850 port brings many compile errors, one of them exists since at least kernel 2.6.19. There also seems to be noone willing to bring this port back into a usable state. This patch therefore removes the v850 port. If anyone ever decides to revive the v850 port the code will still be available from older kernels, and it wouldn't be impossible for the port to reenter the kernel if it would become actively maintained again. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:24 -07:00
Shaohua Li	bdfe6b7c68	pm: acpi hibernation: utilize hardware signature ACPI defines a hardware signature. BIOS calculates the signature according to hardware configure and if hardware changes while hibernated, the signature will change. In that case, S4 resume should fail. Still, there may be systems on which this mechanism does not work correctly, so it is better to provide a workaround for them. For this reason, add a new switch to the acpi_sleep= command line argument allowing one to disable hardware signature checking. [shaohua.li@intel.com: build fix] Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: <Valdis.Kletnieks@vt.edu> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:24 -07:00
Zhang Rui	c1a220e7ac	pm: introduce new interfaces schedule_work_on() and queue_work_on() This interface allows adding a job on a specific cpu. Although a work struct on a cpu will be scheduled to other cpu if the cpu dies, there is a recursion if a work task tries to offline the cpu it's running on. we need to schedule the task to a specific cpu in this case. http://bugzilla.kernel.org/show_bug.cgi?id=10897 [oleg@tv-sign.ru: cleanups] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Rus <harbour@sfinx.od.ua> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Alan Stern	8111d1b552	pm: add new PM_EVENT codes for runtime power transitions This patch (as1112) adds some new PM_EVENT_* codes for use by kernel subsystems. They describe runtime power-state transitions of the sort already implemented by the USB subsystem. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Rafael J. Wysocki	8c363265d5	pm: drop unnecessary includes from pm.h Drop unnecessary includes from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:23 -07:00
Rafael J. Wysocki	e7ecb331e1	pm: remove remaining obsolete definitions from pm.h Remove the remaining obsolete definitions from include/linux/pm.h and move the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which is the only user of them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Rafael J. Wysocki	558481f038	pm: remove definition of struct pm_dev Remove the definition of 'struct pm_dev', which is not used any more, along with some related stuff from include/linux/pm.h . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Adrian Bunk	d75f65fd24	remove include/linux/pm_legacy.h Remove the obsolete and no longer used include/linux/pm_legacy.h Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Pavel Machek <pavel@suse.cz> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Hugh Dickins	9b3e43a747	security: remove unused forwards Why would linux/security.h need forward declarations for nfsctl_arg and swap_info_struct? It's hard to imagine: remove them. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Andrew G. Morgan	5459c164f0	security: protect legacy applications from executing with insufficient privilege When cap_bset suppresses some of the forced (fP) capabilities of a file, it is generally only safe to execute the program if it understands how to recognize it doesn't have enough privilege to work correctly. For legacy applications (fE!=0), which have no non-destructive way to determine that they are missing privilege, we fail to execute (EPERM) any executable that requires fP capabilities, but would otherwise get pP' < fP. This is a fail-safe permission check. For some discussion of why it is problematic for (legacy) privileged applications to run with less than the set of capabilities requested for them, see: http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html With this iteration of this support, we do not include setuid-0 based privilege protection from the bounding set. That is, the admin can still (ab)use the bounding set to suppress the privileges of a setuid-0 program. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: cleanup] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:22 -07:00
Gerald Schaefer	83d1674a94	mm: make CONFIG_MIGRATION available w/o CONFIG_NUMA We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA support. This patch makes CONFIG_MIGRATION selectable for architectures that define ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the kernel won't compile because migrate_vmas() does not know about vm_ops->migrate() and vma_migratable() does not know about policy_zone. To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA' because they are not being used w/o NUMA. vma_migratable() is moved over from migrate.h to mempolicy.h. [kosaki.motohiro@jp.fujitsu.com: build fix] Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Milton Miller	9ca908f47b	kcalloc: remove runtime division While in all cases in the kernel we know the size of the elements to be created, we don't always know the count of elements. By commuting the size and count in the overflow check, the compiler can reduce the runtime division of size_t with a compare to a (unique) constant in these cases. Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Badari Pulavarty	5c755e9fd8	memory-hotplug: add sysfs removable attribute for hotplug memory remove Memory may be hot-removed on a per-memory-block basis, particularly on POWER where the SPARSEMEM section size often matches the memory-block size. A user-level agent must be able to identify which sections of memory are likely to be removable before attempting the potentially expensive operation. This patch adds a file called "removable" to the memory directory in sysfs to help such an agent. In this patch, a memory block is considered removable if; o It contains only MOVABLE pageblocks o It contains only pageblocks with free pages regardless of pageblock type On the other hand, a memory block starting with a PageReserved() page will never be considered removable. Without this patch, the user-agent is forced to choose a memory block to remove randomly. Sample output of the sysfs files: ./memory/memory0/removable: 0 ./memory/memory1/removable: 0 ./memory/memory2/removable: 0 ./memory/memory3/removable: 0 ./memory/memory4/removable: 0 ./memory/memory5/removable: 0 ./memory/memory6/removable: 0 ./memory/memory7/removable: 1 ./memory/memory8/removable: 0 ./memory/memory9/removable: 0 ./memory/memory10/removable: 0 ./memory/memory11/removable: 0 ./memory/memory12/removable: 0 ./memory/memory13/removable: 0 ./memory/memory14/removable: 0 ./memory/memory15/removable: 0 ./memory/memory16/removable: 0 ./memory/memory17/removable: 1 ./memory/memory18/removable: 1 ./memory/memory19/removable: 1 ./memory/memory20/removable: 1 ./memory/memory21/removable: 1 ./memory/memory22/removable: 1 Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Yasunori Goto	af370fb8cb	memory hotplug: small fixes to bootmem freeing for memory hotremove - Change some naming * Magic -> types * MIX_INFO -> MIX_SECTION_INFO * Change definition of bootmem type from direct hex value - __free_pages_bootmem() becomes __meminit. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Andrea Righi	27ac792ca0	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Timur Tabi	2be0ffe2b2	mm: add alloc_pages_exact() and free_pages_exact() alloc_pages_exact() is similar to alloc_pages(), except that it allocates the minimum number of pages to fulfill the request. This is useful if you want to allocate a very large buffer that is slightly larger than an even power-of-two number of pages. In that case, alloc_pages() will waste a lot of memory. I have a video driver that wants to allocate a 5MB buffer. alloc_pages() wiill waste 3MB of physically-contiguous memory. Signed-off-by: Timur Tabi <timur@freescale.com> Cc: Andi Kleen <andi@firstfloor.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	3560e249ab	bootmem: replace node_boot_start in struct bootmem_data Almost all users of this field need a PFN instead of a physical address, so replace node_boot_start with node_min_pfn. [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()] Signed-off-by: Johannes Weiner <hannes@saeureba.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	5f2809e69c	bootmem: clean up alloc_bootmem_core alloc_bootmem_core has become quite nasty to read over time. This is a clean rewrite that keeps the semantics. bdata->last_pos has been dropped. bdata->last_success has been renamed to hint_idx and it is now an index relative to the node's range. Since further block searching might start at this index, it is now set to the end of a succeeded allocation rather than its beginning. bdata->last_offset has been renamed to last_end_off to be more clear that it represents the ending address of the last allocation relative to the node. [y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()] Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:20 -07:00
Johannes Weiner	223e8dc924	bootmem: reorder code to match new bootmem structure This only reorders functions so that further patches will be easier to read. No code changed. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Jon Tollefson	53ba51d21d	hugetlb: allow arch overridden hugepage allocation Allow alloc_bootmem_huge_page() to be overridden by architectures that can't always use bootmem. This requires huge_boot_pages to be available for use by this function. This is required for powerpc 16G pages, which have to be reserved prior to boot-time. The location of these pages are indicated in the device tree. Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:19 -07:00
Andi Kleen	ceb8687961	hugetlb: introduce pud_huge Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:18 -07:00
Andi Kleen	b54bbf7b81	mm: introduce non panic alloc_bootmem Straight forward variant of the existing __alloc_bootmem_node, only subsequent patch when allocating giant hugepages at boot -- don't want to panic if we can't allocate as many as the user asked for. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Nishanth Aravamudan	a343787016	hugetlb: new sysfs interface Provide new hugepages user APIs that are more suited to multiple hstates in sysfs. There is a new directory, /sys/kernel/hugepages. Underneath that directory there will be a directory per-supported hugepage size, e.g.: /sys/kernel/hugepages/hugepages-64kB /sys/kernel/hugepages/hugepages-16384kB /sys/kernel/hugepages/hugepages-16777216kB corresponding to 64k, 16m and 16g respectively. Within each hugepages-size directory there are a number of files, corresponding to the tracked counters in the hstate, e.g.: /sys/kernel/hugepages/hugepages-64/nr_hugepages /sys/kernel/hugepages/hugepages-64/nr_overcommit_hugepages /sys/kernel/hugepages/hugepages-64/free_hugepages /sys/kernel/hugepages/hugepages-64/resv_hugepages /sys/kernel/hugepages/hugepages-64/surplus_hugepages Of these files, the first two are read-write and the latter three are read-only. The size of the hugepage being manipulated is trivially deducible from the enclosing directory and is always expressed in kB (to match meminfo). [dave@linux.vnet.ibm.com: fix build] [nacc@us.ibm.com: hugetlb: hang off of /sys/kernel/mm rather than /sys/kernel] [nacc@us.ibm.com: hugetlb: remove CONFIG_SYSFS dependency] Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	a137e1cc6d	hugetlbfs: per mount huge page sizes Add the ability to configure the hugetlb hstate used on a per mount basis. - Add a new pagesize= option to the hugetlbfs mount that allows setting the page size - This option causes the mount code to find the hstate corresponding to the specified size, and sets up a pointer to the hstate in the mount's superblock. - Change the hstate accessors to use this information rather than the global_hstate they were using (requires a slight change in mm/memory.c so we don't NULL deref in the error-unmap path -- see comments). [np: take hstate out of hugetlbfs inode and vma->vm_private_data] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	e5ff215941	hugetlb: multiple hstates for multiple page sizes Add basic support for more than one hstate in hugetlbfs. This is the key to supporting multiple hugetlbfs page sizes at once. - Rather than a single hstate, we now have an array, with an iterator - default_hstate continues to be the struct hstate which we use by default - Add functions for architectures to register new hstates [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andi Kleen	a551643895	hugetlb: modular state for hugetlb page size The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Nishanth Aravamudan	ff7ea79cf7	mm: create /sys/kernel/mm Add a kobject to create /sys/kernel/mm when sysfs is mounted. The kobject will exist regardless. This will allow for the hugepage related sysfs directories to exist under the mm "subsystem" directory. Add an ABI file appropriately. [kosaki.motohiro@jp.fujitsu.com: fix build] Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Andy Whitcroft	cdfd4325c0	mm: record MAP_NORESERVE status on vmas and fix small page mprotect reservations With Mel's hugetlb private reservation support patches applied, strict overcommit semantics are applied to both shared and private huge page mappings. This can be a problem if an application relied on unlimited overcommit semantics for private mappings. An example of this would be an application which maps a huge area with the intention of using it very sparsely. These application would benefit from being able to opt-out of the strict overcommit. It should be noted that prior to hugetlb supporting demand faulting all mappings were fully populated and so applications of this type should be rare. This patch stack implements the MAP_NORESERVE mmap() flag for huge page mappings. This flag has the same meaning as for small page mappings, suppressing reservations for that mapping. Thanks to Mel Gorman for reviewing a number of early versions of these patches. This patch: When a small page mapping is created with mmap() reservations are created by default for any memory pages required. When the region is read/write the reservation is increased for every page, no reservation is needed for read-only regions (as they implicitly share the zero page). Reservations are tracked via the VM_ACCOUNT vma flag which is present when the region has reservation backing it. When we convert a region from read-only to read-write new reservations are aquired and VM_ACCOUNT is set. However, when a read-only map is created with MAP_NORESERVE it is indistinguishable from a normal mapping. When we then convert that to read/write we are forced to incorrectly create reservations for it as we have no record of the original MAP_NORESERVE. This patch introduces a new vma flag VM_NORESERVE which records the presence of the original MAP_NORESERVE flag. This allows us to distinguish these two circumstances and correctly account the reserve. As well as fixing this FIXME in the code, this makes it much easier to introduce MAP_NORESERVE support for huge pages as this flag is available consistantly for the life of the mapping. VM_ACCOUNT on the other hand is heavily used at the generic level in association with small pages. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Adam Litke <agl@us.ibm.com> Cc: Johannes Weiner <hannes@saeurebad.de> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Mel Gorman	04f2cbe356	hugetlb: guarantee that COW faults for a process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed After patch 2 in this series, a process that successfully calls mmap() for a MAP_PRIVATE mapping will be guaranteed to successfully fault until a process calls fork(). At that point, the next write fault from the parent could fail due to COW if the child still has a reference. We only reserve pages for the parent but a copy must be made to avoid leaking data from the parent to the child after fork(). Reserves could be taken for both parent and child at fork time to guarantee faults but if the mapping is large it is highly likely we will not have sufficient pages for the reservation, and it is common to fork only to exec() immediatly after. A failure here would be very undesirable. Note that the current behaviour of mainline with MAP_PRIVATE pages is pretty bad. The following situation is allowed to occur today. 1. Process calls mmap(MAP_PRIVATE) 2. Process calls mlock() to fault all pages and makes sure it succeeds 3. Process forks() 4. Process writes to MAP_PRIVATE mapping while child still exists 5. If the COW fails at this point, the process gets SIGKILLed even though it had taken care to ensure the pages existed This patch improves the situation by guaranteeing the reliability of the process that successfully calls mmap(). When the parent performs COW, it will try to satisfy the allocation without using reserves. If that fails the parent will steal the page leaving any children without a page. Faults from the child after that point will result in failure. If the child COW happens first, an attempt will be made to allocate the page without reserves and the child will get SIGKILLed on failure. To summarise the new behaviour: 1. If the original mapper performs COW on a private mapping with multiple references, it will attempt to allocate a hugepage from the pool or the buddy allocator without using the existing reserves. On fail, VMAs mapping the same area are traversed and the page being COW'd is unmapped where found. It will then steal the original page as the last mapper in the normal way. 2. The VMAs the pages were unmapped from are flagged to note that pages with data no longer exist. Future no-page faults on those VMAs will terminate the process as otherwise it would appear that data was corrupted. A warning is printed to the console that this situation occured. 2. If the child performs COW first, it will attempt to satisfy the COW from the pool if there are enough pages or via the buddy allocator if overcommit is allowed and the buddy allocator can satisfy the request. If it fails, the child will be killed. If the pool is large enough, existing applications will not notice that the reserves were a factor. Existing applications depending on the no-reserves been set are unlikely to exist as for much of the history of hugetlbfs, pages were prefaulted at mmap(), allocating the pages at that point or failing the mmap(). [npiggin@suse.de: fix CONFIG_HUGETLB=n build] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Mel Gorman	a1e78772d7	hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork() This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in a similar manner to the reservations taken for MAP_SHARED mappings. The reserve count is accounted both globally and on a per-VMA basis for private mappings. This guarantees that a process that successfully calls mmap() will successfully fault all pages in the future unless fork() is called. The characteristics of private mappings of hugetlbfs files behaviour after this patch are; 1. The process calling mmap() is guaranteed to succeed all future faults until it forks(). 2. On fork(), the parent may die due to SIGKILL on writes to the private mapping if enough pages are not available for the COW. For reasonably reliable behaviour in the face of a small huge page pool, children of hugepage-aware processes should not reference the mappings; such as might occur when fork()ing to exec(). 3. On fork(), the child VMAs inherit no reserves. Reads on pages already faulted by the parent will succeed. Successful writes will depend on enough huge pages being free in the pool. 4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper and at fault time otherwise. Before this patch, all reads or writes in the child potentially needs page allocations that can later lead to the death of the parent. This applies to reads and writes of uninstantiated pages as well as COW. After the patch it is only a write to an instantiated page that causes problems. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Johannes Weiner	9109fb7b35	mm: drop unneeded pgdat argument from free_area_init_node() free_area_init_node() gets passed in the node id as well as the node descriptor. This is redundant as the function can trivially get the node descriptor itself by means of NODE_DATA() and the node's id. I checked all the users and NODE_DATA() seems to be usable everywhere from where this function is called. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:16 -07:00
Andrew Morton	2185e69f68	mapping_set_error: add unlikely() This is called on a per-page basis and in the vast majority of cases `error' is zero. Cc: Guillaume Chazarain <guichaz@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	9023cb7e85	slob: record page flag overlays explicitly SLOB reuses two page bits for internal purposes, it overlays PG_active and PG_private. This is hidden away in slob.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	8a38082d21	slub: record page flag overlays explicitly SLUB reuses two page bits for internal purposes, it overlays PG_active and PG_error. This is hidden away in slub.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Andy Whitcroft	0cad47cf13	page-flags: record page flag overlays explicitly With the recent page flag reorganisation we have a single enum which defines the valid page flags and their values, nice and clear. However there are a number of bits which are overloaded by different subsystems. Firstly there is PG_owner_priv_1 which is used by filesystems and by XEN. Secondly both SLOB and SLUB use a couple of extra page bits to manage internal state for pages they own; both overlay other bits. All of these "aliases" are scattered about the source making it very hard for a reader to know if the bits are safe to rely on in all contexts; confusion here is bad. As we now have a single place where the bits are clearly assigned it makes sense to clarify the reuse of bits by making the aliases explicit and visible with the original bit assignments. This patch creates explicit aliases within the enum itself for the overloaded bits, creates standard bit accessors PageFoo etc. and uses those throughout. This version pulls the bit manipulation out to standard named page bit accessors as suggested by Christoph, it retains the explicit mapping to the overlayed bits. A fusion of both ideas. This has been SLUB and SLOB have been compile tested on x86_64 only, and SLUB boot tested. If people feel this is worth doing then I can run a fuller set of testing. This patch: Some page flags are used for more than one purpose, for example PG_owner_priv_1. Currently there are individual accessors for each user, each built using the common flag name far away from the bit definitions. This makes it hard to see all possible uses of these bits. Now that we have a single enum to generate the bit orders it makes sense to express overlays in the same place. So create per use aliases for this bit in the main page-flags enum and use those in the accessors. [akpm@linux-foundation.org: fix xen] Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Kentaro Makita	da3bbdd463	fix soft lock up at NFS mount via per-SB LRU-list of unused dentries [Summary] Split LRU-list of unused dentries to one per superblock to avoid soft lock up during NFS mounts and remounting of any filesystem. Previously I posted here: http://lkml.org/lkml/2008/3/5/590 [Descriptions] - background dentry_unused is a list of dentries which are not referenced. dentry_unused grows up when references on directories or files are released. This list can be very long if there is huge free memory. - the problem When shrink_dcache_sb() is called, it scans all dentry_unused linearly under spin_lock(), and if dentry->d_sb is differnt from given superblock, scan next dentry. This scan costs very much if there are many entries, and very ineffective if there are many superblocks. IOW, When we need to shrink unused dentries on one dentry, but scans unused dentries on all superblocks in the system. For example, we scan 500 dentries to unmount a filesystem, but scans 1,000,000 or more unused dentries on other superblocks. In our case , At mounting NFS, shrink_dcache_sb() is called to shrink unused dentries on NFS, but scans 100,000,000 unused dentries on superblocks in the system such as local ext3 filesystems. I hear NFS mounting took 1 min on some system in use. : NFS uses virtual filesystem in rpc layer, so NFS is affected by this problem. 100,000,000 is possible number on large systems. Per-superblock LRU of unused dentried can reduce the cost in reasonable manner. - How to fix I found this problem is solved by David Chinner's "Per-superblock unused dentry LRU lists V3"(1), so I rebase it and add some fix to reclaim with fairness, which is in Andrew Morton's comments(2). 1) http://lkml.org/lkml/2006/5/25/318 2) http://lkml.org/lkml/2006/5/25/320 Split LRU-list of unused dentries to each superblocks. Then, NFS mounting will check dentries under a superblock instead of all. But this spliting will break LRU of dentry-unused. So, I've attempted to make reclaim unused dentrins with fairness by calculate number of dentries to scan on this sb based on following way number of dentries to scan on this sb = count * (number of dentries on this sb / number of dentries in the machine) - ToDo - I have to measuring performance number and do stress tests. - When unmount occurs during prune_dcache(), scanning on same superblock, It is unable to reach next superblock because it is gone away. We restart scannig superblock from first one, it causes unfairness of reclaim unused dentries on first superblock. But I think this happens very rarely. - Test Results Result on 6GB boxes with excessive unused dentries. Without patch: $ cat /proc/sys/fs/dentry-state 10181835 10180203 45 0 0 0 # mount -t nfs 10.124.60.70:/work/kernel-src nfs real 0m1.830s user 0m0.001s sys 0m1.653s With this patch: $ cat /proc/sys/fs/dentry-state 10236610 10234751 45 0 0 0 # mount -t nfs 10.124.60.70:/work/kernel-src nfs real 0m0.106s user 0m0.002s sys 0m0.032s [akpm@linux-foundation.org: fix comments] Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: David Chinner <dgc@sgi.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Jan Beulich	42b7772812	mm: remove double indirection on tlb parameter to free_pgd_range() & Co The double indirection here is not needed anywhere and hence (at least) confusing. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Rik van Riel	28b2ee20c7	access_process_vm device memory infrastructure In order to be able to debug things like the X server and programs using the PPC Cell SPUs, the debugger needs to be able to access device memory through ptrace and /proc/pid/mem. This patch: Add the generic_access_phys access function and put the hooks in place to allow access_process_vm to access device or PPC Cell SPU memory. [riel@redhat.com: Add documentation for the vm_ops->access function] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Nick Piggin	0d71d10a42	mm: remove nopfn There are no users of nopfn in the tree. Remove it. [hugh@veritas.com: fix build error] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:15 -07:00
Adrian Bunk	c748e1340e	mm/vmstat.c: proper externs This patch adds proper extern declarations for five variables in include/linux/vmstat.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
KOSAKI Motohiro	e4048e5dc4	page allocator: inline some __alloc_pages() wrappers Two zonelist patch series rewrote __page_alloc() largely. Now, it is just a wrapper function. Inlining them will save a function call. [akpm@linux-foundation.org: export __alloc_pages_internal] Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Johannes Weiner	ffc6421f07	mm: unexport __alloc_bootmem_core() This function has no external callers, so unexport it. Also fix its naming inconsistency. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Johannes Weiner	b61bfa3c46	mm: move bootmem descriptors definition to a single place There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
FUJITA Tomonori	8b05c7e6e1	add a helper function to test if an object is on the stack lib/debugobjects.c has a function to test if an object is on the stack. The block layer and ide needs it (they need to avoid DMA from/to stack buffers). This patch moves the function to include/linux/sched.h so that everyone can use it. lib/debugobjects.c uses current->stack but this patch uses a task_stack_page() accessor, which is a preferable way to access the stack. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:14 -07:00
Akinobu Mita	e108526e77	move memory_read_from_buffer() from fs.h to string.h James Bottomley warns that inclusion of linux/fs.h in a low level driver was always a danger signal. This patch moves memory_read_from_buffer() from fs.h to string.h and fixes includes in existing memory_read_from_buffer() users. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Bob Moore <robert.moore@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:13 -07:00
Matthew Wilcox	b552068999	Remove __DECLARE_SEMAPHORE_GENERIC There are no users of __DECLARE_SEMAPHORE_GENERIC in the kernel Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-07-24 08:31:21 -04:00
Artem Bityutskiy	85c6e6e282	UBI: amend commentaries Hch asked not to use "unit" for sub-systems, let it be so. Also some other commentaries modifications. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:56 +03:00
Artem Bityutskiy	a5bf619041	UBI: add ubi_sync() interface To flush MTD device caches. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-24 13:32:56 +03:00
Linus Torvalds	7f9dce3837	Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: hrtick_enabled() should use cpu_active() sched, x86: clean up hrtick implementation sched: fix build error, provide partition_sched_domains() unconditionally sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed cpu hotplug: Make cpu_active_map synchronization dependency clear cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2) sched: rework of "prioritize non-migratable tasks over migratable ones" sched: reduce stack size in isolated_cpu_setup() Revert parts of "ftrace: do not trace scheduler functions" Fixed up conflicts in include/asm-x86/thread_info.h (due to the TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr() introduction).	2008-07-23 19:36:53 -07:00
Linus Torvalds	26dcce0fab	Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually	2008-07-23 18:37:44 -07:00
Linus Torvalds	d7b6de14a0	Merge branch 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softlockup: fix invalid proc_handler for softlockup_panic softlockup: fix watchdog task wakeup frequency softlockup: fix watchdog task wakeup frequency softlockup: show irqtrace softlockup: print a module list on being stuck softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds softlockup: fix softlockup_thresh fix softlockup: fix softlockup_thresh unaligned access and disable detection at runtime softlockup: allow panic on lockup	2008-07-23 18:34:13 -07:00
Linus Torvalds	30d38542ec	Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm * 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (85 commits) [ARM] pxa: add base support for PXA930 Handheld Platform (aka SAAR) [ARM] pxa: add base support for PXA930 Evaluation Board (aka TavorEVB) [ARM] pxa: add base support for PXA930 (aka Tavor-P) [ARM] Update mach-types [ARM] pxa: make littleton to use the new smc91x platform data [ARM] pxa: make zylonite to use the new smc91x platform data [ARM] pxa: make mainstone to use the new smc91x platform data [ARM] pxa: make lubbock to use new smc91x platform data [NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data [NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable [NET] smc91x: add SMC91X_NOWAIT flag to platform data [NET] smc91x: favor the use of SMC91X_USE_* instead of SMC_CAN_USE_* [NET] smc91x: remove "irq_flags" from "struct smc91x_platdata" [ARM] 5146/1: pxa2xx: convert all boards to call pxa2xx_transceiver_mode helper Support for LCD on e740 e750 e400 and e800 e-series PDAs E-series UDC support PXA UDC - allow use of inverted GPIO for pullup Add e350 support Fix broken e-series build E-series GPIO / IRQ definitions. ...	2008-07-23 18:24:08 -07:00
Dan Williams	d8e64406a0	md: delay notification of 'active_idle' to the recovery thread sysfs_notify might sleep, so do not call it from md_safemode_timeout. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-23 13:09:48 -07:00
Linus Torvalds	20b7997e8a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: sdhci: highmem capable PIO routines sg: reimplement sg mapping iterator mmc_test: print message when attaching to card mmc: Remove Russell as primecell mci maintainer mmc_block: bounce buffer highmem support sdhci: fix bad warning from commit `c8b3e02` sdhci: add warnings for bad buffers in ADMA path mmc_test: test oversized sg lists mmc_test: highmem tests s3cmci: ensure host stopped on machine shutdown au1xmmc: suspend/resume implementation s3cmci: fixes for section mismatch warnings pxamci: trivial fix of DMA alignment register bit clearing	2008-07-23 12:04:34 -07:00
Linus Torvalds	5554b35933	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (24 commits) I/OAT: I/OAT version 3.0 support I/OAT: tcp_dma_copybreak default value dependent on I/OAT version I/OAT: Add watchdog/reset functionality to ioatdma iop_adma: cleanup iop_chan_xor_slot_count iop_adma: document how to calculate the minimum descriptor pool size iop_adma: directly reclaim descriptors on allocation failure async_tx: make async_tx_test_ack a boolean routine async_tx: remove depend_tx from async_tx_sync_epilog async_tx: export async_tx_quiesce async_tx: fix handling of the "out of descriptor" condition in async_xor async_tx: ensure the xor destination buffer remains dma-mapped async_tx: list_for_each_entry_rcu() cleanup dmaengine: Driver for the Synopsys DesignWare DMA controller dmaengine: Add slave DMA interface dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap dmaengine: Add dma_client parameter to device_alloc_chan_resources dmatest: Simple DMA memcpy test client dmaengine: DMA engine driver for Marvell XOR engine iop-adma: fix platform driver hotplug/coldplug dmaengine: track the number of clients using a channel ... Fixed up conflict in drivers/dca/dca-sysfs.c manually	2008-07-23 12:03:18 -07:00
Linus Torvalds	e669e8179d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (60 commits) ide: small whitespace fixes ide: ide-cd_ioctl.c fix sparse integer as NULL pointer warnings ide: ide-cd.c fix sparse endianness warnings ide-cd: convert to using the new atapi_flags ide: remove unused PC_FLAG_DRQ_INTERRUPT ide-scsi: convert to using the new atapi_flags ide-tape: convert to using the new atapi_flags ide-floppy: convert to using the new atapi_flags (take 2) ide: add per-device flags ide: use rq->cmd instead of pc->c in atapi common code ide-scsi: pass packet command in rq->cmd ide-tape: pass packet command in rq->cmd ide-tape: make room for packet command ids in rq->cmd ide-floppy: pass packet command in rq->cmd ide: remove pc->callback member from ide_atapi_pc ide-scsi: use drive->pc_callback instead of pc->callback ide-tape: use drive->pc_callback instead of pc->callback ide-floppy: use drive->pc_callback instead of pc->callback ide: push pc callback pointer into the ide_drive_t structure drivers/ide/ide-tape.c: remove double kfree ...	2008-07-23 11:59:09 -07:00
Dmitry Torokhov	a822bea796	Input: serio - mark serio_register_driver() __must_check Also remove extra declaration of serio_register_driver(). Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-23 14:01:49 -04:00
Borislav Petkov	ac77ef8b03	ide: remove unused PC_FLAG_DRQ_INTERRUPT There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	ea68d270ff	ide-floppy: convert to using the new atapi_flags (take 2) while at it, remove PC_FLAG_ZIP_DRIVE from the packed command flags altogether and query the drive type through drive->atapi_flags. v2: ide-floppy fix. There should be no functionality change resulting from this patch. [bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags] Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	3b8ac5398c	ide: add per-device flags Push device flags up into ide_drive_t. There should be no functionality change resulting from this patch. [bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags] Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:01 +02:00
Borislav Petkov	8bcda3bc49	ide: remove pc->callback member from ide_atapi_pc There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:56:00 +02:00
Borislav Petkov	d7c26ebb5b	ide: push pc callback pointer into the ide_drive_t structure Refrain from carrying the callback ptr with every packet command since the callback function is only one anyways. ide_drive_t is probably not the most suitable place for it right now but is the more sane solution. Besides, these structs are going to be reorganized anyways during the generic ide rewrite. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz	8a69580e1e	ide: add ide_host_free() helper (take 2) * Add ide_host_free() helper and convert ide_host_remove() to use it. * Fix handling of ide_host_register() failure in ide_host_add(), icside.c, ide-generic.c, falconide.c and sgiioc4.c. While at it: * Fix handling of ide_host_alloc_all() failure in ide-generic.c. * Fix handling of ide_host_alloc() failure in falconide.c (also return the correct error value if no device is found). v2: * falconide build fix. (From Stephen Rothwell) Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz	6f904d0152	ide: add ide_host_add() helper Add ide_host_add() helper which does ide_host_alloc()+ide_host_register(), then convert ide_setup_pci_device[s](), ide_legacy_device_add() and some host drivers to use it. While at it: * Fix ide_setup_pci_device[s](), ide_arm.c, gayle.c, ide-4drives.c, macide.c, q40ide.c, cmd640.c and cs5520.c to return correct error value. * -ENOENT -> -ENOMEM in rapide.c, ide-h8300.c, ide-generic.c, au1xxx-ide.c and pmac.c * -ENODEV -> -ENOMEM in palm_bk3710.c, ide_platform.c and delkin_cb.c * -1 -> -ENOMEM in ide-pnp.c Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz	48c3c10726	ide: add struct ide_host (take 3) * Add struct ide_host which keeps pointers to host's ports. * Add ide_host_alloc[_all]() and ide_host_remove() helpers. * Pass 'struct ide_host host' instead of 'u8 idx' to ide_device_add[_all]() and rename it to ide_host_register[_all](). * Convert host drivers and core code to use struct ide_host. * Remove no longer needed ide_find_port(). * Make ide_find_port_slot() static. * Unexport ide_unregister(). v2: * Add missing 'struct ide_host host' to macide.c. v3: Fix build problem in pmac.c (s/ide_alloc_host/ide_host_alloc/) (Noticed by Stephen Rothwell). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz	374e042c3e	ide: add struct ide_tp_ops (take 2) * Add struct ide_tp_ops for transport methods. * Add 'const struct ide_tp_ops tp_ops' to struct ide_port_info and ide_hwif_t. Set the default hwif->tp_ops in ide_init_port_data(). * Set host driver specific hwif->tp_ops in ide_init_port(). * Export ide_exec_command(), ide_read_status(), ide_read_altstatus(), ide_read_sff_dma_status(), ide_set_irq(), ide_tf_{load,read}() and ata_{in,out}put_data(). * Convert host drivers and core code to use struct ide_tp_ops. * Remove no longer needed default_hwif_transport(). * Cleanup ide_hwif_t from methods that are now in struct ide_tp_ops. While at it: * Use struct ide_port_info in falconide.c and q40ide.c. * Rename ata_{in,out}put_data() to ide_{in,out}put_data(). v2: * Fix missing convertion in ns87415.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	d6276b5f5c	ide: add 'config' field to hw_regs_t Add 'config' field to hw_regs_t and use it to set hwif->config_data in ide_init_port_hw(), then convert ide_legacy_init_one() to use hw->config. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	3b2a5c7149	ide: filter out "default" transfer mode values in set_xfer_rate() * Filter out "default" transfer mode values (0x00 - default PIO mode, 0x01 - default PIO mode w/ IORDY disabled) in write handler for obsoleted /proc/ide/hd?/settings:current_speed setting. Allowing "default" transfer mode values is a dangerous thing to do as we don't support programming controller to the "default" transfer mode and devices often use different values for the default and maximum PIO mode (i.e. PIO2 default and PIO4 maximum) so the controller will stay programmed for higher PIO mode while device will use the lower PIO mode. There is no functionality loss as by using special IOCTLs device can still be programmed to "default" transfer modes (it is only useful for debugging/testing purposes anyway). * Remove no longer needed IDE_HFLAG_ABUSE_SET_DMA_MODE host flag, it was previously used by few host drivers to program the controller to PIO0 timings for "default" transfer mode == 0x01 (although some host drivers would program invalid PIO timings instead). * Cleanup ide_set_xfer_rate() and add BUG_ON(). Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz	ba4b2e607e	ide: remove dead Virtual DMA support Lets remove dead Virtual DMA support for now so it doesn't clutter core IDE code (it can be bring back when there is a need for it): * Remove IDE_HFLAG_VDMA host flag. * Remove ide_drive_t.vdma flag. * cs5520.c: remove stale FIXMEs, cs5520_dma_host_set() and cs5520_dma_ops (also there is no longer a need to set IDE_HFLAG_NO_ATAPI_DMA). There should be no functional changes caused by this patch. Cc: TAKADA Yoshihito <takada@mbf.nifty.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:55 +02:00
Bartlomiej Zolnierkiewicz	761052e676	ide: remove ->INB, ->OUTB and ->OUTBSYNC methods * Remove no longer needed ->INB, ->OUTB and ->OUTBSYNC methods. Then: * Remove no longer used default_hwif_[mm]iops() and ide_[mm_]outbsync(). * Cleanup SuperIO handling in ns87415.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz	1823649b5a	ide: add ide_read_bcount_and_ireason() helper Add ide_read_bcount_and_ireason() helper and use it instead of ->INB in {cdrom_newpc,ide_pc}_intr(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz	92eb43800a	ide: use ->tf_read in ide_read_error() * Add IDE_TFLAG_IN_FEATURE taskfile flag for reading Feature register and handle it in ->tf_read. * Convert ide_read_error() to use ->tf_read instead of ->INB, then uninline and export it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:53 +02:00
Bartlomiej Zolnierkiewicz	6e6afb3b74	ide: add ->set_irq method Add ->set_irq method for setting nIEN bit of ATA Device Control register and use it instead of ide_set_irq(). While at it: * Use ->set_irq in init_irq() and do_reset1(). * Don't use HWIF() macro in ide_check_pm_state(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	1f6d8a0fd8	ide: add ->read_altstatus method * Remove ide_read_altstatus() inline helper. * Add ->read_altstatus method for reading ATA Alternate Status register and use it instead of ->INB. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	b73c7ee25d	ide: add ->read_status method * Remove ide_read_status() inline helper. * Add ->read_status method for reading ATA Status register and use it instead of ->INB. While at it: * Don't use HWGROUP() macro. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz	c6dfa867bb	ide: add ->exec_command method Add ->exec_command method for writing ATA Command register and use it instead of ->OUTBSYNC. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	ebb00fb55d	ide: factor out simplex handling from ide_pci_dma_base() * Factor out simplex handling from ide_pci_dma_base() to ide_pci_check_simplex(). * Set hwif->dma_base early in ->init_dma method / ide_hwif_setup_dma() and reset it in ide_init_port() if DMA initialization fails. * Use ->read_sff_dma_status instead of ->INB in ide_pci_dma_base(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	81e8d5a34f	ide: remove ide_setup_dma() Export sff_dma_ops and then remove ide_setup_dma(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	cab7f8eda4	ide: remove ->dma_{status,command} fields from ide_hwif_t * Use ->dma_base + offset instead of ->dma_{status,command} and remove no longer needed ->dma_{status,command}. While at it: * Use ATA_DMA_* defines. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz	b2f951aabc	ide: add ->read_sff_dma_status method Add ->read_sff_dma_status method for reading DMA Status register and use it instead of ->INB. While at it: * Use inb() directly in ns87415.c::ns87415_dma_end(). There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:50 +02:00
Bartlomiej Zolnierkiewicz	c97c6aca75	ide: pass hw_regs_t-s to ide_device_add[_all]() (take 3) * Add 'hw_regs_t *hws' argument to ide_device_add[_all]() and convert host drivers + ide_legacy_init_one() + ide_setup_pci_device[s]() to use it instead of calling ide_init_port_hw() directly. [ However if host has > 1 port we must still set hwif->chipset to hint consecutive ide_find_port() call that the previous slot is occupied. ] Unexport ide_init_port_hw(). v2: * Use defines instead of hard-coded values in buddha.c, gayle.c and q40ide.c. (Suggested by Geert Uytterhoeven) * Better patch description. v3: * Fix build problem in ide-cs.c. (Noticed by Stephen Rothwell) There should be no functional changes caused by this patch. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-23 19:55:50 +02:00
Roland Dreier	95d04f0735	IB/mlx4: Add support for memory management extensions and local DMA L_Key Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-23 08:12:26 -07:00
Rafi Rubin	f472f80034	HID: add n-trig digitizer usage This adds a hid usage that is reported by the N-Trig digitizer in the Dell Latitude XT screen. Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu> Signed-off-by: Vojtech Pavlik <vojtech@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-07-23 15:25:21 +02:00
Tejun Heo	137d3edb48	sg: reimplement sg mapping iterator This is alternative implementation of sg content iterator introduced by commit 83e7d317... from Pierre Ossman in next-20080716. As there's already an sg iterator which iterates over sg entries themselves, name this sg_mapping_iterator. Slightly edited description from the original implementation follows. Iteration over a sg list is not that trivial when you take into account that memory pages might have to be mapped before being used. Unfortunately, that means that some parts of the kernel restrict themselves to directly accesible memory just to not have to deal with the mess. This patch adds a simple iterator system that allows any code to easily traverse an sg list and not have to deal with all the details. The user can decide to consume part of the iteration. Also, iteration can be stopped and resumed later if releasing the kmap between iteration steps is necessary. These features are useful to implement piecemeal sg copying for interrupt drive PIO for example. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-23 14:42:09 +02:00
Nate Case	f46e9203d9	leds: Add support for Philips PCA955x I2C LED drivers This driver supports the PCA9550, PCA9551, PCA9552, and PCA9553 LED driver chips. Signed-off-by: Nate Case <ncase@xes-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Anton Vorontsov	781a54e766	leds: mark led_classdev.default_trigger as const LED classdev core doesn't modify memory pointed by the default_trigger, so mark it as const and we'll able to pass const char *s without getting compiler warnings. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Riku Voipio	e14fa82439	leds: Add pca9532 led driver NXP pca9532 is a LED dimmer/controller attached to i2c bus. It allows attaching upto 16 leds which can either be on, off or dimmed and/or blinked with the two PWM modulators available. This driver is a "new-style" i2c driver that adheres to the driver model and implements the led framework api. Since the leds connected to the driver are platform specific, it is only useful when platform data is passed to the driver to define what leds are connected to which pins. Signed-off-by: Riku Voipio <riku.voipio@iki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2008-07-23 09:49:56 +01:00
Linus Torvalds	c010b2f76c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits) ipw2200: Call netif_*_queue() interfaces properly. netxen: Needs to include linux/vmalloc.h [netdrvr] atl1d: fix !CONFIG_PM build r6040: rework init_one error handling r6040: bump release number to 0.18 r6040: handle RX fifo full and no descriptor interrupts r6040: change the default waiting time r6040: use definitions for magic values in descriptor status r6040: completely rework the RX path r6040: call napi_disable when puting down the interface and set lp->dev accordingly. mv643xx_eth: fix NETPOLL build r6040: rework the RX buffers allocation routine r6040: fix scheduling while atomic in r6040_tx_timeout r6040: fix null pointer access and tx timeouts r6040: prefix all functions with r6040 rndis_host: support WM6 devices as modems at91_ether: use netstats in net_device structure sfc: Create one RX queue and interrupt per CPU package by default sfc: Use a separate workqueue for resets sfc: I2C adapter initialisation fixes ...	2008-07-22 19:09:51 -07:00
David S. Miller	7cf75262a4	Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-07-22 17:54:47 -07:00
Maciej Sosnowski	7f1b358a23	I/OAT: I/OAT version 3.0 support This patch adds to ioatdma and dca modules support for Intel I/OAT DMA engine ver.3 (aka CB3 device). The main features of I/OAT ver.3 are: * 8 single channel DMA devices (8 channels total) * 8 DCA providers, each can accept 2 requesters * 8-bit TAG values and 32-bit extended APIC IDs Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-22 17:30:57 -07:00
Rafael J. Wysocki	e5899e1b7d	PCI PM: make more PCI PM core functionality available to drivers Make more PCI PM core functionality available to drivers * Export pci_pme_capable() so that it can be called directly by drivers (for example, tg3 needs that). * Move the state choosing part of pci_prepare_to_sleep() to a separate function, pci_target_state(), that can be called directly by drivers (for example, tg3 needs that). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-22 14:25:38 -07:00
Roland Dreier	47b374752a	IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg Make the struct name consistent with other WQE segment struct types defined in <linux/mlx4/qp.h>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:19:39 -07:00
Adrian Bunk	8086cd451f	netns: make get_proc_net() static get_proc_net() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-22 14:19:19 -07:00
Dave Jones	d29f749e25	net: Fix build failure with 'make mandocs'. The function header comments have to go with the functions they are documenting, or things go horribly wrong when we try to process them with the docbook tools. Warning(include/linux/netdevice.h:1006): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1033): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1067): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1093): No description found for parameter 'dev_queue' Warning(include/linux/netdevice.h:1474): No description found for parameter 'txq' Error(net/core/dev.c:1674): cannot understand prototype: 'u32 simple_tx_hashrnd; ' Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-22 14:09:06 -07:00
Linus Torvalds	6eaaaac974	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: remove CONFIG_KMOD from core kernel code remove CONFIG_KMOD from lib remove CONFIG_KMOD from sparc64 rework try_then_request_module to do less in non-modular kernels remove mention of CONFIG_KMOD from documentation make CONFIG_KMOD invisible modules: Take a shortcut for checking if an address is in a module module: turn longs into ints for module sizes Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs module: reorder struct module to save space on 64 bit builds module: generic each_symbol iterator function module: don't use stop_machine for waiting rmmod	2008-07-22 13:17:15 -07:00
Linus Torvalds	06b8147c5d	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (49 commits) powerpc: Fix build bug with binutils < 2.18 and GCC < 4.2 powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded fbdev: Teaches offb about palette on radeon r5xx/r6xx powerpc/cell/edac: Log a syndrome code in case of correctable error powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code powerpc: Indicate which oprofile counters to use while in compat mode powerpc/boot: Change spaces to tabs powerpc: Remove duplicate 6xx option in Kconfig powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S powerpc: Use PPC_LONG_ALIGN in uaccess.h powerpc: Add a #define for aligning to a long-sized boundary powerpc: Fix OF parsing of 64 bits PCI addresses powerpc: Use WARN_ON(1) instead of __WARN() powerpc: Fix support for latencytop powerpc/ps3: Update ps3_defconfig powerpc/ps3: Add a sub-match id to ps3_system_bus powerpc: Add a 6xx defconfig powerpc/dma: Use the struct dma_attrs in iommu code powerpc/cell: Add support for power button of future IBM cell blades powerpc/cell: Cleanup sysreset_hack for IBM cell blades ...	2008-07-22 13:16:01 -07:00
Linus Torvalds	53baaaa968	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (79 commits) arm: bus_id -> dev_name() and dev_set_name() conversions sparc64: fix up bus_id changes in sparc core code 3c59x: handle pci_name() being const MTD: handle pci_name() being const HP iLO driver sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute sysdev: Add utility functions for simple int/ulong variable sysdev attributes sysdev: Pass the attribute to the low level sysdev show/store function driver core: Suppress sysfs warnings for device_rename(). kobject: Transmit return value of call_usermodehelper() to caller sysfs-rules.txt: reword API stability statement debugfs: Implement debugfs_remove_recursive() HOWTO: change email addresses of James in HOWTO always enable FW_LOADER unless EMBEDDED=y uio-howto.tmpl: use unique output names uio-howto.tmpl: use standard copyright/legal markings sysfs: don't call notify_change sysdev: fix debugging statements in registration code. kobject: should use kobject_put() in kset-example kobject: reorder kobject to save space on 64 bit builds ...	2008-07-22 13:13:47 -07:00
Laurent Pinchart	217d5a5195	fs_enet: Remove unused fields in the fs_mii_bb_platform_info structure. The mdio_port, mdio_bit, mdc_port and mdc_bit fields in the fs_mii_bb_platform_info structure are left-overs from the move to the Phy Abstraction Layer subsystem. They are not used anymore and can be safely removed. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-22 16:08:58 -04:00
Paul Fulghum	e5590717af	synclink_gt: add serial bit order control Add control of hardware serial bit order between LSB first (default/standard) and MSB first. Signed-off-by: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:29 -07:00
Alan Cox	9e98966c7b	tty: rework break handling Some hardware needs to do break handling itself and may have partial support only. Make break_ctl return an error code. Add a tty driver flag so you can indicate driver hardware side break support. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:28 -07:00
Alan Cox	01e1abb2c2	tty: Split ldisc code into its own file Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:27 -07:00
Alan Cox	95da310e66	usb_serial: API all change USB serial likes to use port->tty back pointers for the real work it does and to do so without any actual locking. Unfortunately when you consider hangup events, hangup/parallel reopen or even worse hangup followed by parallel close events the tty->port and port->tty pointers are not guaranteed to be the same as port->tty is the active tty while tty->port is the port the tty may or may not still be attached to. So rework the entire API to pass the tty struct. For console cases we need to pass both for now. This shows up multiple drivers that immediately crash with USB console some of which have been fixed in the process. Longer term we need a proper tty as console abstraction Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 13:03:22 -07:00
Juergen Beisert	0c36ec3147	gpio: gpio driver for max7301 SPI GPIO expander Maxim's MAX7301 is an SPI GPIO expander with 28 GPIOs. Note: MAX7301's interrupt feature is not supported yet. [akpm@linux-foundation.org: coding-style fixes] [g.liakhovetski@pengutronix.de: Fix inaccuracies in comments, check spi_setup() return code, mask off high byte in max7301_read()] Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 09:59:40 -07:00
John Reiser	6519108746	execve filename: document and export via auxiliary vector The Linux kernel puts the filename argument of execve() into the new address space. Many developers are surprised to learn this. Those who know and could use it, object "But it's not documented." Those who want to use it dislike the expression (char *)(1+ strlen(env[-1+ n_env]) + env[-1+ n_env]) because it requires locating the last original environment variable, and assumes that the filename follows the characters. This patch documents the insertion of the filename, and makes it easier to find by adding a new tag AT_EXECFN in the ElfXX_auxv_t; see <elf.h>. In many cases readlink("/proc/self/exe",) gives the same answer. But if all the original pages get unmapped, then the kernel erases the symlink for /proc/self/exe. This can happen when a program decompressor does a good job of cleaning up after uncompressing directly to memory, so that the address space of the target program looks the same as if compression had never happened. One example is http://upx.sourceforge.net . One notable use of the underlying concept (what path containED the executable) is glibc expanding $ORIGIN in DT_RUNPATH. In practice for the near term, it may be a good idea for user-mode code to use both /proc/self/exe and AT_EXECFN as fall-back methods for each other. /proc/self/exe can fail due to unmapping, AT_EXECFN can fail because it won't be present on non-new systems. The auxvec or {AT_EXECFN}.d_val also can get overwritten, although in nearly all cases this would be the result of a bug. The runtime cost is one NEW_AUX_ENT using two words of stack space. The underlying value is maintained already as bprm->exec; setup_arg_pages() in fs/exec.c slides it for stack_shift, etc. Signed-off-by: John Reiser <jreiser@BitWagon.com> Cc: Roland McGrath <roland@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-22 09:59:40 -07:00
Johannes Berg	a1ef5adb4c	remove CONFIG_KMOD from core kernel code Always compile request_module when the kernel allows modules. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:31 +10:00
Johannes Berg	df648c9fbe	rework try_then_request_module to do less in non-modular kernels This reworks try_then_request_module to only invoke the "lookup" function "x" once when the kernel is not modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:29 +10:00
Denys Vlasenko	2f0f2a334b	module: turn longs into ints for module sizes This shrinks module.o and each *.ko file. And finally, structure members which hold length of module code (four such members there) and count of symbols are converted from longs to ints. We cannot possibly have a module where 32 bits won't be enough to hold such counts. For one, module loading checks module size for sanity before loading, so such insanely big module will fail that test first. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:27 +10:00
Denys Vlasenko	f7f5b67557	Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs module.c and module.h conatains code for finding exported symbols which are declared with EXPORT_UNUSED_SYMBOL, and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway (because EXPORT_UNUSED_SYMBOL(x) are compiled out to nothing then). This patch adds required #ifdefs. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:27 +10:00
Richard Kennedy	af5406895a	module: reorder struct module to save space on 64 bit builds reorder struct module to save space on 64 bit builds. saves 1 cacheline_size (128 on default x86_64 & 64 on AMD Opteron/athlon) when CONFIG_MODULE_UNLOAD=y. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-22 19:24:26 +10:00
Benjamin Herrenschmidt	8725f25acc	Merge commit 'origin/master' Manually fixed up: drivers/net/fs_enet/fs_enet-main.c	2008-07-22 17:12:37 +10:00
Ingo Molnar	76c3bb15d6	Merge branch 'linus' into x86/x2apic	2008-07-22 09:06:21 +02:00
Greg Kroah-Hartman	eadcf0d704	MTD: handle pci_name() being const This changes the MTD core to handle pci_name() now returning a constant string. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:03 -07:00
Andi Kleen	9800794ac1	sysdev: Add utility functions for simple int/ulong variable sysdev attributes This adds a new sysdev_ext_attribute that stores a pointer to the variable it manages and some utility functions/macro to easily use them. Previously all users wrote custom macros to generate show/store functions for each variable, with this it is possible to avoid that in many cases. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Andi Kleen	4a0b2b4dbe	sysdev: Pass the attribute to the low level sysdev show/store function This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Cornelia Huck	36ce6dad6e	driver core: Suppress sysfs warnings for device_rename(). driver core: Suppress sysfs warnings for device_rename(). Renaming network devices to an already existing name is not something we want sysfs to print a scary warning for, since the callers can deal with this correctly. So let's introduce sysfs_create_link_nowarn() which gets rid of the common warning. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:01 -07:00
Haavard Skinnemoen	9505e63756	debugfs: Implement debugfs_remove_recursive() debugfs_remove_recursive() will remove a dentry and all its children. Drivers can use this to zap their whole debugfs tree so that they don't need to keep track of every single debugfs dentry they created. It may fail to remove the whole tree in certain cases: sh-3.2# rmmod atmel-mci < /sys/kernel/debug/mmc0/ios/clock mmc0: card b368 removed atmel_mci atmel_mci.0: Lost dma0chan1, falling back to PIO sh-3.2# ls /sys/kernel/debug/mmc0/ ios But I'm not sure if that case can be handled in any sane manner. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Cc: Pierre Ossman <drzeus-list@drzeus.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:59 -07:00
Richard Kennedy	a231934bdf	kobject: reorder kobject to save space on 64 bit builds reorder kobject to save space on 64 bit builds. shrinks from 72 to 64 bytes & moves allocated kobject to a smaller slab. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:56 -07:00
Uwe Kleine-König	6d8333c24d	UIO: minor style and comment fixes Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Signed-off-by: Hans J. Koch <hjk@linutronix.de>	2008-07-21 21:54:55 -07:00
Hans J. Koch	328a14e70e	UIO: Add write function to allow irq masking Sometimes it is necessary to enable/disable the interrupt of a UIO device from the userspace part of the driver. With this patch, the UIO kernel driver can implement an "irqcontrol()" function that does this. Userspace can write an s32 value to /dev/uioX (usually 0 or 1 to turn the irq off or on). The UIO core will then call the driver's irqcontrol function. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Acked-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:55 -07:00
Greg Kroah-Hartman	b98cb4b7fe	driver core: remove DEVICE_ID_SIZE define There is no such thing as a "device id size" in the driver core, so remove the define and fix up any users of this odd define in the rest of the kernel. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:53 -07:00
Kay Sievers	ca52a49846	driver core: remove DEVICE_NAME_SIZE define There is no such thing as a "device name size" in the driver core, so remove the define and fix up any users of this odd define in the rest of the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:53 -07:00
Kay Sievers	aab0de2451	driver core: remove KOBJ_NAME_LEN define Kobjects do not have a limit in name size since a while, so stop pretending that they do. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:52 -07:00
Matthew Wilcox	d2a3b9146e	class: add lockdep infrastructure This adds the infrastructure to properly handle lockdep issues when the internal class semaphore is changed to a mutex. Matthew wrote the original patch, and Greg fixed it up to work properly with the class_create() function. From: Matthew Wilcox <matthew@wil.cx> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:52 -07:00
Greg Kroah-Hartman	7c71448b8a	class: move driver core specific parts to a private structure This moves the portions of struct class that are dynamic (kobject and lock and lists) out of the main structure and into a dynamic, private, structure. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:51 -07:00
Greg Kroah-Hartman	695794ae0c	Driver Core: add ability for class_find_device to start in middle of list This mirrors the functionality that driver_find_device has as well. We add a start variable, and all callers of the function are fixed up at the same time. The block layer will be using this new functionality in a follow-on patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	93562b5376	Driver Core: add ability for class_for_each_device to start in middle of list This mirrors the functionality that driver_for_each_device has as well. We add a start variable, and all callers of the function are fixed up at the same time. The block layer will be using this new functionality in a follow-on patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	4e10673944	device create: convert device_create_drvdata to device_create Now that device_create() has been audited, rename things back to the original call to be sane. Keep the device_create_drvdata macro around to make merges easier. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman	ccea44fadc	driver core: remove device_create() There are no more users of this, and it is racy. Use device_create_drvdata() or device_create_vargs() instead. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:47 -07:00
Dan Williams	e105b8bfc7	sysfs: add /sys/dev/{char,block} to lookup sysfs path by major:minor Why?: There are occasions where userspace would like to access sysfs attributes for a device but it may not know how sysfs has named the device or the path. For example what is the sysfs path for /dev/disk/by-id/ata-ST3160827AS_5MT004CK? With this change a call to stat(2) returns the major:minor then userspace can see that /sys/dev/block/8:32 links to /sys/block/sdc. What are the alternatives?: 1/ Add an ioctl to return the path: Doable, but sysfs is meant to reduce the need to proliferate ioctl interfaces into the kernel, so this seems counter productive. 2/ Use udev to create these symlinks: Also doable, but it adds a udev dependency to utilities that might be running in a limited environment like an initramfs. 3/ Do a full-tree search of sysfs. [kay.sievers@vrfy.org: fix duplicate registrations] [kay.sievers@vrfy.org: cleanup suggestions] Cc: Neil Brown <neilb@suse.de> Cc: Tejun Heo <htejun@gmail.com> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Reviewed-by: SL Baur <steve@xemacs.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Mark Lord <lkml@rtr.ca> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:40 -07:00
Mark Nelson	1ed6af7344	powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code Introduce a new dma attriblue DMA_ATTR_WEAK_ORDERING to use weak ordering on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU implementation to use this code. Dynamic mappings can be weakly or strongly ordered on an individual basis but the fixed mapping has to be either completely strong or completely weak. This is currently decided by a kernel boot option (pass iommu_fixed=weak for a weakly ordered fixed linear mapping, strongly ordered is the default). Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:36 +10:00
Wolfgang Grandegger	18ad7a61e1	of_gpio: Should use new <linux/gpio.h> header Since commit `7560fa60fc` (gpio: <linux/gpio.h> and "no GPIO support here" stubs) drivers can use GPIOs if they're available, but don't require them. This patch actually enables this feature. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:30 +10:00
Linus Torvalds	93ded9b8fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits) usb-storage: revert DMA-alignment change for Wireless USB USB: use reset_resume when normal resume fails usb_gadget: composite cdc gadget fault handling usb gadget: minor USBCV fix for composite framework USB: Fix bug with byte order in isp116x-hcd.c fio write/read USB: fix double kfree in ipaq in error case USB: fix build error in cdc-acm for CONFIG_PM=n USB: remove board-specific UP2OCR configuration from pxa27x-udc USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx USB: Fix pointer/int cast in USB devio code usb gadget: g_cdc dependso on NET USB: Au1xxx-usb: suspend/resume support. USB: Au1xxx-usb: clean up ohci/ehci bus glue sources. usbfs: don't store bad pointers in registration usbfs: fix race between open and unregister usbfs: simplify the lookup-by-minor routines usbfs: send disconnect signals when device is unregistered USB: Force unbinding of drivers lacking reset_resume or other methods USB: ohci-pnx4008: I2C cleanups and fixes USB: debug port converter does not accept more than 8 byte packets ...	2008-07-21 15:42:53 -07:00
Alan Stern	78d9a487ee	USB: Force unbinding of drivers lacking reset_resume or other methods This patch (as1024) takes care of a FIXME issue: Drivers that don't have the necessary suspend, resume, reset_resume, pre_reset, or post_reset methods will be unbound and their interface reprobed when one of the unsupported events occurs. This is made slightly more difficult by the fact that bind operations won't work during a system sleep transition. So instead the code has to defer the operation until the transition ends. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:40 -07:00
Ming Lei	742120c631	USB: fix usb_reset_device and usb_reset_composite_device(take 3) This patch renames the existing usb_reset_device in hub.c to usb_reset_and_verify_device and renames the existing usb_reset_composite_device to usb_reset_device. Also the new usb_reset_and_verify_device does't need to be EXPORTED . The idea of the patch is that external interface driver should warn the other interfaces' driver of the same device before and after reseting the usb device. One interface driver shoud call _old_ usb_reset_composite_device instead of _old_ usb_reset_device since it can't assume the device contains only one interface. The _old_ usb_reset_composite_device is safe for single interface device also. we rename the two functions to make the change easily. This patch is under guideline from Alan Stern. Signed-off-by: Ming Lei <tom.leiming@gmail.com>	2008-07-21 15:16:33 -07:00
Ming Lei	625f694936	USB: remove interface parameter of usb_reset_composite_device From the current implementation of usb_reset_composite_device function, the iface parameter is no longer useful. This function doesn't do something special for the iface usb_interface,compared with other interfaces in the usb_device. So remove the parameter and fix the related caller. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:32 -07:00
Alan Stern	f579c2b46f	USB Gadget: documentation update This patch (as1102) clarifies two points in the USB Gadget kerneldoc: Request completion callbacks are always made with interrupts disabled; Device controllers may not support STALLing the status stage of a control transfer after the data stage is over. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:27 -07:00
Felipe Balbi	e0d795e4f3	usb: irda: cleanup on ir-usb module General cleanup on ir-usb module. Introduced a common header that could be used also on usb gadget framework. Lot's of cleanups and now using macros from the header file. Signed-off-by: Felipe Balbi <me@felipebalbi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:27 -07:00
David Brownell	40982be52d	usb gadget: composite gadget core Add <linux/usb/composite.h> interfaces for composite gadget drivers, and basic implementation support behind it: - struct usb_function ... groups one or more interfaces into a function managed as one unit within a configuration, to which it's added by usb_add_function(). - struct usb_configuration ... groups one or more such functions into a configuration managed as one unit by a driver, to which it's added by usb_add_config(). These operate at either high or full/low speeds and at a given bMaxPower. - struct usb_composite_driver ... groups one or more such configurations into a gadget driver, which may be registered or unregistered. - struct usb_composite_dev ... a usb_composite_driver manages this; it wraps the usb_gadget exposed by the controller driver. This also includes some basic kerneldoc. How to use it (the short version): provide a usb_composite_driver with a bind() that calls usb_add_config() for each of the needed configurations. The configurations in turn have bind() calls, which will usb_add_function() for each function required. Each function's bind() allocates resources needed to perform its tasks, like endpoints; sometimes configurations will allocate resources too. Separate patches will convert most gadget drivers to this infrastructure. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:01 -07:00
David Brownell	a4c39c41bf	usb gadget: descriptor copying support Define three new descriptor manipulation utilities, for use when setting up functions that may have multiple instances: usb_copy_descriptors() to copy a vector of descriptors usb_free_descriptors() to free the copy usb_find_endpoint() to find a copied version These will be used as follows. Functions will continue to have static tables of descriptors they update, now used as __initdata templates. When a function creates a new instance, it patches those tables with relevant interface and string IDs, plus endpoint assignments. Then it copies those morphed descriptors, associates the copies with the new function instance, and records the endpoint descriptors to use when activating the endpoints. When initialization is done, only the copies remain in memory. The copies are freed on driver removal. This ensures that each instance has descriptors which hold the right instance-specific data. Two instances in the same configuration will obviously never share the same interface IDs or use the same endpoints. Instances in different configurations won't do so either, which means this is slightly less memory-efficient in some cases. This also includes a bugfix to the epautoconf code that shows up with this usage model. It must replace the previous endpoint number when updating the template descriptors, not just mask in a few more bits. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:16:00 -07:00
Adrian Bunk	ea05af61a8	USB: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:55 -07:00
Alan Stern	9da82bd464	USB: implement "soft" unbinding This patch (as1091) changes the way usbcore handles interface unbinding. If the interface's driver supports "soft" unbinding (a new flag in the driver structure) then in-flight URBs are not cancelled and endpoints are not disabled. Instead the driver is allowed to continue communicating with the device (although of course it should stop before its disconnect routine returns). The purpose of this change is to allow drivers to do a clean shutdown when they get unbound from a device that is still plugged in. Killing all the URBs and disabling the endpoints before calling the driver's disconnect method doesn't give the driver any control over what happens, and it can leave devices in indeterminate states. For example, when usb-storage unbinds it doesn't want to stop while in the middle of transmitting a SCSI command. The soft_unbind flag is added because in the past, a number of drivers have experienced problems related to ongoing I/O after their disconnect routine returned. Hence "soft" unbinding is made available only to drivers that claim to support it. The patch also replaces "interface_to_usbdev(intf)" with "udev" in a couple of places, a minor simplification. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:54 -07:00
Greg Kroah-Hartman	1b26da1510	USB: handle pci_name() being const This changes usb_create_hcd() to be able to handle the fact that pci_name() has changed to a constant string. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 15:15:46 -07:00
Linus Torvalds	6d52dcbe56	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpufreq: remove CVS keywords [CPUFREQ] change cpu freq arrays to per_cpu variables	2008-07-21 15:10:37 -07:00
David S. Miller	ebb36a9781	ipv6: __KERNEL__ ifdef struct ipv6_devconf Based upon a report by Olaf Hering. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 13:41:16 -07:00
Arjan van de Ven	6579e57b31	net: Print the module name as part of the watchdog message As suggested by Dave: This patch adds a function to get the driver name from a struct net_device, and consequently uses this in the watchdog timeout handler to print as part of the message. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 13:31:48 -07:00
Linus Torvalds	e89970aa93	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netfilter: nf_conntrack_sctp: fix sparse warnings netfilter: nf_nat_sip: c= is optional for session netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function netfilter: nfnetlink_log: send complete hardware header netfilter: xt_time: fix time's time_mt()'s use of do_div() netfilter: accounting rework: ct_extend + 64bit counters (v4) netlink: add NLA_PUT_BE64 macro netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM hdlcdrv: Fix CRC calculation. Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe." net: In __netif_schedule() use WARN_ON instead of BUG_ON net: Improve simple_tx_hash(). pkt_sched: Remove unused variable skb in dev_deactivate_queue function. sunhme: Remove stop/wake TX queue calls in set-multicast-list handler. ucc_geth: do not touch net queue in adjust_link phylib callback gianfar: do not touch net queue in adjust_link phylib callback atl1: Do not wake queue before queue has been started.	2008-07-21 11:29:52 -07:00
Linus Torvalds	72a73693aa	Merge branch 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits) x86: remove extra calling to get ext cpuid level x86: use setup_clear_cpu_cap() when disabling the lapic KVM: fix exception entry / build bug, on 64-bit x86: add unknown_nmi_panic kernel parameter x86, VisWS: turn into generic arch, eliminate leftover files x86: add ->pre_time_init to x86_quirks x86: extend and use x86_quirks to clean up NUMAQ code x86: introduce x86_quirks x86: improve debug printout: add target bootmem range in early_res_to_bootmem() Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM x86: remove arch_get_ram_range x86: Add a debugfs interface to dump PAT memtype x86: Add a arch directory for x86 under debugfs x86: i386: reduce boot fixmap space i386/xen: add proper unwind annotations to xen_sysenter_target x86: reduce force_mwait visibility x86: reduce forbid_dac's visibility x86: fix two modpost warnings x86: check function status in EDD boot code x86_64: ia32_signal.c: remove signal number conversion ...	2008-07-21 10:34:25 -07:00
Linus Torvalds	b7e6f62fe2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm crypt: add merge dm table: remove merge_bvec sector restriction dm: linear add merge dm: introduce merge_bvec_fn dm snapshot: use per device mempools dm snapshot: fix race during exception creation dm snapshot: track snapshot reads dm mpath: fix test for reinstate_path dm mpath: return parameter error dm io: remove struct padding dm log: make dm_dirty_log init and exit static dm mpath: free path selector on invalid args	2008-07-21 10:30:10 -07:00
Linus Torvalds	8a392625b6	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: (52 commits) md: Protect access to mddev->disks list using RCU md: only count actual openers as access which prevent a 'stop' md: linear: Make array_size sector-based and rename it to array_sectors. md: Make mddev->array_size sector-based. md: Make super_type->rdev_size_change() take sector-based sizes. md: Fix check for overlapping devices. md: Tidy up rdev_size_store a bit: md: Remove some unused macros. md: Turn rdev->sb_offset into a sector-based quantity. md: Make calc_dev_sboffset() return a sector count. md: Replace calc_dev_size() by calc_num_sectors(). md: Make update_size() take the number of sectors. md: Better control of when do_md_stop is allowed to stop the array. md: get_disk_info(): Don't convert between signed and unsigned and back. md: Simplify restart_array(). md: alloc_disk_sb(): Return proper error value. md: Simplify sb_equal(). md: Simplify uuid_equal(). md: sb_equal(): Fix misleading printk. md: Fix a typo in the comment to cmd_match(). ...	2008-07-21 10:29:12 -07:00
Eric Leblond	72961ecf84	netfilter: nfnetlink_log: send complete hardware header This patch adds some fields to NFLOG to be able to send the complete hardware header with all necessary informations. It sends to userspace: * the type of hardware link * the lenght of hardware header * the hardware header Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 10:11:00 -07:00
Krzysztof Piotr Oledzki	584015727a	netfilter: accounting rework: ct_extend + 64bit counters (v4) Initially netfilter has had 64bit counters for conntrack-based accounting, but it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are still required, for example for "connbytes" extension. However, 64bit counters waste a lot of memory and it was not possible to enable/disable it runtime. This patch: - reimplements accounting with respect to the extension infrastructure, - makes one global version of seq_print_acct() instead of two seq_print_counters(), - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n), - makes it possible to enable/disable it at runtime by sysctl or sysfs, - extends counters from 32bit to 64bit, - renames ip_conntrack_counter -> nf_conn_counter, - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT), - set initial accounting enable state based on CONFIG_NF_CT_ACCT - removes buggy IPCT_COUNTER_FILLING event handling. If accounting is enabled newly created connections get additional acct extend. Old connections are not changed as it is not possible to add a ct_extend area to confirmed conntrack. Accounting is performed for all connections with acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct". Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-21 10:10:58 -07:00
Ingo Molnar	eb6a12c242	Merge branch 'linus' into cpus4096-for-linus Conflicts: net/sunrpc/svc.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 17:19:50 +02:00
Ingo Molnar	acee709cab	Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus	2008-07-21 16:37:17 +02:00
Milan Broz	f6fccb1213	dm: introduce merge_bvec_fn Introduce a bvec merge function for device mapper devices for dynamic size restrictions. This code ensures the requested biovec lies within a single target and then calls a target-specific function to check against any constraints imposed by underlying devices. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-07-21 12:00:37 +01:00
NeilBrown	4b80991c6c	md: Protect access to mddev->disks list using RCU All modifications and most access to the mddev->disks list are made under the reconfig_mutex lock. However there are three places where the list is walked without any locking. If a reconfig happens at this time, havoc (and oops) can ensue. So use RCU to protect these accesses: - wrap them in rcu_read_{,un}lock() - use list_for_each_entry_rcu - add to the list with list_add_rcu - delete from the list with list_del_rcu - delay the 'free' with call_rcu rather than schedule_work Note that export_rdev did a list_del_init on this list. In almost all cases the entry was not in the list anymore so it was a no-op and so safe. It is no longer safe as after list_del_rcu we may not touch the list_head. An audit shows that export_rdev is called: - after unbind_rdev_from_array, in which case the delete has already been done, - after bind_rdev_to_array fails, in which case the delete isn't needed. - before the device has been put on a list at all (e.g. in add_new_disk where reading the superblock fails). - and in autorun devices after a failure when the device is on a different list. So remove the list_del_init call from export_rdev, and add it back immediately before the called to export_rdev for that last case. Note also that ->same_set is sometimes used for lists other than mddev->list (e.g. candidates). In these cases rcu is not needed. Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
NeilBrown	f2ea68cf42	md: only count actual openers as access which prevent a 'stop' Open isn't the only thing that increments ->active. e.g. reading /proc/mdstat will increment it briefly. So to avoid false positives in testing for concurrent access, introduce a new counter that counts just the number of times the md device it open. Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
Andre Noll	d6e2215052	md: linear: Make array_size sector-based and rename it to array_sectors. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
Andre Noll	f233ea5c9e	md: Make mddev->array_size sector-based. This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:22 +10:00
Dmitry Torokhov	908cf4b925	Merge master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 into next	2008-07-21 00:55:14 -04:00
Linus Torvalds	14b395e35d	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...	2008-07-20 21:21:46 -07:00
Linus Torvalds	ae0645a451	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: let asic3 use mem resource instead of bus_shift mfd: remove DS1WM register definitions from asic3.h mfd: add ASIC3_CONFIG_GPIO templates mfd: fix the asic3 irq demux code mfd: asic3 should depend on gpiolib mfd: fix asic3 config array initialisation mfd: move asic3 probe functions into __init section mfd: Use uppercase only for asic3 macros and defines mfd: use dev_* macros for asic3 debugging mfd: New asic3 gpio configuration code mfd: asic3 children platform data removal mfd: asic3 gpiolib support	2008-07-20 21:16:27 -07:00
Linus Torvalds	f894d18380	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (277 commits) V4L/DVB (8415): gspca: Infinite loop in i2c_w() of etoms. V4L/DVB (8414): videodev/cx18: fix get_index bug and error-handling lock-ups V4L/DVB (8411): videobuf-dma-contig.c: fix 64-bit build for pre-2.6.24 kernels V4L/DVB (8410): sh_mobile_ceu_camera: fix 64-bit compiler warnings V4L/DVB (8397): video: convert select VIDEO_ZORAN_ZR36060 into depends on V4L/DVB (8396): video: Fix Kbuild dependency for VIDEO_IR_I2C V4L/DVB (8395): saa7134: Fix Kbuild dependency of ir-kbd-i2c V4L/DVB (8394): ir-common: CodingStyle fix: move EXPORT_SYMBOL_GPL to their proper places V4L/DVB (8393): media/video: Fix depencencies for VIDEOBUF V4L/DVB (8392): media/Kconfig: Convert V4L1_COMPAT select into "depends on" V4L/DVB (8390): videodev: add comment and remove magic number. V4L/DVB (8389): videodev: simplify get_index() V4L/DVB (8387): Some cosmetic changes V4L/DVB (8381): ov7670: fix compile warnings V4L/DVB (8380): saa7115: use saa7115_auto instead of saa711x as the autodetect driver name. V4L/DVB (8379): saa7127: Make device detection optional V4L/DVB (8378): cx18: move cx18_av_vbi_setup to av-core.c and rename to cx18_av_std_setup V4L/DVB (8377): ivtv/cx18: ensure the default control values are correct V4L/DVB (8376): cx25840: move cx25840_vbi_setup to core.c and rename to cx25840_std_setup V4L/DVB (8374): gspca: No conflict of 0c45:6011 with the sn9c102 driver. ...	2008-07-20 21:14:42 -07:00
Linus Torvalds	f076ab8d04	Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits) KVM: Adjust smp_call_function_mask() callers to new requirements KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts KVM: x86 emulator: emulate clflush KVM: MMU: improve invalid shadow root page handling KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues KVM: check injected pic irq within valid pic irqs KVM: x86 emulator: Fix HLT instruction KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized KVM: VMX: Add ept_sync_context in flush_tlb KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held x86: KVM guest: make kvm_smp_prepare_boot_cpu() static KVM: SVM: fix suspend/resume support KVM: s390: rename private structures KVM: s390: Set guest storage limit and offset to sane values KVM: Fix memory leak on guest exit KVM: s390: dont allocate dirty bitmap KVM: move slots_lock acquision down to vapic_exit KVM: VMX: Fake emulate Intel perfctr MSRs KVM: VMX: Fix a wrong usage of vmcs_config ...	2008-07-20 21:13:26 -07:00
Linus Torvalds	db6d8c7a40	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits) iucv: Fix bad merging. net_sched: Add size table for qdiscs net_sched: Add accessor function for packet length for qdiscs net_sched: Add qdisc_enqueue wrapper highmem: Export totalhigh_pages. ipv6 mcast: Omit redundant address family checks in ip6_mc_source(). net: Use standard structures for generic socket address structures. ipv6 netns: Make several "global" sysctl variables namespace aware. netns: Use net_eq() to compare net-namespaces for optimization. ipv6: remove unused macros from net/ipv6.h ipv6: remove unused parameter from ip6_ra_control tcp: fix kernel panic with listening_get_next tcp: Remove redundant checks when setting eff_sacks tcp: options clean up tcp: Fix MD5 signatures for non-linear skbs sctp: Update sctp global memory limit allocations. sctp: remove unnecessary byteshifting, calculate directly in big-endian sctp: Allow only 1 listening socket with SO_REUSEADDR sctp: Do not leak memory on multiple listen() calls sctp: Support ipv6only AF_INET6 sockets. ...	2008-07-20 17:43:29 -07:00
Linus Torvalds	f7df406dce	Merge branch 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6 * 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6: configfs: Allow ->make_item() and ->make_group() to return detailed errors. Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."	2008-07-20 17:17:52 -07:00
Alan Cox	44b7d1b37f	tty: add more tty_port fields Move more bits into the tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:38 -07:00
Alan Cox	77451e53e0	cyclades: use tty_port Switch cyclades to use the new tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:38 -07:00
Alan Cox	f8ae476416	stallion: use tty_port Switch the stallion driver to use the tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:37 -07:00
Alan Cox	b02f5ad6a3	istallion: use tty_port Switch istallion to use the new tty_port structure Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:37 -07:00
Alan Cox	b5391e29f4	gs: use tty_port Switch drivers using the old "generic serial" driver to use the tty_port structures Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	4982d6b37a	esp: use tty_port Switch esp to use the new tty_port structures Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	7a4d29f426	tty.h: clean up Coding style clean up and white space tidy Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:36 -07:00
Alan Cox	df4f4dd429	serial: use tty_port Switch the serial_core based drivers to use the new tty_port structure. We can't quite use all of it yet because of the dynamically allocated extras in the serial_core layer. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:35 -07:00
Alan Cox	6f67048cd0	tty: Introduce a tty_port common structure Every tty driver has its own concept of a port structure and because they all differ we cannot extract commonality. Begin fixing this by creating a structure drivers can elect to use so that over time we can push fields into this and create commonality and then introduce common methods. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:35 -07:00
Alan Cox	a352def21a	tty: Ldisc revamp Move the line disciplines towards a conventional ->ops arrangement. For the moment the actual 'tty_ldisc' struct in the tty is kept as part of the tty struct but this can then be changed if it turns out that when it all settles down we want to refcount ldiscs separately to the tty. Pull the ldisc code out of /proc and put it with our ldisc code. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:34 -07:00
Haavard Skinnemoen	6bb0e3a59a	Subject: [PATCH 1/2] serial: Add flush_buffer() operation to uart_ops Serial drivers using DMA (like the atmel_serial driver) tend to get very confused when the xmit buffer is flushed and nobody told them. They also tend to spew a lot of garbage since the DMA engine keeps running after the buffer is flushed and possibly refilled with unrelated data. This patch adds a new flush_buffer operation to the uart_ops struct, along with a call to it from uart_flush_buffer() right after the xmit buffer has been cleared. The driver can implement this in order to syncronize its internal DMA state with the xmit buffer when the buffer is flushed. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-20 17:12:34 -07:00
Philipp Zabel	99cdb0c8c5	mfd: let asic3 use mem resource instead of bus_shift The bus_shift parameter in platform_data is not needed as we can tell the driver with the IOMEM_RESOURCE whether the ASIC is located on a 16bit or 32bit memory bus. The htc-egpio driver uses a more descriptive bus_width parameter, but for drivers where the register map size fixed, we don't even need this. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:44 +02:00
Philipp Zabel	279cac484e	mfd: remove DS1WM register definitions from asic3.h There is a dedicated ds1wm driver, no need to duplicate this information here. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:24 +02:00
Philipp Zabel	4a67b528e0	mfd: add ASIC3_CONFIG_GPIO templates As ASIC3 GPIO alternate function configuration is expected to be similar for several devices, it is convenient to define descriptive macros. This patch is inspired by the PXA MFP configuration, the alternate functions were observed on hx4700 and blueangel. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-07-20 19:56:19 +02:00
Samuel Ortiz	3b8139f8b1	mfd: Use uppercase only for asic3 macros and defines Let's be consistent and use uppercase only, for both macro and defines. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:55:14 +02:00
Samuel Ortiz	3b26bf1722	mfd: New asic3 gpio configuration code The ASIC3 GPIO configuration code is a bit obscure and hardly readable. This patch changes it so that it is now more readable and understandable, by being more explicit. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:55:00 +02:00
Samuel Ortiz	1effe5bc6c	mfd: asic3 children platform data removal Platform devices should be dynamically allocated, and each supported device should have its own platform data. For now we just remove this buggy code. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:54:51 +02:00
Samuel Ortiz	6f2384c4bd	mfd: asic3 gpiolib support ASIC3 is, among other things, a GPIO extender. We should thus have it supporting the current gpiolib API. Signed-off-by: Samuel Ortiz <sameo@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-07-20 19:52:38 +02:00
Jean Delvare	5a367dfb73	V4L/DVB (8246): tvaudio: Stop I2C driver ID abuse The tvaudio driver is using "official" I2C device IDs for internal purpose. There must be some historical reason behind this but anyway, it shouldn't do that. As the stored values are never used, the easiest way to fix the problem is simply to remove them altogether. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:18:38 -03:00
Jean Delvare	be99af6679	V4L/DVB (8245): ovcamchip: Delete stray I2C bus ID I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses this driver ID, so we can remove the reference. As far as I can see, the Cypress FX2 webcam is handled by a different driver (dvb-usb). Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:18:34 -03:00
Hans de Goede	ab8f12cf8e	V4L/DVB (8197): gspca: pac207 frames no more decoded in the subdriver. videodev2: New pixfmt pac207: Remove the specific decoding. main: get_buff_size operation added for the subdriver. Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl> Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:17:02 -03:00
Hans de Goede	54ab92ca05	V4L/DVB (8194): gspca: Fix the format of the low resolution mode of spca561. The low (half) res modes of the spca561 are not spca561 compressed, but are raw bayer, this patches fixes this and adds a PIX_FMT define for the GBRG bayer format used by the spca561 in low res mode. Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl> Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:16:47 -03:00
Jean-Francois Moine	6a7eba24e4	V4L/DVB (8157): gspca: all subdrivers - remaning subdrivers added - remove the decoding helper and some specific frame decodings Signed-off-by: Jean-Francois Moine <moinejf@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:14:49 -03:00
Tobias Lorenz	1d0ba5f378	V4L/DVB (7942): Hardware frequency seek ioctl interface Signed-off-by: Tobias Lorenz <tobias.lorenz@gmx.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-07-20 07:07:12 -03:00
Marcelo Tosatti	34d4cb8fca	KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction Flush the shadow mmu before removing regions to avoid stale entries. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Tan, Li	9ef621d3be	KVM: Support mixed endian machines Currently kvmtrace is not portable. This will prevent from copying a trace file from big-endian target to little-endian workstation for analysis. In the patch, kernel outputs metadata containing a magic number to trace log, and changes 64-bit words to be u64 instead of a pair of u32s. Signed-off-by: Tan Li <li.tan@intel.com> Acked-by: Jerone Young <jyoung5@us.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:32 +03:00
Laurent Vivier	5f94c1741b	KVM: Add coalesced MMIO support (common part) This patch adds all needed structures to coalesce MMIOs. Until an architecture uses it, it is not compiled. Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that can be coalesced: - KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone. It requests one parameter (struct kvm_coalesced_mmio_zone) which defines a memory area where MMIOs can be coalesced until the next switch to user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX. - KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone). The userspace client can check kernel coalesced MMIO availability by asking ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability. The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported, or the page offset where will be stored the ring buffer. The page offset depends on the architecture. After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively to the address of the start of th kvm_run structure. The MMIO ring buffer is defined by the structure kvm_coalesced_mmio_ring. [akio: fix oops during guest shutdown] Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:31 +03:00
Laurent Vivier	92760499d0	KVM: kvm_io_device: extend in_range() to manage len and write attribute Modify member in_range() of structure kvm_io_device to pass length and the type of the I/O (write or read). This modification allows to use kvm_io_device with coalesced MMIO. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:30 +03:00
Avi Kivity	7cc8883074	KVM: Remove decache_vcpus_on_cpu() and related callbacks Obsoleted by the vmx-specific per-cpu list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Mike Travis	80422d3431	cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP * Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify) * Fix a semantic error in CPUMASK_ALLOC * Add a bit of commentry to cpumask.h Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:12 +02:00
Jussi Kivilinna	175f9c1bba	net_sched: Add size table for qdiscs Add size table functions for qdiscs and calculate packet size in qdisc_enqueue(). Based on patch by Patrick McHardy http://marc.info/?l=linux-netdev&m=115201979221729&w=2 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-20 00:08:47 -07:00
YOSHIFUJI Hideaki	230b183921	net: Use standard structures for generic socket address structures. Use sockaddr_storage{} for generic socket address storage and ensures proper alignment. Use sockaddr{} for pointers to omit several casts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-19 22:35:47 -07:00
Adam Langley	4389dded77	tcp: Remove redundant checks when setting eff_sacks Remove redundant checks when setting eff_sacks and make the number of SACKs a compile time constant. Now that the options code knows how many SACK blocks can fit in the header, we don't need to have the SACK code guessing at it. Signed-off-by: Adam Langley <agl@imperialviolet.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-19 00:07:02 -07:00
David S. Miller	3072367300	pkt_sched: Manage qdisc list inside of root qdisc. Idea is from Patrick McHardy. Instead of managing the list of qdiscs on the device level, manage it in the root qdisc of a netdev_queue. This solves all kinds of visibility issues during qdisc destruction. The way to iterate over all qdiscs of a netdev_queue is to visit the netdev_queue->qdisc, and then traverse it's list. The only special case is to ignore builting qdiscs at the root when dumping or doing a qdisc_lookup(). That was not needed previously because builtin qdiscs were not added to the device's qdisc_list. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 22:50:15 -07:00
Matthew Garrett	92c4989092	Input: add switch for dock events Add a SW_DOCK switch to input.h. ACPI docks currently send their docking status as a uevent, but not all docks are ACPI or correspond to a device. In that case, it makes more sense to simply generate an input event on docking or undocking. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-19 00:52:43 -04:00
Mark Brown	5ec461d083	Input: add microphone insert switch definition Add a new switch type to the input API for reporting microphone insertion. This will be used by the ALSA jack reporting API. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-19 00:52:36 -04:00
Patrick McHardy	8913336a7e	packet: add PACKET_RESERVE sockopt Add new sockopt to reserve some headroom in the mmaped ring frames in front of the packet payload. This can be used f.i. when the VLAN header needs to be (re)constructed to avoid moving the entire payload. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 18:05:19 -07:00
venkatesh.pallipadi@intel.com	ae79cdaacb	x86: Add a arch directory for x86 under debugfs Add a directory for x86 arch under debugfs. Can be used to accumulate all x86 specific debugfs files. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 17:22:04 -07:00
Ingo Molnar	a208f37a46	Merge branch 'linus' into x86/x2apic	2008-07-18 22:50:34 +02:00
Mike Travis	77586c2bda	cpumask: Provide a generic set of CPUMASK_ALLOC macros * Provide a generic set of CPUMASK_ALLOC macros patterned after the SCHED_CPUMASK_ALLOC macros. This is used where multiple cpumask_t variables are declared on the stack to reduce the amount of stack space required. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:03:00 +02:00
Mike Travis	65c0118453	cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr * This patch replaces the dangerous lvalue version of cpumask_of_cpu with new cpumask_of_cpu_ptr macros. These are patterned after the node_to_cpumask_ptr macros. In general terms, if there is a cpumask_of_cpu_map[] then a pointer to the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map is provided when there is a large NR_CPUS count, reducing greatly the amount of code generated and stack space used for cpumask_of_cpu(). The pointer to the cpumask_t value is needed for calling set_cpus_allowed_ptr() to reduce the amount of stack space needed to pass the cpumask_t value. If there isn't a cpumask_of_cpu_map[], then a temporary variable is declared and filled in with value from cpumask_of_cpu(cpu) as well as a pointer variable pointing to this temporary variable. Afterwards, the pointer is used to reference the cpumask value. The compiler will optimize out the extra dereference through the pointer as well as the stack space used for the pointer, resulting in identical code. A good example of the orthogonal usages is in net/sunrpc/svc.c: case SVC_POOL_PERCPU: { unsigned int cpu = m->pool_to[pidx]; cpumask_of_cpu_ptr(cpumask, cpu); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, cpumask); return 1; } case SVC_POOL_PERNODE: { unsigned int node = m->pool_to[pidx]; node_to_cpumask_ptr(nodecpumask, node); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, nodecpumask); return 1; } Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:02:57 +02:00
Ingo Molnar	bb2c018b09	Merge branch 'linus' into cpus4096 Conflicts: drivers/acpi/processor_throttling.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:00:54 +02:00
Ingo Molnar	9b610fda0d	Merge branch 'linus' into timers/nohz	2008-07-18 19:53:16 +02:00
Thomas Gleixner	b8f8c3cf0a	nohz: prevent tick stop outside of the idle loop Jack Ren and Eric Miao tracked down the following long standing problem in the NOHZ code: scheduler switch to idle task enable interrupts Window starts here ----> interrupt happens (does not set NEED_RESCHED) irq_exit() stops the tick ----> interrupt happens (does set NEED_RESCHED) return from schedule() cpu_idle(): preempt_disable(); Window ends here The interrupts can happen at any point inside the race window. The first interrupt stops the tick, the second one causes the scheduler to rerun and switch away from idle again and we end up with the tick disabled. The fact that it needs two interrupts where the first one does not set NEED_RESCHED and the second one does made the bug obscure and extremly hard to reproduce and analyse. Kudos to Jack and Eric. Solution: Limit the NOHZ functionality to the idle loop to make sure that we can not run into such a situation ever again. cpu_idle() { preempt_disable(); while(1) { tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we are in the idle loop while (!need_resched()) halt(); tick_nohz_restart_sched_tick(); <- disables NOHZ mode preempt_enable_no_resched(); schedule(); preempt_disable(); } } In hindsight we should have done this forever, but ... /me grabs a large brown paperbag. Debugged-by: Jack Ren <jack.ren@marvell.com>, Debugged-by: eric miao <eric.y.miao@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-18 18:10:28 +02:00
Lai Jiangshan	5127bed588	rcu classic: new algorithm for callbacks-processing(v2) This is v2, it's a little deference from v1 that I had send to lkml. use ACCESS_ONCE use rcu_batch_after/rcu_batch_before for batch # comparison. rcutorture test result: (hotplugs: do cpu-online/offline once per second) No CONFIG_NO_HZ: OK, 12hours No CONFIG_NO_HZ, hotplugs: OK, 12hours CONFIG_NO_HZ=y: OK, 24hours CONFIG_NO_HZ=y, hotplugs: Failed. (Failed also without my patch applied, exactly the same bug occurred, http://lkml.org/lkml/2008/7/3/24) v1's email thread: http://lkml.org/lkml/2008/6/2/539 v1's description: The code/algorithm of the implement of current callbacks-processing is very efficient and technical. But when I studied it and I found a disadvantage: In multi-CPU systems, when a new RCU callback is being queued(call_rcu[_bh]), this callback will be invoked after the grace period for the batch with batch number = rcp->cur+2 has completed very very likely in current implement. Actually, this callback can be invoked after the grace period for the batch with batch number = rcp->cur+1 has completed. The delay of invocation means that latency of synchronize_rcu() is extended. But more important thing is that the callbacks usually free memory, and these works are delayed too! it's necessary for reclaimer to free memory as soon as possible when left memory is few. A very simple way can solve this problem: a field(struct rcu_head::batch) is added to record the batch number for the RCU callback. And when a new RCU callback is being queued, we determine the batch number for this callback(head->batch = rcp->cur+1) and we move this callback to rdp->donelist if we find that head->batch <= rcp->completed when we process callbacks. This simple way reduces the wait time for invocation a lot. (about 2.5Grace Period -> 1.5Grace Period in average in multi-CPU systems) This is my algorithm. But I do not add any field for struct rcu_head in my implement. We just need to memorize the last 2 batches and their batch number, because these 2 batches include all entries that for whom the grace period hasn't completed. So we use a special linked-list rather than add a field. Please see the comment of struct rcu_data. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 16:07:33 +02:00
Lai Jiangshan	3cac97cbb1	rcu classic: simplify the next pending batch use a batch number(rcp->pending) instead of a flag(rcp->next_pending) rcu_start_batch() need to change this flag, so mb()s is needed for memory-access safe. but(after this patch applied) rcu_start_batch() do not change this batch number(rcp->pending), rcp->pending is managed by __rcu_process_callbacks only, and troublesome mb()s are eliminated. And codes look simpler and clearer. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Gautham Shenoy <ego@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 16:07:32 +02:00
Ingo Molnar	1b427c153a	sched: fix build error, provide partition_sched_domains() unconditionally provide an empty partition_sched_domains() definition for the UP case: include/linux/cpuset.h: In function ‘rebuild_sched_domains': include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains' Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 14:02:46 +02:00
Max Krasnyansky	e761b77252	cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2) This is based on Linus' idea of creating cpu_active_map that prevents scheduler load balancer from migrating tasks to the cpu that is going down. It allows us to simplify domain management code and avoid unecessary domain rebuilds during cpu hotplug event handling. Please ignore the cpusets part for now. It needs some more work in order to avoid crazy lock nesting. Although I did simplfy and unify domain reinitialization logic. We now simply call partition_sched_domains() in all the cases. This means that we're using exact same code paths as in cpusets case and hence the test below cover cpusets too. Cpuset changes to make rebuild_sched_domains() callable from various contexts are in the separate patch (right next after this one). This not only boots but also easily handles while true; do make clean; make -j 8; done and while true; do on-off-cpu 1; done at the same time. (on-off-cpu 1 simple does echo 0/1 > /sys/.../cpu1/online thing). Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing this on right now in gnome-terminal and things are moving just fine. Also this is running with most of the debug features enabled (lockdep, mutex, etc) no BUG_ONs or lockdep complaints so far. I believe I addressed all of the Dmitry's comments for original Linus' version. I changed both fair and rt balancer to mask out non-active cpus. And replaced cpu_is_offline() with !cpu_active() in the main scheduler code where it made sense (to me). Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Gregory Haskins <ghaskins@novell.com> Cc: dmitry.adamushko@gmail.com Cc: pj@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 13:22:25 +02:00
Pavel Emelyanov	b6fcbdb4f2	proc: consolidate per-net single-release callers They are symmetrical to single_open ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 04:07:44 -07:00
Pavel Emelyanov	de05c557b2	proc: consolidate per-net single_open callers There are already 7 of them - time to kill some duplicate code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-18 04:07:21 -07:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
David S. Miller	8387400092	pkt_sched: Kill netdev_queue lock. We can simply use the qdisc->q.lock for all of the qdisc tree synchronization. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:30 -07:00
David S. Miller	ead81cc5fc	netdevice: Move qdisc_list back into net_device proper. And give it it's own lock. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:26 -07:00
David S. Miller	37437bb2e1	pkt_sched: Schedule qdiscs instead of netdev_queue. When we have shared qdiscs, packets come out of the qdiscs for multiple transmit queues. Therefore it doesn't make any sense to schedule the transmit queue when logically we cannot know ahead of time the TX queue of the SKB that the qdisc->dequeue() will give us. Just for sanity I added a BUG check to make sure we never get into a state where the noop_qdisc is scheduled. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:20 -07:00
David S. Miller	e2627c8c22	pkt_sched: Make QDISC_RUNNING a qdisc state. Currently it is associated with a netdev_queue, but when we have qdisc sharing that no longer makes any sense. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:18 -07:00
David S. Miller	d3b753db7c	pkt_sched: Move gso_skb into Qdisc. We liberate any dangling gso_skb during qdisc destruction. It really only matters for the root qdisc. But when qdiscs can be shared by multiple netdev_queue objects, we can't have the gso_skb in the netdev_queue any more. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:18 -07:00
David S. Miller	92831bc395	netdev: Kill plain netif_schedule() No more users. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:16 -07:00
David S. Miller	eae792b722	netdev: Add netdev->select_queue() method. Devices or device layers can set this to control the queue selection performed by dev_pick_tx(). This function runs under RCU protection, which allows overriding functions to have some way of synchronizing with things like dynamic ->real_num_tx_queues adjustments. This makes the spinlock prefetch in dev_queue_xmit() a little bit less effective, but that's the price right now for correctness. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:10 -07:00
David S. Miller	e3c50d5d25	netdev: netdev_priv() can now be sane again. The private area of a netdev is now at a fixed offset once more. Unfortunately, some assumptions that netdev_priv() == netdev->priv crept back into the tree. In particular this happened in the loopback driver. Make it use netdev->ml_priv. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:09 -07:00
David S. Miller	6b0fb1261a	netdev: Kill struct net_device_subqueue and netdev->egress_subqueue* No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:08 -07:00
David S. Miller	fd2ea0a79f	net: Use queue aware tests throughout. This effectively "flips the switch" by making the core networking and multiqueue-aware drivers use the new TX multiqueue structures. Non-multiqueue drivers need no changes. The interfaces they use such as netif_stop_queue() degenerate into an operation on TX queue zero. So everything "just works" for them. Code that really wants to do "X" to all TX queues now invokes a routine that does so, such as netif_tx_wake_all_queues(), netif_tx_stop_all_queues(), etc. pktgen and netpoll required a little bit more surgery than the others. In particular the pktgen changes, whilst functional, could be largely improved. The initial check in pktgen_xmit() will sometimes check the wrong queue, which is mostly harmless. The thing to do is probably to invoke fill_packet() earlier. The bulk of the netpoll changes is to make the code operate solely on the TX queue indicated by by the SKB queue mapping. Setting of the SKB queue mapping is entirely confined inside of net/core/dev.c:dev_pick_tx(). If we end up needing any kind of special semantics (drops, for example) it will be implemented here. Finally, we now have a "real_num_tx_queues" which is where the driver indicates how many TX queues are actually active. With IGB changes from Jeff Kirsher. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:07 -07:00
David S. Miller	1d8ae3fdeb	pkt_sched: Remove RR scheduler. This actually fixes a bug added by the RR scheduler changes. The ->bands and ->prio2band parameters were being set outside of the sch_tree_lock() and thus could result in strange behavior and inconsistencies. It might be possible, in the new design (where there will be one qdisc per device TX queue) to allow similar functionality via a TX hash algorithm for RR but I really see no reason to export this aspect of how these multiqueue cards actually implement the scheduling of the the individual DMA TX rings and the single physical MAC/PHY port. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:04 -07:00
David S. Miller	09e83b5d7d	netdev: Kill NETIF_F_MULTI_QUEUE. There is no need for a feature bit for something that can be tested by simply checking the TX queue count. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:03 -07:00
David S. Miller	e8a0464cc9	netdev: Allocate multiple queues for TX. alloc_netdev_mq() now allocates an array of netdev_queue structures for TX, based upon the queue_count argument. Furthermore, all accesses to the TX queues are now vectored through the netdev_get_tx_queue() and netdev_for_each_tx_queue() interfaces. This makes it easy to grep the tree for all things that want to get to a TX queue of a net device. Problem spots which are not really multiqueue aware yet, and only work with one queue, can easily be spotted by grepping for all netdev_get_tx_queue() calls that pass in a zero index. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:00 -07:00
Dan Williams	0839875e0c	async_tx: make async_tx_test_ack a boolean routine Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:56 -07:00
Dan Williams	3dce017137	async_tx: remove depend_tx from async_tx_sync_epilog All callers of async_tx_sync_epilog have called async_tx_quiesce on the depend_tx, so async_tx_sync_epilog need only call the callback to complete the operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:55 -07:00
Dan Williams	d2c52b7983	async_tx: export async_tx_quiesce Replace open coded "wait and acknowledge" instances with async_tx_quiesce. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-17 17:59:55 -07:00
Joel Becker	a6795e9ebb	configfs: Allow ->make_item() and ->make_group() to return detailed errors. The configfs operations ->make_item() and ->make_group() currently return a new item/group. A return of NULL signifies an error. Because of this, -ENOMEM is the only return code bubbled up the stack. Multiple folks have requested the ability to return specific error codes when these operations fail. This patch adds that ability by changing the ->make_item/group() ops to return ERR_PTR() values. These errors are bubbled up appropriately. NULL returns are changed to -ENOMEM for compatibility. Also updated are the in-kernel users of configfs. This is a rework of reverted commit `11c3b79218`. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 15:21:29 -07:00
Joel Becker	f89ab8619e	Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors." This reverts commit `11c3b79218`. The code will move to PTR_ERR(). Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-17 14:53:48 -07:00
Linus Torvalds	5b664cb235	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: [PATCH] ocfs2: fix oops in mmap_truncate testing configfs: call drop_link() to cleanup after create_link() failure configfs: Allow ->make_item() and ->make_group() to return detailed errors. configfs: Fix failing mkdir() making racing rmdir() fail configfs: Fix deadlock with racing rmdir() and rename() configfs: Make configfs_new_dirent() return error code instead of NULL configfs: Protect configfs_dirent s_links list mutations configfs: Introduce configfs_dirent_lock ocfs2: Don't snprintf() without a format. ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs ocfs2/net: Silence build warnings on sparc64 ocfs2: Handle error during journal load ocfs2: Silence an error message in ocfs2_file_aio_read() ocfs2: use simple_read_from_buffer() ocfs2: fix printk format warnings with OCFS2_FS_STATS=n [PATCH 2/2] ocfs2: Instrument fs cluster locks [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option	2008-07-17 10:55:51 -07:00
Roland McGrath	f470021adb	ptrace children revamp ptrace no longer fiddles with the children/sibling links, and the old ptrace_children list is gone. Now ptrace, whether of one's own children or another's via PTRACE_ATTACH, just uses the new ptraced list instead. There should be no user-visible difference that matters. The only change is the order in which do_wait() sees multiple stopped children and stopped ptrace attachees. Since wait_task_stopped() was changed earlier so it no longer reorders the children list, we already know this won't cause any new problems. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-16 18:02:33 -07:00
Linus Torvalds	dc7c65db28	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits) Revert "x86/PCI: ACPI based PCI gap calculation" PCI: remove unnecessary volatile in PCIe hotplug struct controller x86/PCI: ACPI based PCI gap calculation PCI: include linux/pm_wakeup.h for device_set_wakeup_capable PCI PM: Fix pci_prepare_to_sleep x86/PCI: Fix PCI config space for domains > 0 Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n PCI: Simplify PCI device PM code PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep PCI ACPI: Rework PCI handling of wake-up ACPI: Introduce new device wakeup flag 'prepared' ACPI: Introduce acpi_device_sleep_wake function PCI: rework pci_set_power_state function to call platform first PCI: Introduce platform_pci_power_manageable function ACPI: Introduce acpi_bus_power_manageable function PCI: make pci_name use dev_name PCI: handle pci_name() being const PCI: add stub for pci_set_consistent_dma_mask() PCI: remove unused arch pcibios_update_resource() functions PCI: fix pci_setup_device()'s sprinting into a const buffer ... Fixed up conflicts in various files (arch/x86/kernel/setup_64.c, arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c, drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86 and ACPI updates manually.	2008-07-16 17:25:46 -07:00
Kumar Gala	b219108cba	fs_enet: Remove !CONFIG_PPC_CPM_NEW_BINDING code Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING. Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h) Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:49 -05:00
Scott Wood	d87eb12785	gianfar: Add magic packet and suspend/resume support. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:47 -05:00
Scott Wood	d49747bdfb	powerpc/mpc83xx: Power Management support Basic PM support for 83xx. Standby is implemented as sleep. Suspend-to-RAM is implemented as "deep sleep" (with the processor turned off) on 831x. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-16 17:57:30 -05:00
Linus Torvalds	8a0ca91e1d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (68 commits) sdio_uart: Fix SDIO break control to now return success or an error mmc: host driver for Ricoh Bay1Controllers sdio: sdio_io.c Fix sparse warnings sdio: fix the use of hard coded timeout value. mmc: OLPC: update vdd/powerup quirk comment mmc: fix spares errors of sdhci.c mmc: remove multiwrite capability wbsd: fix bad dma_addr_t conversion atmel-mci: Driver for Atmel on-chip MMC controllers mmc: fix sdio_io sparse errors mmc: wbsd.c fix shadowing of 'dma' variable MMC: S3C24XX: Refuse incorrectly aligned transfers MMC: S3C24XX: Add maintainer entry MMC: S3C24XX: Update error debugging. MMC: S3C24XX: Add media presence test to request handling. MMC: S3C24XX: Fix use of msecs where jiffies are needed MMC: S3C24XX: Add MODULE_ALIAS() entries for the platform devices MMC: S3C24XX: Fix s3c2410_dma_request() return code check. MMC: S3C24XX: Allow card-detect on non-IRQ capable pin MMC: S3C24XX: Ensure host->mrq->data is valid ... Manually fixed up bogus executable bits on drivers/mmc/core/sdio_io.c and include/linux/mmc/sdio_func.h when merging.	2008-07-16 15:17:52 -07:00
Linus Torvalds	9c1be0c471	Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6 * 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: include to compilation UBIFS: add new flash file system UBIFS: add brief documentation MAINTAINERS: add UBIFS section do_mounts: allow UBI root device name VFS: export sync_sb_inodes VFS: move inode_lock into sync_sb_inodes	2008-07-16 15:02:57 -07:00
Linus Torvalds	42fdd144a4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) IDE: Report errors during drive reset back to user space Update documentation of HDIO_DRIVE_RESET ioctl IDE: Remove unused code IDE: Fix HDIO_DRIVE_RESET handling hd.c: remove the #include <linux/mc146818rtc.h> update the BLK_DEV_HD help text move ide/legacy/hd.c to drivers/block/ ide/legacy/hd.c: use late_initcall() remove BLK_DEV_HD_ONLY ide: endian annotations in ide-floppy.c ide-floppy: zero out the whole struct ide_atapi_pc on init ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path ide-cd: move request prep from cdrom_start_rw_cont to rq issue path ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path ide-cd: fold cdrom_start_seek into ide_cd_do_request ide-cd: simplify request issuing path ide-cd: mv ide_do_rw_cdrom ide_cd_do_request ide-cd: cdrom_start_seek: remove unused argument block ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block ...	2008-07-16 14:53:54 -07:00
Linus Torvalds	4314652bb4	Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6 * 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits) Fix FADT parsing Add the ability to reset the machine using the RESET_REG in ACPI's FADT table. ACPI: use dev_printk when possible PNPACPI: add support for HP vendor-specific CCSR descriptors PNP: avoid legacy IDE IRQs PNP: convert resource options to single linked list ISAPNP: handle independent options following dependent ones PNP: remove extra 0x100 bit from option priority PNP: support optional IRQ resources PNP: rename pnp_register__resource() local variables PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR PNP: centralize resource option allocations PNP: remove redundant pnp_can_configure() check PNP: make resource assignment functions return 0 (success) or -EBUSY (failure) PNP: in debug resource dump, make empty list obvious PNP: improve resource assignment debug PNP: increase I/O port & memory option address sizes PNP: introduce pnp_irq_mask_t typedef PNP: make resource option structures private to PNP subsystem PNP: define PNP-specific IORESOURCE_IO_ flags alongside IRQ, DMA, MEM ...	2008-07-16 14:52:12 -07:00
Martin K. Petersen	d442cc44c0	block: Trivial fix for blk_integrity_rq() Fail integrity check gracefully when request does not have a bio attached (BLOCK_PC). Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-16 14:51:41 -07:00
Linus Torvalds	8df1b049bc	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits) NFSv4: Remove BKL from the nfsv4 state recovery SUNRPC: Remove the BKL from the callback functions NFS: Remove BKL from the readdir code NFS: Remove BKL from the symlink code NFS: Remove BKL from the sillydelete operations NFS: Remove the BKL from the rename, rmdir and unlink operations NFS: Remove BKL from NFS lookup code NFS: Remove the BKL from nfs_link() NFS: Remove the BKL from the inode creation operations NFS: Remove BKL usage from open() NFS: Remove BKL usage from the write path NFS: Remove the BKL from the permission checking code NFS: Remove attribute update related BKL references NFS: Remove BKL requirement from attribute updates NFS: Protect inode->i_nlink updates using inode->i_lock nfs: set correct fl_len in nlmclnt_test() SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier SUNRPC: None of rpcb_create's callers wants a privileged source port SUNRPC: Introduce a specific rpcb_create for contacting localhost ...	2008-07-16 14:49:49 -07:00
Bjorn Helgaas	1f32ca31e7	PNP: convert resource options to single linked list ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of a device, i.e., the possibilities an OS bus driver has when it assigns I/O port, MMIO, and other resources to the device. PNP used to maintain this "possible resource setting" information in one independent option structure and a list of dependent option structures for each device. Each of these option structures had lists of I/O, memory, IRQ, and DMA resources, for example: dev independent options ind-io0 -> ind-io1 ... ind-mem0 -> ind-mem1 ... ... dependent option set 0 dep0-io0 -> dep0-io1 ... dep0-mem0 -> dep0-mem1 ... ... dependent option set 1 dep1-io0 -> dep1-io1 ... dep1-mem0 -> dep1-mem1 ... ... ... This data structure was designed for ISAPNP, where the OS configures device resource settings by writing directly to configuration registers. The OS can write the registers in arbitrary order much like it writes PCI BARs. However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces that perform device configuration, and it is important to pass the desired settings to those interfaces in the correct order. The OS learns the correct order by using firmware interfaces that return the "current resource settings" and "possible resource settings," but the option structures above doesn't store the ordering information. This patch replaces the independent and dependent lists with a single list of options. For example, a device might have possible resource settings like this: dev options ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ... All the possible settings are in the same list, in the order they come from the firmware "possible resource settings" list. Each entry is tagged with an independent/dependent flag. Dependent entries also have a "set number" and an optional priority value. All dependent entries must be assigned from the same set. For example, the OS can use all the entries from dependent set 0, or all the entries from dependent set 1, but it cannot mix entries from set 0 with entries from set 1. Prior to this patch PNP didn't keep track of the order of this list, and it assigned all independent options first, then all dependent ones. Using the example above, that resulted in a "desired configuration" list like this: ind->io0 -> ind->io1 -> depN-io0 ... instead of the list the firmware expects, which looks like this: ind->io0 -> depN-io0 -> ind-io1 ... Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:07 +02:00
Bjorn Helgaas	d5ebde6ef5	PNP: support optional IRQ resources This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when assigning resources to a device. If the flag is set and we are unable to assign an IRQ to the device, we can leave the IRQ disabled but allow the overall resource allocation to succeed. Some devices request an IRQ, but can run without an IRQ (possibly with degraded performance). This flag lets us run the device without the IRQ instead of just leaving the device disabled. This is a reimplementation of this previous change by Rene Herman <rene.herman@gmail.com>: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43 I reimplemented this for two reasons: - to prepare for converting all resource options into a single linked list, as opposed to the per-resource-type lists we have now, and - to preserve the order and number of resource options. In PNPBIOS and ACPI, we configure a device by giving firmware a list of resource assignments. It is important that this list has exactly the same number of resources, in the same order, as the "template" list we got from the firmware in the first place. The problem of a sound card MPU401 being left disabled for want of an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:07 +02:00
Bjorn Helgaas	a1802c4295	PNP: make resource option structures private to PNP subsystem Nothing outside the PNP subsystem should need access to a device's resource options, so this patch moves the option structure declarations to a private header file. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	08c9f262f2	PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED in a private header file, but put those flags in struct resource.flags fields. Better to make them IORESOURCE_IO_* flags like the existing IRQ, DMA, and MEM flags. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	57fd51a8be	PNP: add pnp_possible_config() -- can a device could be configured this way? As part of a heuristic to identify modem devices, 8250_pnp.c checks to see whether a device can be configured at any of the legacy COM port addresses. This patch moves the code that traverses the PNP "possible resource options" from 8250_pnp.c to the PNP subsystem. This encapsulation is important because a future patch will change the implementation of those resource options. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:06 +02:00
Bjorn Helgaas	aee3ad815d	PNP: replace pnp_resource_table with dynamically allocated resources PNP used to have a fixed-size pnp_resource_table for tracking the resources used by a device. This table often overflowed, so we've had to increase the table size, which wastes memory because most devices have very few resources. This patch replaces the table with a linked list of resources where the entries are allocated on demand. This removes messages like these: pnpacpi: exceeded the max number of IO resources 00:01: too many I/O port resources References: http://bugzilla.kernel.org/show_bug.cgi?id=9535 http://bugzilla.kernel.org/show_bug.cgi?id=9740 http://lkml.org/lkml/2007/11/30/110 This patch also changes the way PNP uses the IORESOURCE_UNSET, IORESOURCE_AUTO, and IORESOURCE_DISABLED flags. Prior to this patch, the pnp_resource_table entries used the flags like this: IORESOURCE_UNSET This table entry is unused and available for use. When this flag is set, we shouldn't look at anything else in the resource structure. This flag is set when a resource table entry is initialized. IORESOURCE_AUTO This resource was assigned automatically by pnp_assign_{io,mem,etc}(). This flag is set when a resource table entry is initialized and cleared whenever we discover a resource setting by reading an ISAPNP config register, parsing a PNPBIOS resource data stream, parsing an ACPI _CRS list, or interpreting a sysfs "set" command. Resources marked IORESOURCE_AUTO are reinitialized and marked as IORESOURCE_UNSET by pnp_clean_resource_table() in these cases: - before we attempt to assign resources automatically, - if we fail to assign resources automatically, - after disabling a device IORESOURCE_DISABLED Set by pnp_assign_{io,mem,etc}() when automatic assignment fails. Also set by PNPBIOS and PNPACPI for: - invalid IRQs or GSI registration failures - invalid DMA channels - I/O ports above 0x10000 - mem ranges with negative length After this patch, there is no pnp_resource_table, and the resource list entries use the flags like this: IORESOURCE_UNSET This flag is no longer used in PNP. Instead of keeping IORESOURCE_UNSET entries in the resource list, we remove entries from the list and free them. IORESOURCE_AUTO No change in meaning: it still means the resource was assigned automatically by pnp_assign_{port,mem,etc}(), but these functions now set the bit explicitly. We still "clean" a device's resource list in the same places, but rather than reinitializing IORESOURCE_AUTO entries, we just remove them from the list. Note that IORESOURCE_AUTO entries are always at the end of the list, so removing them doesn't reorder other list entries. This is because non-IORESOURCE_AUTO entries are added by the ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the sysfs "set" command. In each of these cases, we completely free the resource list first. IORESOURCE_DISABLED In addition to the cases where we used to set this flag, ISAPNP now adds an IORESOURCE_DISABLED resource when it reads a configuration register with a "disabled" value. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>	2008-07-16 23:27:05 +02:00
Bjorn Helgaas	20bfdbba72	PNP: make pnp_{port,mem,etc}_start(), et al work for invalid resources Some callers use pnp_port_start() and similar functions without making sure the resource is valid. This patch makes us fall back to returning the initial values if the resource is not valid or not even present. This mostly preserves the previous behavior, where we would just return the initial values set by pnp_init_resource_table(). The original 2.6.25 code didn't range-check the "bar", so it would return garbage if the bar exceeded the table size. This code returns sensible values instead. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>	2008-07-16 23:27:05 +02:00
Rafael J. Wysocki	ebb12db51f	Freezer: Introduce PF_FREEZER_NOSIG The freezer currently attempts to distinguish kernel threads from user space tasks by checking if their mm pointer is unset and it does not send fake signals to kernel threads. However, there are kernel threads, mostly related to networking, that behave like user space tasks and may want to be sent a fake signal to be frozen. Introduce the new process flag PF_FREEZER_NOSIG that will be set by default for all kernel threads and make the freezer only send fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide the set_freezable_with_signal() function to be called by the kernel threads that want to be sent a fake signal for freezing. This patch should not change the freezer's observable behavior. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>	2008-07-16 23:27:03 +02:00
Elias Oltmanns	3ef5eb424e	IDE: Remove unused code Remove some code which has been made obsolete and hasn't worked properly before anyway. Part of the infrastructure may be reintroduced in a follow up patch to implement a working command aborting facility. Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk> Cc: "Randy Dunlap" <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:48 +02:00
Elias Oltmanns	79e36a9f54	IDE: Fix HDIO_DRIVE_RESET handling Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken in various ways. Most importantly, it is treated as an out of band request in an illegal way which may very likely lead to system lock ups. Use the drive's request queue to avoid this problem (and fix a locking issue for free along the way). Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk> Cc: "Randy Dunlap" <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:48 +02:00
Bartlomiej Zolnierkiewicz	e6d95bd149	ide: ->port_init_devs -> ->init_dev Change ->port_init_devs method to take 'ide_drive_t ' as an argument instead of 'ide_hwif_t ' and rename it to ->init_dev. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:42 +02:00
Bartlomiej Zolnierkiewicz	c56c5648a3	ide: set hwif->dev in ide_init_port_hw() (take 2) * Add 'parent' field to hw_regs_t for optional parent device pointer (needed by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw(). * Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly. v2: * Update scc_pata.c. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz	63b51c6d1d	ide: make ide_hwifs[] static Move ide_hwifs[] from ide.c to ide-probe.c and make it static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz	9ad5409375	ide: move PIO blacklist to ide-pio-blacklist.c Move PIO blacklist to ide-pio-blacklist.c. While at it: - fix comment - fix whitespace damage There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	3e153cfb5e	ide: remove no longer used ide_pio_timings[] Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	c9d6c1a237	ide: move ide_pio_cycle_time() to ide-timings.c All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS so move the function from ide-lib.c to ide-timings.c. While at it: - convert ide_pio_cycle_time() to use ide_timing_find_mode() - cleanup ide_pio_cycle_time() a bit There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz	f06ab3402a	ide: convert ide-timing.h to ide-timings.c library (take 2) * Don't include ide-timing.h in cs5535 and sis5513 host drivers (they don't need it currently). * Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS config option to be selected by host drivers using the library. While at it: - fix ide_timing_find_mode() placement v2: * Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>) There should be no functional changes caused by this patch. Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:37 +02:00
Bartlomiej Zolnierkiewicz	3be53f3f21	ide: move some bits from ide-timing.h to <linux/ide.h> Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h> from drivers/ide/ide-timing.h. While at it: - use u8/u16 instead of short for struct ide_timing fields - use enum for IDE_TIMING_* There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-16 20:33:36 +02:00
Linus Torvalds	45158894d4	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (249 commits) powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES powerpc: Fix a build problem on ppc32 with new DMA_ATTRs ibm_newemac: Add MII mode support to the EMAC RGMII bridge. powerpc: Don't spin on sync instruction at boot time powerpc: Add VSX load/store alignment exception handler powerpc: fix giveup_vsx to save registers correctly powerpc: support for latencytop powerpc: Remove unnecessary condition when sanity-checking WIMG bits powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT powerpc: Add driver for Barrier Synchronization Register powerpc: mman.h export fixups powerpc/fsl: update crypto node definition and device tree instances powerpc/fsl: Refactor device bindings powerpc/85xx: Minor fixes for 85xxds and 8536ds board. powerpc: Add 82xx/83xx/86xx to 6xx Multiplatform powerpc/85xx: publish of device for cds platforms powerpc/booke: don't reinitialize time base powerpc/86xx: Refactor pic init powerpc/CPM: Add i2c pins to dts and board setup cpm_uart: Support uart_wait_until_sent() ...	2008-07-15 19:04:58 -07:00
Linus Torvalds	89a93f2f48	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits) [SCSI] scsi_dh: fix kconfig related build errors [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h [SCSI] make struct scsi_{host,target}_type static [SCSI] fix locking in host use of blk_plug_device() [SCSI] zfcp: Cleanup external header file [SCSI] zfcp: Cleanup code in zfcp_erp.c [SCSI] zfcp: zfcp_fsf cleanup. [SCSI] zfcp: consolidate sysfs things into one file. [SCSI] zfcp: Cleanup of code in zfcp_aux.c [SCSI] zfcp: Cleanup of code in zfcp_scsi.c [SCSI] zfcp: Move status accessors from zfcp to SCSI include file. [SCSI] zfcp: Small QDIO cleanups [SCSI] zfcp: Adapter reopen for large number of unsolicited status [SCSI] zfcp: Fix error checking for ELS ADISC requests [SCSI] zfcp: wait until adapter is finished with ERP during auto-port [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver [SCSI] sg: Add target reset support [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC [SCSI] sd: Move scsi_disk() accessor function to sd.h ...	2008-07-15 18:58:04 -07:00
Benjamin Herrenschmidt	84c3d4aaec	Merge commit 'origin/master' Manual merge of: arch/powerpc/Kconfig arch/powerpc/kernel/stacktrace.c arch/powerpc/mm/slice.c arch/ppc/kernel/smp.c	2008-07-16 11:07:59 +10:00
Trond Myklebust	e89e896d31	Merge branch 'devel' into next Conflicts: fs/nfs/file.c Fix up the conflict with Jon Corbet's bkl-removal tree	2008-07-15 18:34:16 -04:00
Ingo Molnar	82638844d9	Merge branch 'linus' into cpus4096 Conflicts: arch/x86/xen/smp.c kernel/sched_rt.c net/iucv/iucv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 00:29:07 +02:00
Chuck Lever	c2e1b09ff2	SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon Introduce a new API to register RPC services on IPv6 interfaces to allow the NFS server and lockd to advertise on IPv6 networks. Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind protocol version 4 to contact the local rpcbind daemon. The version 4 SET/UNSET procedures allow services to register address families besides AF_INET, register at specific network interfaces, and register transport protocols besides UDP and TCP. All of this functionality is exposed via the new rpcb_v4_register() kernel API. A user-space rpcbind daemon implementation that supports version 4 of the rpcbind protocol is required in order to make use of this new API. Note that rpcbind version 3 is sufficient to support the new rpcbind facilities listed above, but most extant implementations use version 4. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:55 -04:00
Ingo Molnar	1e09481365	Merge branch 'linus' into core/softlockup Conflicts: kernel/softlockup.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 23:12:58 +02:00
Linus Torvalds	59190f4213	Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) generic-ipi: more merge fallout generic-ipi: merge fix x86, visws: use mach-default/entry_arch.h x86, visws: fix generic-ipi build generic-ipi: fixlet generic-ipi: fix s390 build bug generic-ipi: fix linux-next tree build failure fix: "smp_call_function: get rid of the unused nonatomic/retry argument" fix: "smp_call_function: get rid of the unused nonatomic/retry argument" fix "smp_call_function: get rid of the unused nonatomic/retry argument" on_each_cpu(): kill unused 'retry' parameter smp_call_function: get rid of the unused nonatomic/retry argument sh: convert to generic helpers for IPI function calls parisc: convert to generic helpers for IPI function calls mips: convert to generic helpers for IPI function calls m32r: convert to generic helpers for IPI function calls arm: convert to generic helpers for IPI function calls alpha: convert to generic helpers for IPI function calls ia64: convert to generic helpers for IPI function calls powerpc: convert to generic helpers for IPI function calls ... Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually	2008-07-15 14:12:03 -07:00
Chuck Lever	367c8c7bd9	lockd: Pass "struct sockaddr *" to new failover-by-IP function Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to allow for future support of IPv6. Also provide additional sanity checking in failover_unlock_ip() when constructing the server's IP address. As an added bonus, provide clean kerneldoc comments on related NLM interfaces which were recently added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 16:11:29 -04:00
Ingo Molnar	1a781a777b	Merge branch 'generic-ipi' into generic-ipi-for-linus Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 21:55:59 +02:00
Ingo Molnar	6c9fcaf2ee	Merge branch 'core/rcu' into core/rcu-for-linus	2008-07-15 21:10:12 +02:00
Jeff Layton	6cde4de807	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this function, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_lock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. Since nlmsvc_testlock() now just uses the caller's reference, we no longer need to get or release it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 14:53:33 -04:00
Jeff Layton	8f920d5e29	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this functions, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_testlock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. We take a reference to host in the place where nlmsvc_testlock() previous did a new lookup, so the reference counting is unchanged from before. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-15 14:26:52 -04:00
Linus Torvalds	b312bf359e	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: AHCI: Remove an unnecessary flush from ahci_qc_issue AHCI: speed up resume [libata] Add support for VPD page b1 ata: endianness annotations in pata drivers libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc [libata] sata_svw: update code comments relating to data corruption libata/ahci: enclosure management support libata: improve EH internal command timeout handling libata: use ULONG_MAX to terminate reset timeout table libata: improve EH retry delay handling libata: consistently use msecs for time durations	2008-07-15 11:18:10 -07:00
Linus Torvalds	dc221eae08	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (56 commits) i2c: Add detection capability to new-style drivers i2c: Call client_unregister for new-style devices too i2c: Clean up old chip drivers i2c-ibm_iic: Register child nodes i2c: New-style EEPROM driver using device IDs i2c: Export the i2c_bus_type symbol i2c-au1550: Fix PM support i2c-dev: Delete empty detach_client callback i2c: Drop stray references to lm_sensors i2c: Check for ACPI resource conflicts i2c-ocores: basic PM support i2c-sibyte: SWARM I2C board initialization i2c-i801: Fix handling of error conditions i2c-i801: Rename local variable temp to status i2c-i801: Properly report bus arbitration loss i2c-i801: Remove verbose debugging messages i2c-algo-pcf: Drop unused struct members i2c-algo-pcf: Multi-master lost-arbitration improvement i2c: Deprecate the legacy gpio drivers i2c-pxa: Initialize early ...	2008-07-15 11:16:05 -07:00
Linus Torvalds	98339cbd36	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits) ide-floppy: fix unfortunate function naming ide-tape: unify idetape_create_read/write_cmd ide: add ide_pc_intr() helper ide-{floppy,scsi}: read Status Register before stopping DMA engine ide-scsi: add more debugging to idescsi_pc_intr() ide-scsi: use pc->callback ide-floppy: add more debugging to idefloppy_pc_intr() ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled ide-tape: add ide_tape_io_buffers() helper ide-tape: factor out DSC handling from idetape_pc_intr() ide-{floppy,tape}: move checking of ->failed_pc to ->callback ide: add ide_issue_pc() helper ide: add PC_FLAG_DRQ_INTERRUPT pc flag ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc() ide: add ide_transfer_pc() helper ide-scsi: set drive->scsi flag for devices handled by the driver ide-{cd,floppy,tape}: remove checking for drive->scsi ide: add PC_FLAG_ZIP_DRIVE pc flag ide-tape: factor out waiting for good ireason from idetape_transfer_pc() ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc() ...	2008-07-15 11:15:36 -07:00
Bartlomiej Zolnierkiewicz	646c0cb6c4	ide: add ide_pc_intr() helper * ide-tape.c: add 'drive' argument to idetape_update_buffers(). * Add generic ide_pc_intr() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. * ide-tape.c: remove no longer needed DBG_PC_INTR. There should be no functional changes caused by this patch (unless the debugging is explicitely compiled in). Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:22:03 +02:00
Bartlomiej Zolnierkiewicz	6bf1641ca1	ide: add ide_issue_pc() helper Add generic ide_issue_pc() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:22:00 +02:00
Bartlomiej Zolnierkiewicz	28c7214bd8	ide: add PC_FLAG_DRQ_INTERRUPT pc flag Add PC_FLAG_DRQ_INTERRUPT pc flag, set it in ide_do_request() and check for it (instead of checking for IDE_FLAG_DRQ_INTERRUPT) in ide*_issue_pc(). This is a preparation for adding generic ide_issue_pc() helper. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:59 +02:00
Bartlomiej Zolnierkiewicz	594c16d8dd	ide: add ide_transfer_pc() helper * Add ide-atapi.c file for generic ATAPI support together with CONFIG_IDE_ATAPI config option. * Add generic ide_transfer_pc() helper to ide-atapi.c and then convert ide-{floppy,tape,scsi} device drivers to use it. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:58 +02:00
Bartlomiej Zolnierkiewicz	5d41893c0f	ide: add PC_FLAG_ZIP_DRIVE pc flag Add PC_FLAG_ZIP_DRIVE pc flag, set it in idefloppy_do_request() and check for it (instead of checking for IDEFLOPPY_FLAG_ZIP_DRIVE) in idefloppy_transfer_pc(). This is a preparation for adding generic ide_transfer_pc() helper. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:57 +02:00
Bartlomiej Zolnierkiewicz	5e33109582	ide-{floppy,tape}: PC_FLAG_DMA_RECOMMENDED -> PC_FLAG_DMA_OK * Use PC_FLAG_DMA_OK flag instead of PC_FLAG_DMA_RECOMMENDED one. * Remove no longer used PC_FLAG_DMA_RECOMMENDED flag. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz	1b06e92aa0	ide-{floppy,tape}: merge pc->idefloppy_callback and pc->idetape_callback Merge pc->idefloppy_callback and pc->idetape_callback into pc->callback. There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz	92f5daff2b	ide-tape: make pc->idetape_callback void There should be no functional changes caused by this patch. Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:55 +02:00
FUJITA Tomonori	63f5abb095	ide: remove action argument in ide_do_drive_cmd ide_do_drive_cmd is called only with ide_preempt action argument. So we can remove the action argument in ide_do_drive_cmd and ide_action_t typedef. This patch also includes two minor cleanups: 1) ide_do_drive_cmd always succeeds so we don't need the return value; 2) the callers use blk_rq_init before ide_do_drive_cmd so there is no need to initialize rq->errors. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:51 +02:00
Bartlomiej Zolnierkiewicz	ff07488346	ide: remove drive->ctl Remove drive->ctl (it is always equal to 0x08 after init time). While at it: * Use ATA_DEVCTL_OBS define. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz	0fd04dcc2e	ide: use ->OUTBSYNC in ide_set_irq() Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz	f8c4bd0ab2	ide: pass 'hwif ' instead of 'drive ' to ->OUTBSYNC method There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz	1357214461	ide: remove ->mmio flag from ide_hwif_t Since scc_pata host driver no longer uses IDE PCI layer / ide_dma_setup() and all other ->mmio users set also IDE_HFLAG_MMIO host flag we can safely remove ->mmio flag. There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz	ed4af48fd6	ide: move IRQ unmasking out from ->tf_load method Move IRQ unmasking out from ->tf_load method to its users. There should be no functional changes caused by this patch (SELECT_MASK() is NOP except for hpt366, icside and sgiioc4). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz	9a410e79b5	ide: remove IDE_TFLAG_NO_SELECT_MASK taskfile flag Always call SELECT_MASK(..., 0) in ide_tf_load() (needs to be done to match ide_set_irq(..., 1)) and then remove IDE_TFLAG_NO_SELECT_MASK taskfile flag. This change should only affect hpt366 and icside host drivers since ->maskproc(..., 0) for sgiioc4 is equivalent to ide_set_irq(..., 1). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz	931ee0dc5c	ide: remove obsoleted "ide=" kernel parameters * Remove obsoleted "ide=" kernel parameters. * Remove no longer needed: - ide_setup() - parse_options() - __setup("", ...) - module_param(options, ...) * Use module_{init,exit}() for MODULE=y case and remove MODULE ifdef. * Make ide_acpi and ide_doubler variables static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:47 +02:00
Bartlomiej Zolnierkiewicz	30e5ee4d1a	ide: remove obsoleted "idebus=" kernel parameter * Remove obsoleted "idebus=" kernel parameter. * Remove no longer needed ide_system_bus_speed() and system_bus_clock() (together with idebus_parameter and system_bus_speed variables). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:46 +02:00
FUJITA Tomonori	681a561b7e	block: unexport blk_end_sync_rq All the users of blk_end_sync_rq has gone (they are converted to use blk_execute_rq). This unexports blk_end_sync_rq. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:45 +02:00
FUJITA Tomonori	124cafc5eb	ide: remove ide_init_drive_cmd ide_init_drive_cmd just calls blk_rq_init. This converts the users of ide_init_drive_cmd to use blk_rq_init directly and removes ide_init_drive_cmd. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-15 21:21:44 +02:00
Linus Torvalds	61d97f4fcf	Merge branch 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: remove extraneous checks in manage.c genirq: Expose default irq affinity mask (take 3)	2008-07-15 10:39:22 -07:00
Linus Torvalds	38c46578ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: [GFS2] Fix GFS2's use of do_div() in its quota calculations [GFS2] Remove unused declaration [GFS2] Remove support for unused and pointless flag [GFS2] Replace rgrp "recent list" with mru list [GFS2] Allow local DF locks when holding a cached EX glock [GFS2] Fix delayed demote race [GFS2] don't call permission() [GFS2] Fix module building [GFS2] Glock documentation [GFS2] Remove all_list from lock_dlm [GFS2] Remove obsolete conversion deadlock avoidance code [GFS2] Remove remote lock dropping code [GFS2] kernel panic mounting volume [GFS2] Revise readpage locking [GFS2] Fix ordering of args for list_add [GFS2] trivial sparse lock annotations [GFS2] No lock_nolock [GFS2] Fix ordering bug in lock_dlm [GFS2] Clean up the glock core	2008-07-15 10:38:46 -07:00
Linus Torvalds	e7849f16c1	Merge branch 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: cputopology: always define CPU topology information, clean up cpu topology: always define CPU topology information	2008-07-15 10:32:39 -07:00
Linus Torvalds	8d2567a620	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits) ext4: Documention update for new ordered mode and delayed allocation ext4: do not set extents feature from the kernel ext4: Don't allow nonextenst mount option for large filesystem ext4: Enable delalloc by default. ext4: delayed allocation i_blocks fix for stat ext4: fix delalloc i_disksize early update issue ext4: Handle page without buffers in ext4_*_writepage() ext4: Add ordered mode support for delalloc ext4: Invert lock ordering of page_lock and transaction start in delalloc mm: Add range_cont mode for writeback ext4: delayed allocation ENOSPC handling percpu_counter: new function percpu_counter_sum_and_set ext4: Add delayed allocation support in data=writeback mode vfs: add hooks for ext4's delayed allocation support jbd2: Remove data=ordered mode support using jbd buffer heads ext4: Use new framework for data=ordered mode in JBD2 jbd2: Implement data=ordered mode handling via inodes vfs: export filemap_fdatawrite_range() ext4: Fix lock inversion in ext4_ext_truncate() ext4: Invert the locking order of page_lock and transaction start ...	2008-07-15 08:36:38 -07:00
Benzi Zbit	62a7573ee9	sdio: fix the use of hard coded timeout value. This adds reading and using of enable_timeout from the CIS Signed-off-by: Benzi Zbit <benzi.zbit@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 15:47:03 +02:00
Pierre Ossman	23af60398a	mmc: remove multiwrite capability Relax requirements on host controllers and only require that they do not report a transfer count than is larger than the actual one (i.e. a lower value is okay). This is how many other parts of the kernel behaves so upper layers should already be prepared to handle that scenario. This gives us a performance boost on MMC cards. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:49 +02:00
Tomas Winkler	6d37333163	mmc: fix sdio_io sparse errors This patch fixes sdio_io sparse errors. This fix changes signature of API functions, changing unsigned char -> u8 unsigned short -> u16 unsigned long -> u32 - this was probably a bug in 64 bit platforms Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:48 +02:00
Pierre Ossman	ad3868b2ec	mmc,sdio: helper function for transfer padding There are a lot of crappy controllers out there that cannot handle all the request sizes that the MMC/SD/SDIO specifications require. In case the card driver can pad the data to overcome the problems, this commit adds a helper that calculates how much that padding should be. A corresponding helper is also added for SDIO, but it can also deal with all the complexities of splitting up a large transfer efficiently. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:44 +02:00
Anton Vorontsov	08f80bb519	mmc: change .get_ro() callback semantics Now get_ro() callback must return 0/1 values for its logical states, and negative errno values in case of error. If particular host instance doesn't support RO/WP switch, it should return -ENOSYS. This patch changes some hosts in two ways: 1. Now functions should be smart to not return negative values in "RO asserted" case (particularly gpio_ calls could return negative values for the outermost GPIOs). Also, board code usually passes get_ro() callbacks that directly return gpioreg & bit result, so at91_mci, imxmmc, pxamci and mmc_spi's get_ro() handlers need take special care when returning platform's values to the mmc core. 2. In case of host instance didn't implement get_ro() callback, it should really return -ENOSYS and let the mmc core decide what to do about it (mmc core thinks the same way as the hosts, so it isn't functional change). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Anton Vorontsov	619ef4b421	mmc_spi: add support for card-detection polling This patch adds new platform data variable "caps", so platforms could pass theirs capabilities into MMC core (for example, platforms without interrupt on the CD line will most probably want to pass MMC_CAP_NEEDS_POLL). New platform get_cd() callback provided to optimize polling. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Anton Vorontsov	28f52482b4	mmc: add support for card-detection polling Some hosts (and boards that use mmc_spi) do not use interrupts on the CD line, so they can't trigger mmc_detect_change. We want to poll the card and see if there was a change. 1 second poll interval seems resonable. This patch also implements .get_cd() host operation, that could be used by the hosts that are able to report card-detect status without need to talk MMC. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Adrian Bunk	150a55683b	include/linux/mmc/mmc.h: remove CVS tags This patch removes a CVS tag that wasn't updated for a long time. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:41 +02:00
Pierre Ossman	4489428ab5	sdhci: support JMicron secondary interface JMicron chips sometimes have two interfaces to work around limitations in Microsoft's sdhci driver. This patch allows us to use either interface. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-07-15 14:14:40 +02:00
David S. Miller	e308a5d806	netdev: Add netdev->addr_list_lock protection. Add netif_addr_{lock,unlock}{,_bh}() helpers. Use them to protect operations that operate on or read the network device unicast and multicast address lists. Also use them in cases where the code simply wants to block calls into the driver's ->set_rx_mode() and ->set_multicast_list() methods. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:13:44 -07:00
David S. Miller	f1f28aa351	netdev: Add addr_list_lock to struct net_device. This will be used to protect the per-device unicast and multicast address lists, as well as the callbacks into the drivers which configure such state such as ->set_rx_mode() and ->set_multicast_list(). Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:08:33 -07:00
Ron Livne	521e575b9a	IB/mlx4: Add support for blocking multicast loopback packets Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Patrick McHardy	393e52e33c	packet: deliver VLAN TCI to userspace Store the VLAN tag in the auxillary data/tpacket2_hdr so userspace can properly deal with hardware VLAN tagging/stripping. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:50:39 -07:00
Patrick McHardy	bbd6ef87c5	packet: support extensible, 64 bit clean mmaped ring structure The tpacket_hdr is not 64 bit clean due to use of an unsigned long and can't be extended because the following struct sockaddr_ll needs to be at a fixed offset. Add support for a version 2 tpacket protocol that removes these limitations. Userspace can query the header size through a new getsockopt option and change the protocol version through a setsockopt option. The changes needed to switch to the new protocol version are: 1. replace struct tpacket_hdr by struct tpacket2_hdr 2. query header len and save 3. set protocol version to 2 - set up ring as usual 4. for getting the sockaddr_ll, use (void )hdr + TPACKET_ALIGN(hdrlen) instead of (void )hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr)) Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:50:15 -07:00
Patrick McHardy	bc1d0411b8	vlan: deliver packets received with VLAN acceleration to network taps When VLAN header stripping is used, packets currently bypass packet sockets (and other network taps) completely. For locally existing VLANs, they appear directly on the VLAN device, for unknown VLANs they are silently dropped. Add a new function netif_nit_deliver() to deliver incoming packets to all network interface taps and use it in __vlan_hwaccel_rx() to make VLAN packets visible on the underlying device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:30 -07:00
Patrick McHardy	6aa895b047	vlan: Don't store VLAN tag in cb Use a real skb member to store the skb to avoid clashes with qdiscs, which are allowed to use the cb area themselves. As currently only real devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit invalidation is neccessary. The new member fills a hole on 64 bit, the skb layout changes from: __u32 mark; /* 172 4 / sk_buff_data_t transport_header; / 176 4 / sk_buff_data_t network_header; / 180 4 / sk_buff_data_t mac_header; / 184 4 / sk_buff_data_t tail; / 188 4 / / --- cacheline 3 boundary (192 bytes) --- / sk_buff_data_t end; / 192 4 / / XXX 4 bytes hole, try to pack / to __u32 mark; / 172 4 / __u16 vlan_tci; / 176 2 / / XXX 2 bytes hole, try to pack / sk_buff_data_t transport_header; / 180 4 / sk_buff_data_t network_header; / 184 4 */ Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:06 -07:00
Benjamin Herrenschmidt	43d2548bb2	Merge commit '85082fd7cbe3173198aac0eb5e85ab1edcc6352c' into test-build Manual fixup of: arch/powerpc/Kconfig	2008-07-15 15:44:51 +10:00
David S. Miller	925068dcdc	Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-07-14 22:30:17 -07:00
Max Krasnyansky	f271b2cc78	tun: Fix/rewrite packet filtering logic Please see the following thread to get some context on this http://marc.info/?l=linux-netdev&m=121564433018903&w=2 Basically the issue is that current multi-cast filtering stuff in the TUN/TAP driver is seriously broken. Original patch went in without proper review and ACK. It was broken and confusing to start with and subsequent patches broke it completely. To give you an idea of what's broken here are some of the issues: - Very confusing comments throughout the code that imply that the character device is a network interface in its own right, and that packets are passed between the two nics. Which is completely wrong. - Wrong set of ioctls is used for setting up filters. They look like shortcuts for manipulating state of the tun/tap network interface but in reality manipulate the state of the TX filter. - ioctls that were originally used for setting address of the the TX filter got "fixed" and now set the address of the network interface itself. Which made filter totaly useless. - Filtering is done too late. Instead of filtering early on, to avoid unnecessary wakeups, filtering is done in the read() call. The list goes on and on :) So the patch cleans all that up. It introduces simple and clean interface for setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering before enqueuing the packets. TX filtering is useful in the scenarios where TAP is part of a bridge, in which case it gets all broadcast, multicast and potentially other packets when the bridge is learning. So for example Ethernet tunnelling app may want to setup TX filters to avoid tunnelling multicast traffic. QEMU and other hypervisors can push RX filtering that is currently done in the guest into the host context therefore saving wakeups and unnecessary data transfer. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:18:19 -07:00
David S. Miller	fc943b12e4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2008-07-14 20:40:34 -07:00
Patrick McHardy	72d9794f44	net-sched: cls_flow: add perturbation support Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 20:36:32 -07:00
David S. Miller	0c4c8cae44	Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6	2008-07-14 20:32:07 -07:00
David S. Miller	2aec609fb4	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/netfilter/nf_conntrack_proto_tcp.c	2008-07-14 20:23:54 -07:00
Linus Torvalds	5a86102248	Merge branch 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6 * 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6: (64 commits) firmware: convert sb16_csp driver to use firmware loader exclusively dsp56k: use request_firmware edgeport-ti: use request_firmware() edgeport: use request_firmware() vicam: use request_firmware() dabusb: use request_firmware() cpia2: use request_firmware() ip2: use request_firmware() firmware: convert Ambassador ATM driver to request_firmware() whiteheat: use request_firmware() ti_usb_3410_5052: use request_firmware() emi62: use request_firmware() emi26: use request_firmware() keyspan_pda: use request_firmware() keyspan: use request_firmware() ttusb-budget: use request_firmware() kaweth: use request_firmware() smctr: use request_firmware() firmware: convert ymfpci driver to use firmware loader exclusively firmware: convert maestro3 driver to use firmware loader exclusively ... Fix up trivial conflicts with BKL removal in drivers/char/dsp56k.c and drivers/char/ip2/ip2main.c manually.	2008-07-14 16:54:07 -07:00
Linus Torvalds	85082fd7cb	Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm * 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (241 commits) [ARM] 5171/1: ep93xx: fix compilation of modules using clocks [ARM] 5133/2: at91sam9g20 defconfig file [ARM] 5130/4: Support for the at91sam9g20 [ARM] 5160/1: IOP3XX: gpio/gpiolib support [ARM] at91: Fix NAND FLASH timings for at91sam9x evaluation kits. [ARM] 5084/1: zylonite: Register AC97 device [ARM] 5085/2: PXA: Move AC97 over to the new central device declaration model [ARM] 5120/1: pxa: correct platform driver names for PXA25x and PXA27x UDC drivers [ARM] 5147/1: pxaficp_ir: drop pxa_gpio_mode calls, as pin setting [ARM] 5145/1: PXA2xx: provide api to control IrDA pins state [ARM] 5144/1: pxaficp_ir: cleanup includes [ARM] pxa: remove pxa_set_cken() [ARM] pxa: allow clk aliases [ARM] Feroceon: don't disable BPU on boot [ARM] Orion: LED support for HP mv2120 [ARM] Orion: add RD88F5181L-FXO support [ARM] Orion: add RD88F5181L-GE support [ARM] Orion: add Netgear WNR854T support [ARM] s3c2410_defconfig: update for current build [ARM] Acer n30: Minor style and indentation fixes. ...	2008-07-14 16:06:58 -07:00
David Woodhouse	751851af7a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git Conflicts: sound/pci/Kconfig	2008-07-14 15:51:11 -07:00
Russell King	53ffe3b440	[ARM] Merge most of the PXA work for initial merge This includes PXA work up to the SPI changes for the initial merge, since `e172274ccc` depends on the SPI tree being merged. Conflicts: arch/arm/configs/em_x270_defconfig arch/arm/configs/xm_x270_defconfig	2008-07-14 23:34:46 +01:00
Linus Torvalds	666484f025	Merge branch 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softirq: remove irqs_disabled warning from local_bh_enable softirq: remove initialization of static per-cpu variable Remove argument from open_softirq which is always NULL	2008-07-14 15:28:42 -07:00
Linus Torvalds	4bb0057f99	Merge branch 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, generic: mark early_printk as asmlinkage printk: export console_drivers printk: remember the message level for multi-line output printk: refactor processing of line severity tokens printk: don't prefer unsuited consoles on registration printk: clean up recursion check related static variables namespacecheck: more kernel/printk.c fixes namespacecheck: fix kernel printk.c	2008-07-14 15:27:43 -07:00
Linus Torvalds	40e7babbb5	Merge branch 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: fix kernel/fork.c warning lockdep: fix ftrace irq tracing false positive lockdep: remove duplicate definition of STATIC_LOCKDEP_MAP_INIT lockdep: add lock_class information to lock_chain and output it lockdep: add lock_class information to lock_chain and output it lockdep: output lock_class key instead of address for forward dependency output __mutex_lock_common: use signal_pending_state() mutex-debug: check mutex magic before owner Fixed up conflict in kernel/fork.c manually	2008-07-14 14:55:13 -07:00
Linus Torvalds	948769a5ba	Merge branch 'sched/new-API-sched_setscheduler' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched/new-API-sched_setscheduler' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: add new API sched_setscheduler_nocheck: add a flag to control access checks	2008-07-14 14:50:49 -07:00
Linus Torvalds	e18425a0ab	Merge branch 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits) ftrace: build fix for ftraced_suspend ftrace: separate out the function enabled variable ftrace: add ftrace_kill_atomic ftrace: use current CPU for function startup ftrace: start wakeup tracing after setting function tracer ftrace: check proper config for preempt type ftrace: trace schedule ftrace: define function trace nop ftrace: move sched_switch enable after markers ftrace: prevent ftrace modifications while being kprobe'd, v2 fix "ftrace: store mcount address in rec->ip" mmiotrace broken in linux-next (8-bit writes only) ftrace: avoid modifying kprobe'd records ftrace: freeze kprobe'd records kprobes: enable clean usage of get_kprobe ftrace: store mcount address in rec->ip ftrace: build fix with gcc 4.3 namespacecheck: fixes ftrace: fix "notrace" filtering priority ftrace: fix printout ...	2008-07-14 14:49:54 -07:00
Linus Torvalds	d1794f2c5b	Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6 * 'bkl-removal' of git://git.lwn.net/linux-2.6: (146 commits) IB/umad: BKL is not needed for ib_umad_open() IB/uverbs: BKL is not needed for ib_uverbs_open() bf561-coreb: BKL unneeded for open() Call fasync() functions without the BKL snd/PCM: fasync BKL pushdown ipmi: fasync BKL pushdown ecryptfs: fasync BKL pushdown Bluetooth VHCI: fasync BKL pushdown tty_io: fasync BKL pushdown tun: fasync BKL pushdown i2o: fasync BKL pushdown mpt: fasync BKL pushdown Remove BKL from remote_llseek v2 Make FAT users happier by not deadlocking x86-mce: BKL pushdown vmwatchdog: BKL pushdown vmcp: BKL pushdown via-pmu: BKL pushdown uml-random: BKL pushdown uml-mmapper: BKL pushdown ...	2008-07-14 14:48:31 -07:00
Stephen Rothwell	c300bd2fb5	PCI: include linux/pm_wakeup.h for device_set_wakeup_capable drivers/pci/pci.c needs pm_wakeup.h since it uses device_set_wakup_capable(). The latter also needs to be stubbed out for !CONFIG_PM. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-14 14:30:21 -07:00
Jonathan Corbet	2fceef397f	Merge commit 'v2.6.26' into bkl-removal	2008-07-14 15:29:34 -06:00
Joel Becker	11c3b79218	configfs: Allow ->make_item() and ->make_group() to return detailed errors. The configfs operations ->make_item() and ->make_group() currently return a new item/group. A return of NULL signifies an error. Because of this, -ENOMEM is the only return code bubbled up the stack. Multiple folks have requested the ability to return specific error codes when these operations fail. This patch adds that ability by changing the ->make_item/group() ops to return an int. Also updated are the in-kernel users of configfs. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2008-07-14 13:57:16 -07:00
Linus Torvalds	17489c058e	Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits) sched_clock: and multiplier for TSC to gtod drift sched_clock: record TSC after gtod sched_clock: only update deltas with local reads. sched_clock: fix calculation of other CPU sched_clock: stop maximum check on NO HZ sched_clock: widen the max and min time sched_clock: record from last tick sched: fix accounting in task delay accounting & migration sched: add avg-overlap support to RT tasks sched: terminate newidle balancing once at least one task has moved over sched: fix warning sched: build fix sched: sched_clock_cpu() based cpu_clock(), lockdep fix sched: export cpu_clock sched: make sched_{rt,fair}.c ifdefs more readable sched: bias effective_load() error towards failing wake_affine(). sched: incremental effective_load() sched: correct wakeup weight calculations sched: fix mult overflow sched: update shares on wakeup ...	2008-07-14 13:54:49 -07:00
Linus Torvalds	a3da5bf84a	Merge branch 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (821 commits) x86: make 64bit hpet_set_mapping to use ioremap too, v2 x86: get x86_phys_bits early x86: max_low_pfn_mapped fix #4 x86: change _node_to_cpumask_ptr to return const ptr x86: I/O APIC: remove an IRQ2-mask hack x86: fix numaq_tsc_disable calling x86, e820: remove end_user_pfn x86: max_low_pfn_mapped fix, #3 x86: max_low_pfn_mapped fix, #2 x86: max_low_pfn_mapped fix, #1 x86_64: fix delayed signals x86: remove conflicting nx6325 and nx6125 quirks x86: Recover timer_ack lost in the merge of the NMI watchdog x86: I/O APIC: Never configure IRQ2 x86: L-APIC: Always fully configure IRQ0 x86: L-APIC: Set IRQ0 as edge-triggered x86: merge dwarf2 headers x86: use AS_CFI instead of UNWIND_INFO x86: use ignore macro instead of hash comment x86: use matching CFI_ENDPROC ...	2008-07-14 13:43:24 -07:00
Linus Torvalds	3b23e665b6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (50 commits) crypto: ixp4xx - Select CRYPTO_AUTHENC crypto: s390 - Respect STFL bit crypto: talitos - Add support for sha256 and md5 variants crypto: hash - Move ahash functions into crypto/hash.h crypto: crc32c - Add ahash implementation crypto: hash - Added scatter list walking helper crypto: prng - Deterministic CPRNG crypto: hash - Removed vestigial ahash fields crypto: hash - Fixed digest size check crypto: rmd - sparse annotations crypto: rmd128 - sparse annotations crypto: camellia - Use kernel-provided bitops, unaligned access helpers crypto: talitos - Use proper form for algorithm driver names crypto: talitos - Add support for 3des crypto: padlock - Make module loading quieter when hardware isn't available crypto: tcrpyt - Remove unnecessary kmap/kunmap calls crypto: ixp4xx - Hardware crypto support for IXP4xx CPUs crypto: talitos - Freescale integrated security engine (SEC) driver [CRYPTO] tcrypt: Add self test for des3_ebe cipher operating in cbc mode [CRYPTO] rmd: Use pointer form of endian swapping operations ...	2008-07-14 13:40:42 -07:00
Jean Delvare	4735c98f84	i2c: Add detection capability to new-style drivers Add a mechanism to let new-style i2c drivers optionally autodetect devices they would support on selected buses and ask i2c-core to instantiate them. This is a replacement for legacy i2c drivers, much cleaner. Where drivers had to implement both a legacy i2c_driver and a new-style i2c_driver so far, this mechanism makes it possible to get rid of the legacy i2c_driver and implement both enumerated and detected device support with just one (new-style) i2c_driver. Here is a quick conversion guide for these drivers, step by step: * Delete the legacy driver definition, registration and removal. Delete the attach_adapter and detach_client methods of the legacy driver. * Change the prototype of the legacy detect function from static int foo_detect(struct i2c_adapter adapter, int address, int kind); to static int foo_detect(struct i2c_client client, int kind, struct i2c_board_info info); Set the new-style driver detect callback to this new function, and set its address_data to &addr_data (addr_data is generally provided by I2C_CLIENT_INSMOD.) * Add the appropriate class to the new-style driver. This is typically the class the legacy attach_adapter method was checking for. Class checking is now mandatory (done by i2c-core.) See <linux/i2c.h> for the list of available classes. * Remove the i2c_client allocation and freeing from the detect function. A pre-allocated client is now handed to you by i2c-core, and is freed automatically. * Make the detect function fill the type field of the i2c_board_info structure it was passed as a parameter, and return 0, on success. If the detection fails, return -ENODEV. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:36 +02:00
Wolfram Sang	2b7a5056a0	i2c: New-style EEPROM driver using device IDs Add a new-style driver for most I2C EEPROMs, giving sysfs read/write access to their data. Tested with various chips and clock rates. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:35 +02:00
Jon Smirl	e9ca9eb9d7	i2c: Export the i2c_bus_type symbol Export the root of the i2c bus so that PowerPC device tree code can iterate over devices on the i2c bus. Signed-off-by: Jon Smirl <jonsmirl@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:35 +02:00
Jean Delvare	f6a7110520	i2c-dev: Delete empty detach_client callback Implementing detach_client is optional, so there is no point in an empty implementation. Likewise, i2c driver IDs are optional, and we don't need one. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:34 +02:00
Jean Delvare	e3e7fc3c40	i2c-algo-pcf: Drop unused struct members Struct members udelay and timeout aren't used anywhere, so drop them. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Eric Brower <ebrower@gmail.com>	2008-07-14 22:38:31 +02:00
Eric Brower	0573d11b2b	i2c-algo-pcf: Multi-master lost-arbitration improvement Improve lost-arbitration handling of PCF8584. This is necessary for support of a currently out-of-kernel driver for Sun Microsystems E250 environmental management; perhaps others. Signed-off-by: Eric Brower <ebrower@gmail.com> Acked-by: Dan Smolik <marvin@mydatex.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:31 +02:00
Jean Delvare	3401b2fff3	i2c: Let bus drivers add SPD to their class Let general purpose I2C/SMBus bus drivers add SPD to their class. Once this is done, we will be able to tell the eeprom driver to only probe for SPD EEPROMs and similar on these buses. Note that I took a conservative approach here, adding I2C_CLASS_SPD to many drivers that have no idea whether they can host SPD EEPROMs or not. This is to make sure that the eeprom driver doesn't stop probing buses where SPD EEPROMs or equivalent live. So, bus driver maintainers and users should feel free to remove the SPD class from drivers those buses never have SPD EEPROMs or they don't want the eeprom driver to bind to them. Likewise, feel free to add the SPD class to any bus driver I might have missed. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:29 +02:00
Jean Delvare	c1b6b4f234	i2c: Let framebuffer drivers set their I2C bus class to DDC Let framebuffer drivers set their I2C bus class to DDC. Once this is done, we will be able to tell the eeprom driver to only probe for EDID EEPROMs on these buses. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:28 +02:00
Jean Delvare	ae7193f7fa	i2c: Update stray references to smbus_access That function is actually named i2c_smbus_xfer. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:24 +02:00
Jean Delvare	67c2e66571	i2c: Delete unused function i2c_smbus_write_quick Function i2c_smbus_write_quick has no users left, so we can delete it. Also update the list of these helper functions which are gone but could be added back if needed. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:23 +02:00
Adrian Bunk	20a9b6e7c3	i2c: Remove 3 deprecated bus drivers This patch contains the scheduled removal of i2c-i810, i2c-prosavage and i2c-savage4. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-14 22:38:22 +02:00
Linus Torvalds	847106ff62	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (25 commits) security: remove register_security hook security: remove dummy module fix security: remove dummy module security: remove unused sb_get_mnt_opts hook LSM/SELinux: show LSM mount options in /proc/mounts SELinux: allow fstype unknown to policy to use xattrs if present security: fix return of void-valued expressions SELinux: use do_each_thread as a proper do/while block SELinux: remove unused and shadowed addrlen variable SELinux: more user friendly unknown handling printk selinux: change handling of invalid classes (Was: Re: 2.6.26-rc5-mm1 selinux whine) SELinux: drop load_mutex in security_load_policy SELinux: fix off by 1 reference of class_to_string in context_struct_compute_av SELinux: open code sidtab lock SELinux: open code load_mutex SELinux: open code policy_rwlock selinux: fix endianness bug in network node address handling selinux: simplify ioctl checking SELinux: enable processes with mac_admin to get the raw inode contexts Security: split proc ptrace checking into read vs. attach ...	2008-07-14 13:36:55 -07:00
Linus Torvalds	b7f80afa28	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (71 commits) [S390] sclp_tty: Fix scheduling while atomic bug. [S390] sclp_tty: remove ioctl interface. [S390] Remove P390 support. [S390] Cleanup vmcp printk messages. [S390] Cleanup lcs printk messages. [S390] Cleanup kprobes printk messages. [S390] Cleanup vmwatch printk messages. [S390] Cleanup dcssblk printk messages. [S390] Cleanup zfcp dumper printk messages. [S390] Cleanup vmlogrdr printk messages. [S390] Cleanup s390 debug feature print messages. [S390] Cleanup monreader printk messages. [S390] Cleanup appldata printk messages. [S390] Cleanup smsgiucv printk messages. [S390] Cleanup cpacf printk messages. [S390] Cleanup qeth print messages. [S390] Cleanup netiucv printk messages. [S390] Cleanup iucv printk messages. [S390] Cleanup sclp printk messages. [S390] Cleanup zcrypt printk messages. ...	2008-07-14 13:25:01 -07:00
Linus Torvalds	dddec01eb8	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (37 commits) splice: fix generic_file_splice_read() race with page invalidation ramfs: enable splice write drivers/block/pktcdvd.c: avoid useless memset cdrom: revert commit `22a9189` (cdrom: use kmalloced buffers instead of buffers on stack) scsi: sr avoids useless buffer allocation block: blk_rq_map_kern uses the bounce buffers for stack buffers block: add blk_queue_update_dma_pad DAC960: push down BKL pktcdvd: push BKL down into driver paride: push ioctl down into driver block: use get_unaligned_* helpers block: extend queue_flag bitops block: request_module(): use format string Add bvec_merge_data to handle stacked devices and ->merge_bvec() block: integrity flags can't use bit ops on unsigned short cmdfilter: extend default read filter sg: fix odd style (extra parenthesis) introduced by cmd filter patch block: add bounce support to blk_rq_map_user_iov cfq-iosched: get rid of enable_idle being unused warning allow userspace to modify scsi command filter on per device basis ...	2008-07-14 13:15:14 -07:00
Kristen Carlson Accardi	18f7ba4c2f	libata/ahci: enclosure management support Add Enclosure Management support to libata and ahci. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-14 15:59:33 -04:00
Tejun Heo	87fbc5a060	libata: improve EH internal command timeout handling ATA_TMOUT_INTERNAL which was 30secs were used for all internal commands which is way too long when something goes wrong. This patch implements command type based stepped timeouts. Different command types can use different timeouts and each command type can use different timeout values after timeouts. ie. the initial timeout is set to a value which should cover most of the cases but not too long so that run away cases don't delay things too much. After the first try times out, the second try can use longer timeout and if that one times out too, it can go for full 30sec timeout. IDENTIFYs use 5s - 10s - 30s timeout and all other commands use 5s - 10s timeouts. This patch significantly cuts down the needed time to handle failure cases while still allowing libata to work with nut job devices through retries. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-14 15:59:32 -04:00
Tejun Heo	0a2c0f5615	libata: improve EH retry delay handling EH retries were delayed by 5 seconds to ensure that resets don't occur back-to-back. However, this 5 second delay is superflous or excessive in many cases. For example, after IDENTIFY times out, there's no reason to wait five more seconds before retrying. This patch adds ehc->last_reset timestamp and record the timestamp for the last reset trial or success and uses it to space resets by ATA_EH_RESET_COOL_DOWN which is 5 secs and removes unconditional 5 sec sleeps. As this change makes inter-try waits often shorter and they're redundant in nature, this patch also removes the "retrying..." messages. While at it, convert explicit rounding up division to DIV_ROUND_UP(). This change speeds up EH in many cases w/o sacrificing robustness. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-14 15:59:32 -04:00
Tejun Heo	341c2c958e	libata: consistently use msecs for time durations libata has been using mix of jiffies and msecs for time druations. This is getting confusing. As writing sub HZ values in jiffies is PITA and msecs_to_jiffies() can't be used as initializer, unify unit for all time durations to msecs. So, durations are in msecs and deadlines are in jiffies. ata_deadline() is added to compute deadline from a start time and duration in msecs. While at it, drop now superflous _msec suffix from arguments and rename @timeout to @deadline if it represents a fixed point in time rather than duration. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-14 15:59:32 -04:00
Michael Buesch	9c0c7a429a	ssb: Include dma-mapping.h ssb.h implements DMA mapping functions, so it should include dma-mapping.h. This fixes compile failures on certain architectures. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-14 14:52:56 -04:00
Artem Bityutskiy	4ee6afd344	VFS: export sync_sb_inodes This patch exports the 'sync_sb_inodes()' which is needed for UBIFS because it has to force write-back from time to time. Namely, the UBIFS budgeting subsystem forces write-back when its pessimistic callculations show that there is no free space on the media. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-14 19:10:52 +03:00
Ingo Molnar	5806b81ac1	Merge branch 'auto-ftrace-next' into tracing/for-linus Conflicts: arch/x86/kernel/entry_32.S arch/x86/kernel/process_32.c arch/x86/kernel/process_64.c arch/x86/lib/Makefile include/asm-x86/irqflags.h kernel/Makefile kernel/sched.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-14 16:11:52 +02:00
Ingo Molnar	d14c8a680c	Merge branch 'sched/for-linus' into tracing/for-linus	2008-07-14 16:11:02 +02:00
Ingo Molnar	6712e299b7	Merge branch 'tracing/ftrace' into auto-ftrace-next	2008-07-14 15:58:35 +02:00
Kumar Gala	2f3804edf9	powerpc/85xx: Add support for MPC8536DS Add support for the MPC8536 process and MPC8536DS reference board. The MPC8536 is an e500v2 based SoC which eTSEC, USB, SATA, PCI, and PCIe. The USB and SATA IP blocks are similiar to those on the PQ2 Pro SoCs and thus use the same drivers. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-07-14 07:55:37 -05:00
Ingo Molnar	361833efac	Merge branch 'sched/clock' into sched/devel	2008-07-14 12:19:13 +02:00
Ingo Molnar	b4ba0ba24b	Merge commit 'v2.6.26' into core/locking	2008-07-14 10:31:59 +02:00
Cornelia Huck	f08adc008d	[S390] css: Use css_device_id for bus matching. css_device_id exists, so use it for determining the right driver (and add a match_flags which is always 1 for valid types). Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-07-14 10:02:12 +02:00
Cornelia Huck	7e9db9eaef	[S390] cio: Introduce modalias for css bus. Add modalias and subchannel type attributes for all subchannels. I/O subchannel specific attributes are now created in io_subchannel_probe(). modalias and subchannel type are also added to the uevent for the css bus. Also make the css modalias known. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-07-14 10:02:05 +02:00
James Morris	6f0f0fd496	security: remove register_security hook The register security hook is no longer required, as the capability module is always registered. LSMs wishing to stack capability as a secondary module should do so explicitly. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-14 15:04:06 +10:00
Miklos Szeredi	b478a9f988	security: remove unused sb_get_mnt_opts hook The sb_get_mnt_opts() hook is unused, and is superseded by the sb_show_options() hook. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: James Morris <jmorris@namei.org>	2008-07-14 15:02:05 +10:00
Eric Paris	2069f45784	LSM/SELinux: show LSM mount options in /proc/mounts This patch causes SELinux mount options to show up in /proc/mounts. As with other code in the area seq_put errors are ignored. Other LSM's will not have their mount options displayed until they fill in their own security_sb_show_options() function. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: James Morris <jmorris@namei.org>	2008-07-14 15:02:05 +10:00
Stephen Smalley	006ebb40d3	Security: split proc ptrace checking into read vs. attach Enable security modules to distinguish reading of process state via proc from full ptrace access by renaming ptrace_may_attach to ptrace_may_access and adding a mode argument indicating whether only read access or full attach access is requested. This allows security modules to permit access to reading process state without granting full ptrace access. The base DAC/capability checking remains unchanged. Read access to /proc/pid/mem continues to apply a full ptrace attach check since check_mem_permission() already requires the current task to already be ptracing the target. The other ptrace checks within proc for elements like environ, maps, and fds are changed to pass the read mode instead of attach. In the SELinux case, we model such reading of process state as a reading of a proc file labeled with the target process' label. This enables SELinux policy to permit such reading of process state without permitting control or manipulation of the target process, as there are a number of cases where programs probe for such information via proc but do not need to be able to control the target (e.g. procps, lsof, PolicyKit, ConsoleKit). At present we have to choose between allowing full ptrace in policy (more permissive than required/desired) or breaking functionality (or in some cases just silencing the denials via dontaudit rules but this can hide genuine attacks). This version of the patch incorporates comments from Casey Schaufler (change/replace existing ptrace_may_attach interface, pass access mode), and Chris Wright (provide greater consistency in the checking). Note that like their predecessors __ptrace_may_attach and ptrace_may_attach, the __ptrace_may_access and ptrace_may_access interfaces use different return value conventions from each other (0 or -errno vs. 1 or 0). I retained this difference to avoid any changes to the caller logic but made the difference clearer by changing the latter interface to return a bool rather than an int and by adding a comment about it to ptrace.h for any future callers. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-07-14 15:01:47 +10:00
Benjamin Herrenschmidt	11c2d8174e	Merge commit 'origin/HEAD' into test-merge Manual fixup of include/asm-powerpc/pgtable-ppc64.h	2008-07-14 14:29:49 +10:00
Richard Kennedy	afc1246f91	file lock: reorder struct file_lock to save space on 64 bit builds Reduce sizeof struct file_lock by 8 on 64 bit builds allowing +1 objects per slab in the file_lock_cache Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-13 17:22:24 -04:00
Russell King	044e5f45e4	Merge branch 'pxa' into devel Conflicts: arch/arm/configs/em_x270_defconfig arch/arm/configs/xm_x270_defconfig	2008-07-13 12:05:49 +01:00
Gerrit Renker	5b5d0e7048	dccp: Upgrade NDP count from 3 to 6 bytes RFC 4340, 7.7 specifies up to 6 bytes for the NDP Count option, whereas the code is currently limited to up to 3 bytes. This seems to be a relict of an earlier draft version and is brought up to date by the patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>	2008-07-13 11:51:40 +01:00
Ingo Molnar	54ef76f37b	Merge branch 'linus' into sched/devel	2008-07-13 08:50:13 +02:00
Eric Miao	52256c0e06	[NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data Now that the original SMC_USE_PXA_DMA specific code will always being built if CONFIG_ARCH_PXA is defined, so to make this part of the code to be PXA public, and still prevent it from being built if support of PXA is not selected. A SMC91X_USE_DMA flag is added to the platform data to allow platform to choose its usage of DMA. Note this flag itself is so named to be generic enough (assuming other platforms can also use DMA). It keeps backward compatibility to set the SMC91X_USE_DMA flag if SMC_USE_PXA_DMA is still defined. Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Nicolas Pitre <nico@cam.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-12 21:52:41 +01:00
Eric Miao	159198862a	[NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable Now one can use the following code #define SMC_IO_SHIFT lp->io_shift to make SMC_IO_SHIFT a variable. This, however, will slightly increase the CPU overhead and have negative impact on the network performance. The tradeoff is, this can be specified in the smc91x platform data so that multiple boards support can be built in a single zImage. Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Nicolas Pitre <nico@cam.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-12 21:52:40 +01:00
Eric Miao	c4f0e76747	[NET] smc91x: add SMC91X_NOWAIT flag to platform data And also favors the usage of SMC91X_NOWAIT over the hardcoded SMC_NOWAIT by converting "nowait" (module parameter overridable) to platform flag. There are several possibilities: 1. platform data present - preferred and use as is 2. platform data absent - use "nowait", it can be: a. SMC_NOWAIT if defined b. default to 0 if SMC_NOWAIT isn't defined c. overriden by module parameter Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Nicolas Pitre <nico@cam.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-12 21:52:40 +01:00
Eric Miao	d280eadc4f	[NET] smc91x: remove "irq_flags" from "struct smc91x_platdata" IRQ trigger type can be specified in the IRQ resource definition by IORESOURCE_IRQ_, we need only one way to specify this. This also fixes the following small issue: To allow dynamic support for multiple platforms, when those relevant macros are not defined for one specific platform, the default case will be: - SMC_DYNAMIC_BUS_CONFIG defined - and SMC_IRQ_FLAGS = IRQF_TRIGGER_RISING While if "irq_flags" is missing when defining the smc91x_platdata, usually as follows: static struct smc91x_platdata xxxx_smc91x_data = { .flags = SMC91X_USE_XXBIT, }; The lp->cfg.irq_flags will always be overriden by the above structure (due to a memcpy), thus rendering lp->cfg.irq_flags to be "0" always. (regardless of the default SMC_IRQ_FLAGS or IORESOURCE_IRQ_ flags) Fixes this by forcing to use IORESOURCE_IRQ_* flags if present, and make the only user of smc91x_platdata.irq_flags (renesas/migor) to use IORESOURCE_IRQ_*. Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Nicolas Pitre <nico@cam.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-12 21:52:40 +01:00
Russell King	7fecc34e07	Merge branch 'pxa-tosa' into pxa Conflicts: arch/arm/mach-pxa/Kconfig arch/arm/mach-pxa/tosa.c arch/arm/mach-pxa/spitz.c	2008-07-12 21:43:01 +01:00
Martin K. Petersen	f11f594edb	[SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC The SCSI Block Protocol uses this 16-bit CRC to verify the integrity of each data sector. crc_t10dif() is used by sd_dif.c when performing I/O to or from disks formatted with protection information. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:32 -05:00
Suresh Siddha	75c46fa61b	x64, x2apic/intr-remap: MSI and MSI-X support for interrupt remapping infrastructure MSI and MSI-X support for interrupt remapping infrastructure. MSI address register will be programmed with interrupt-remapping table entry(IRTE) index and the IRTE will contain information about the vector, cpu destination, etc. For MSI-X, all the IRTE's will be consecutively allocated in the table, and the address registers will contain the starting index to the block and the data register will contain the subindex with in that block. This also introduces a new irq_chip for cleaner irq migration (in the process context as opposed to the current irq migration in the context of an interrupt. interrupt-remapping infrastructure will help us achieve this). As MSI is edge triggered, irq migration is a simple atomic update(of vector and cpu destination) of IRTE and flushing the hardware cache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:45:05 +02:00
Suresh Siddha	89027d35aa	x64, x2apic/intr-remap: IO-APIC support for interrupt-remapping IO-APIC support in the presence of interrupt-remapping infrastructure. IO-APIC RTE will be programmed with interrupt-remapping table entry(IRTE) index and the IRTE will contain information about the vector, cpu destination, trigger mode etc, which traditionally was present in the IO-APIC RTE. Introduce a new irq_chip for cleaner irq migration (in the process context as opposed to the current irq migration in the context of an interrupt. interrupt-remapping infrastructure will help us achieve this cleanly). For edge triggered, irq migration is a simple atomic update(of vector and cpu destination) of IRTE and flush the hardware cache. For level triggered, we need to modify the io-apic RTE aswell with the update vector information, along with modifying IRTE with vector and cpu destination. So irq migration for level triggered is little bit more complex compared to edge triggered migration. But the good news is, we use the same algorithm for level triggered migration as we have today, only difference being, we now initiate the irq migration from process context instead of the interrupt context. In future, when we do a directed EOI (combined with cpu EOI broadcast suppression) to the IO-APIC, level triggered irq migration will also be as simple as edge triggered migration and we can do the irq migration with a simple atomic update to IO-APIC RTE. TBD: some tests/changes needed in the presence of fixup_irqs() for level triggered irq migration. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:45:05 +02:00
Suresh Siddha	72b1e22dfc	x64, x2apic/intr-remap: generic irq migration support from process context Generic infrastructure for migrating the irq from the process context in the presence of CONFIG_GENERIC_PENDING_IRQ. This will be used later for migrating irq in the presence of interrupt-remapping. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:44:55 +02:00
Suresh Siddha	b6fcb33ad6	x64, x2apic/intr-remap: routines managing Interrupt remapping table entries. Routines handling the management of interrupt remapping table entries. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:44:54 +02:00
Suresh Siddha	2ae2101069	x64, x2apic/intr-remap: Interrupt remapping infrastructure Interrupt remapping (part of Intel Virtualization Tech for directed I/O) infrastructure. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:44:53 +02:00
Suresh Siddha	ad3ad3f6a2	x64, x2apic/intr-remap: parse ioapic scope under vt-d structures Parse the vt-d device scope structures to find the mapping between IO-APICs and the interrupt remapping hardware units. This will be used later for enabling Interrupt-remapping for IOAPIC devices. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:44:50 +02:00
Suresh Siddha	1886e8a90a	x64, x2apic/intr-remap: code re-structuring, to be used by both DMA and Interrupt remapping Allocate the iommu during the parse of DMA remapping hardware definition structures. And also, introduce routines for device scope initialization which will be explicitly called during dma-remapping initialization. These will be used for enabling interrupt remapping separately from the existing DMA-remapping enabling sequence. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Cc: andi@firstfloor.org Cc: ebiederm@xmission.com Cc: jbarnes@virtuousgeek.org Cc: steiner@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 08:44:48 +02:00
Ingo Molnar	ae94b8075a	Merge branch 'linus' into x86/core Conflicts: arch/x86/mm/ioremap.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-12 07:29:02 +02:00
Aneesh Kumar K.V	06d6cf6959	mm: Add range_cont mode for writeback Filesystems like ext4 needs to start a new transaction in the writepages for block allocation. This happens with delayed allocation and there is limit to how many credits we can request from the journal layer. So we call write_cache_pages multiple times with wbc->nr_to_write set to the maximum possible value limitted by the max journal credits available. Add a new mode to writeback that enables us to handle this behaviour. In the new mode we update the wbc->range_start to point to the new offset to be written. Next call to call to write_cache_pages will start writeout from specified range_start offset. In the new mode we also limit writing to the specified wbc->range_end. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-07-11 19:27:31 -04:00
Mingming Cao	e8ced39d5e	percpu_counter: new function percpu_counter_sum_and_set Delayed allocation need to check free blocks at every write time. percpu_counter_read_positive() is not quit accurate. delayed allocation need a more accurate accounting, but using percpu_counter_sum_positive() is frequently is quite expensive. This patch added a new function to update center counter when sum per-cpu counter, to increase the accurate rate for next percpu_counter_read() and require less calling expensive percpu_counter_sum(). Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-07-11 19:27:31 -04:00
Alex Tomas	29a814d2ee	vfs: add hooks for ext4's delayed allocation support Export mpage_bio_submit() and __mpage_writepage() for the benefit of ext4's delayed allocation support. Also change __block_write_full_page so that if buffers that have the BH_Delay flag set it will call get_block() to get the physical block allocated, just as in the !BH_Mapped case. Signed-off-by: Alex Tomas <alex@clusterfs.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-07-11 19:27:31 -04:00
Jan Kara	87c89c232c	jbd2: Remove data=ordered mode support using jbd buffer heads Signed-off-by: Jan Kara <jack@suse.cz>	2008-07-11 19:27:31 -04:00
Jan Kara	c851ed5401	jbd2: Implement data=ordered mode handling via inodes This patch adds necessary framework into JBD2 to be able to track inodes with each transaction and write-out their dirty data during transaction commit time. This new ordered mode brings all sorts of advantages such as possibility to get rid of journal heads and buffer heads for data buffers in ordered mode, better ordering of writes on transaction commit, simplification of some JBD code, no more anonymous pages when truncate of data being committed happens. Also with this new ordered mode, delayed allocation on ordered mode is much simpler. Signed-off-by: Jan Kara <jack@suse.cz>	2008-07-11 19:27:31 -04:00
Jan Kara	f4c0a0fdfa	vfs: export filemap_fdatawrite_range() Make filemap_fdatawrite_range() function public, so that it can later be used in ordered mode rewrite by JBD/JBD2. Signed-off-by: Jan Kara <jack@suse.cz>	2008-07-11 19:27:31 -04:00
Theodore Ts'o	736603ab29	jbd2: Add commit time into the commit block Carlo Wood has demonstrated that it's possible to recover deleted files from the journal. Something that will make this easier is if we can put the time of the commit into commit block. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-07-11 19:27:31 -04:00
Steven Rostedt	af52a90a14	sched_clock: stop maximum check on NO HZ Working with ftrace I would get large jumps of 11 millisecs or more with the clock tracer. This killed the latencing timings of ftrace and also caused the irqoff self tests to fail. What was happening is with NO_HZ the idle would stop the jiffy counter and before the jiffy counter was updated the sched_clock would have a bad delta jiffies to compare with the gtod with the maximum. The jiffies would stop and the last sched_tick would record the last gtod. On wakeup, the sched clock update would compare the gtod + delta jiffies (which would be zero) and compare it to the TSC. The TSC would have correctly (with a stable TSC) moved forward several jiffies. But because the jiffies has not been updated yet the clock would be prevented from moving forward because it would appear that the TSC jumped too far ahead. The clock would then virtually stop, until the jiffies are updated. Then the next sched clock update would see that the clock was very much behind since the delta jiffies is now correct. This would then jump the clock forward by several jiffies. This caused ftrace to report several milliseconds of interrupts off latency at every resume from NO_HZ idle. This patch adds hooks into the nohz code to disable the checking of the maximum clock update when nohz is in effect. It resumes the max check when nohz has updated the jiffies again. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-11 15:53:26 +02:00
Steven Rostedt	a2bb6a3d85	ftrace: add ftrace_kill_atomic It has been suggested that I add a way to disable the function tracer on an oops. This code adds a ftrace_kill_atomic. It is not meant to be used in normal situations. It will disable the ftrace tracer, but will not perform the nice shutdown that requires scheduling. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-11 15:49:21 +02:00
David Woodhouse	a8931ef380	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-07-11 14:36:25 +01:00
Andre Noll	7e93a89251	md: Remove some unused macros. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>	2008-07-11 22:02:23 +10:00
Andre Noll	0f420358e3	md: Turn rdev->sb_offset into a sector-based quantity. Rename it to sb_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>	2008-07-11 22:02:23 +10:00
Neil Brown	0306d5efbf	Merge branch 'master' into for-next	2008-07-11 21:57:40 +10:00
Ingo Molnar	0c81b2a144	Merge branch 'linus' into core/rcu Conflicts: include/linux/rculist.h kernel/rcupreempt.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-11 10:46:50 +02:00
Alexander Duyck	d815653404	net: add netif_napi_del function to allow for removal of napistructs Adds netif_napi_del function which is used to remove the napi struct from the netdev napi_list in cases where CONFIG_NETPOLL was enabled. The motivation for adding this is to handle the case in which the number of queues on a device changes due to a configuration change. Previously the napi structs for each queue would be left in the list until the netdev was freed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-11 01:20:33 -04:00
Linus Torvalds	e5a5816f78	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) tun: Persistent devices can get stuck in xoff state xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info ipv6: missed namespace context in ipv6_rthdr_rcv netlabel: netlink_unicast calls kfree_skb on error path by itself ipv4: fib_trie: Fix lookup error return tcp: correct kcalloc usage ip: sysctl documentation cleanup Documentation: clarify tcp_{r,w}mem sysctl docs netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP netfilter: nf_conntrack_tcp: fix endless loop libertas: fix memory alignment problems on the blackfin zd1211rw: stop beacons on remove_interface rt2x00: Disable synchronization during initialization rc80211_pid: Fix fast_start parameter handling sctp: Add documentation for sctp sysctl variable ipv6: fix race between ipv6_del_addr and DAD timer irda: Fix netlink error path return value irda: New device ID for nsc-ircc irda: via-ircc proper dma freeing sctp: Mark the tsn as received after all allocations finish ...	2008-07-10 17:58:47 -07:00
Steffen Klassert	ccf9b3b83d	xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info Add a XFRM_STATE_AF_UNSPEC flag to handle the AF_UNSPEC behavior for the selector family. Userspace applications can set this flag to leave the selector family of the xfrm_state unspecified. This can be used to to handle inter family tunnels if the selector is not set from userspace. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-10 16:55:37 -07:00
David Woodhouse	f1485f3deb	ihex: request_ihex_firmware() function to load and validate firmware Provide a helper to load the file and validate it in one call, to simplify error handling in the drivers which are going to use it. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-07-10 14:47:38 +01:00
David Woodhouse	bacfe09dd7	ihex.h: binary representation of ihex records Some devices need their firmware as a set of {address, len, data...} records in some specific order rather than a simple blob. The normal way of doing this kind of thing is 'ihex', which is a text format and not entirely suitable for use in the kernel. This provides a binary representation which is very similar, but much more compact -- and a helper routine to skip to the next record, because the alignment constraints mean that everybody will screw it up for themselves otherwise. Also a helper function which can verify that a 'struct firmware' contains a valid set of ihex records, and that following them won't run off the end of the loaded data. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-07-10 14:47:36 +01:00
David Woodhouse	5658c76944	firmware: allow firmware files to be built into kernel image Some drivers have their own hacks to bypass the kernel's firmware loader and build their firmware into the kernel; this renders those unnecessary. Other drivers don't use the firmware loader at all, because they always want the firmware to be available. This allows them to start using the firmware loader. A third set of drivers already use the firmware loader, but can't be used without help from userspace, which sometimes requires an initrd. This allows them to work in a static kernel. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-07-10 14:30:13 +01:00
David Woodhouse	b7a39bd0af	firmware: make fw->data const In preparation for supporting firmware files linked into the static kernel, make fw->data const to ensure that users aren't modifying it (so that we can pass a pointer to the original in-kernel copy, rather than having to copy it). Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-07-10 14:29:25 +01:00
Herbert Xu	18e33e6d5c	crypto: hash - Move ahash functions into crypto/hash.h All new crypto interfaces should go into individual files as much as possible in order to ensure that crypto.h does not collapse under its own weight. This patch moves the ahash code into crypto/hash.h and crypto/internal/hash.h respectively. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-07-10 20:35:18 +08:00
Herbert Xu	166247f46a	crypto: hash - Removed vestigial ahash fields The base field in ahash_tfm appears to have been cut-n-pasted from ablkcipher. It isn't needed here at all. Similarly, the info field in ahash_request also appears to have originated from its cipher counter-part and is vestigial. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-07-10 20:35:18 +08:00
Loc Ho	004a403c2e	[CRYPTO] hash: Add asynchronous hash support This patch adds asynchronous hash and digest support. Signed-off-by: Loc Ho <lho@amcc.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-07-10 20:35:13 +08:00
Ingo Molnar	5373fdbdc1	Merge branch 'tracing/mmiotrace' into auto-ftrace-next	2008-07-10 11:43:06 +02:00
Ingo Molnar	bac0c9103b	Merge branch 'tracing/ftrace' into auto-ftrace-next	2008-07-10 11:43:00 +02:00
Ingo Molnar	9e4144abf8	Merge branch 'linus' into core/printk Conflicts: kernel/printk.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-10 08:17:14 +02:00
Russell King	f974a8ec96	Merge branch 'machtypes' into pxa-palm	2008-07-09 21:34:25 +01:00
Chuck Lever	0e0cab744b	NFS: use documenting macro constants for initializing ac{reg, dir}{min, max} Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:40 -04:00
Trond Myklebust	259875efed	NFS: set transport defaults after mount option parsing is finished Move the UDP/TCP default timeo/retrans settings for text mounts to nfs_init_timeout_values(), which was were they were always being initialised (and sanity checked) for binary mounts. Document the default timeout values using appropriate #defines. Ensure that we initialise and sanity check the transport protocols that may have been specified by the user. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:38 -04:00
Chuck Lever	ce3b7e1906	NFS: Add string length argument to nfs_parse_server_address To make nfs_parse_server_address() more generally useful, allow it to accept input strings that are not terminated with '\0'. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:28 -04:00
Trond Myklebust	e468bae97d	NFS: Allow redirtying of a completed unstable write. Currently, if an unstable write completes, we cannot redirty the page in order to reflect a new change in the page data until after we've sent a COMMIT request. This patch allows a page rewrite to proceed without the unnecessary COMMIT step, putting it immediately back onto the dirty page list, undoing the VM unstable write accounting, and removing the NFS_PAGE_TAG_COMMIT tag from the NFS radix tree. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:24 -04:00
Chuck Lever	34e8f92831	NFS: Move fs/nfs/iostat.h to include/linux The fs/nfs/iostat.h header has definitions that were designed to be exposed to user space. Move these definitions under include/linux so user space can use the definitions in applications that read /proc/self/mountstats. Also address a handful of coding style issues called out by checkpatch.pl in fs/nfs/iostat.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:17 -04:00
Trond Myklebust	46cb650c22	NFS: Remove the redundant file_open entry from struct nfs_rpc_ops All instances are set to nfs_open(), so we should just remove the redundant indirection. Ditto for the file_release op Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:16 -04:00
$\\\"J. Bruce Fields\\\$ \\\"J. Bruce Fields\\\	a486aeda9b	rpc: minor cleanup of scheduler callback code Try to make the comment here a little more clear and concise. Also, this macro definition seems unnecessary. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:14 -04:00
Olga Kornievskaia	b6b6152c46	rpc: bring back cl_chatty The cl_chatty flag alows us to control whether a given rpc client leaves "server X not responding, timed out" messages in the syslog. Such messages make sense for ordinary nfs clients (where an unresponsive server means applications on the mountpoint are probably hanging), but not for the callback client (which can fail more commonly, with the only result just of disabling some optimizations). Previously cl_chatty was removed, do to lack of users; reinstate it, and use it for the nfsd's callback client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:10 -04:00
Trond Myklebust	2116271a34	NFS: Add correct bounds checking to NFSv2 locks NFSv2 file locking currently fails the Connectathon tests, because the calls to the VFS locking code do not return an EINVAL error if the struct file_lock overflows the 32-bit boundaries. The problem is due to the fact that we occasionally call helpers from fs/locks.c in order to avoid RPC calls to the server when we know that a local process holds the lock. These helpers are, of course, always 64-bit enabled, so EINVAL is not returned in cases when it would if the call had gone to the NLM code. For consistency, we therefore add support for a bounds-checking helper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:40 -04:00
Ingo Molnar	d028203c04	Merge branch 'x86/core' into x86/unify-pci	2008-07-09 11:39:02 +02:00
Dave Kleikamp	aba46c5027	powerpc/mm: Define flags for Strong Access Ordering This patch defines: - PROT_SAO, which is passed into mmap() and mprotect() in the prot field - VM_SAO in vma->vm_flags, and - _PAGE_SAO, the combination of WIMG bits in the pte that enables strong access ordering for the page. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-09 16:30:45 +10:00
Dave Kleikamp	b845f313d7	mm: Allow architectures to define additional protection bits This patch allows architectures to define functions to deal with additional protections bits for mmap() and mprotect(). arch_calc_vm_prot_bits() maps additonal protection bits to vm_flags arch_vm_get_page_prot() maps additional vm_flags to the vma's vm_page_prot arch_validate_prot() checks for valid values of the protection bits Note: vm_get_page_prot() is now pretty ugly, but the generated code should be identical for architectures that don't define additional protection bits. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-09 16:30:45 +10:00
David S. Miller	79d16385c7	netdev: Move atomic queue state bits into netdev_queue. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:14:46 -07:00
David S. Miller	b19fa1fa91	net: Delete NETDEVICES_MULTIQUEUE kconfig option. Multiple TX queue support is a core networking feature. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:14:24 -07:00
David S. Miller	c773e847ea	netdev: Move _xmit_lock and xmit_lock_owner into netdev_queue. Accesses are mostly structured such that when there are multiple TX queues the code transformations will be a little bit simpler. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:13:53 -07:00
David S. Miller	86d804e10a	netdev: Make netif_schedule() routines work with netdev_queue objects. Only plain netif_schedule() remains taking a net_device, mostly as a compatability item while we transition the rest of these interfaces. Everything else calls netif_schedule_queue() or __netif_schedule(), both of which take a netdev_queue pointer. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:11:25 -07:00
David S. Miller	970565bbad	netdev: Move gso_skb into netdev_queue. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:10:33 -07:00
David S. Miller	ee609cb362	netdev: Move next_sched into struct netdev_queue. We schedule queues, not the device, for output queue processing in BH. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 22:58:37 -07:00
David S. Miller	816f3258e7	netdev: Kill qdisc_ingress, use netdev->rx_queue.qdisc instead. Now that our qdisc management is bi-directional, per-queue, and fully orthogonal, there is no reason to have a special ingress qdisc pointer in struct net_device. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 22:49:00 -07:00
David S. Miller	b0e1e6462d	netdev: Move rest of qdisc state into struct netdev_queue Now qdisc, qdisc_sleeping, and qdisc_list also live there. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:42:10 -07:00
David S. Miller	555353cfa1	netdev: The ingress_lock member is no longer needed. Every qdisc is assosciated with a queue, and in the case of ingress qdiscs that will now be netdev->rx_queue so using that queue's lock is the thing to do. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:33:13 -07:00
David S. Miller	dc2b48475a	netdev: Move queue_lock into struct netdev_queue. The lock is now an attribute of the device queue. One thing to notice is that "suspicious" places emerge which will need specific training about multiple queue handling. They are so marked with explicit "netdev->rx_queue" and "netdev->tx_queue" references. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:18:23 -07:00
David S. Miller	bb949fbd18	netdev: Create netdev_queue abstraction. A netdev_queue is an entity managed by a qdisc. Currently there is one RX and one TX queue, and a netdev_queue merely contains a backpointer to the net_device. The Qdisc struct is augmented with a netdev_queue pointer as well. Eventually the 'dev' Qdisc member will go away and we will have the resulting hierarchy: net_device --> netdev_queue --> Qdisc Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue pointer argument. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 16:55:56 -07:00
David S. Miller	54dceb008f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2008-07-08 15:39:41 -07:00
Patrick McHardy	11a100f844	vlan: avoid header copying and linearisation where possible - vlan_dev_reorder_header() is only called on the receive path after calling skb_share_check(). This means we can use skb_cow() since all we need is a writable header. - vlan_dev_hard_header() includes a work-around for some apparently broken out of tree MPLS code. The hard_header functions can expect to always have a headroom of at least there own hard_header_len available, so the reallocation check is unnecessary. - __vlan_put_tag() can use skb_cow_head() to avoid the skb_unshare() copy when the header is writable. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 15:36:57 -07:00
Haavard Skinnemoen	3bfb1d20b5	dmaengine: Driver for the Synopsys DesignWare DMA controller This adds a driver for the Synopsys DesignWare DMA controller (aka DMACA on AVR32 systems.) This DMA controller can be found integrated on the AT32AP7000 chip and is primarily meant for peripheral DMA transfer, but can also be used for memory-to-memory transfers. This patch is based on a driver from David Brownell which was based on an older version of the DMA Engine framework. It also implements the proposed extensions to the DMA Engine API for slave DMA operations. The dmatest client shows no problems, but there may still be room for improvement performance-wise. DMA slave transfer performance is definitely "good enough"; reading 100 MiB from an SD card running at ~20 MHz yields ~7.2 MiB/s average transfer rate. Full documentation for this controller can be found in the Synopsys DW AHB DMAC Databook: http://www.synopsys.com/designware/docs/iip/DW_ahb_dmac/latest/doc/dw_ahb_dmac_db.pdf The controller has lots of implementation options, so it's usually a good idea to check the data sheet of the chip it's intergrated on as well. The AT32AP7000 data sheet can be found here: http://www.atmel.com/dyn/products/datasheets.asp?family_id=682 Changes since v4: * Use client_count instead of dma_chan_is_in_use() * Add missing include * Unmap buffers unless client told us not to Changes since v3: * Update to latest DMA engine and DMA slave APIs * Embed the hw descriptor into the sw descriptor * Clean up and update MODULE_DESCRIPTION, copyright date, etc. Changes since v2: * Dequeue all pending transfers in terminate_all() * Rename dw_dmac.h -> dw_dmac_regs.h * Define and use controller-specific dma_slave data * Fix up a few outdated comments * Define hardware registers as structs (doesn't generate better code, unfortunately, but it looks nicer.) * Get number of channels from platform_data instead of hardcoding it based on CONFIG_WHATEVER_CPU. * Give slave clients exclusive access to the channel Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>, Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-08 11:59:42 -07:00
Haavard Skinnemoen	dc0ee6435c	dmaengine: Add slave DMA interface This patch adds the necessary interfaces to the DMA Engine framework to use functionality found on most embedded DMA controllers: DMA from and to I/O registers with hardware handshaking. In this context, hardware hanshaking means that the peripheral that owns the I/O registers in question is able to tell the DMA controller when more data is available for reading, or when there is room for more data to be written. This usually happens internally on the chip, but these signals may also be exported outside the chip for things like IDE DMA, etc. A new struct dma_slave is introduced. This contains information that the DMA engine driver needs to set up slave transfers to and from a slave device. Most engines supporting DMA slave transfers will want to extend this structure with controller-specific parameters. This additional information is usually passed from the platform/board code through the client driver. A "slave" pointer is added to the dma_client struct. This must point to a valid dma_slave structure iff the DMA_SLAVE capability is requested. The DMA engine driver may use this information in its device_alloc_chan_resources hook to configure the DMA controller for slave transfers from and to the given slave device. A new operation for preparing slave DMA transfers is added to struct dma_device. This takes a scatterlist and returns a single descriptor representing the whole transfer. Another new operation for terminating all pending transfers is added as well. The latter is needed because there may be errors outside the scope of the DMA Engine framework that may require DMA operations to be terminated prematurely. DMA Engine drivers may extend the dma_device, dma_chan and/or dma_slave_descriptor structures to allow controller-specific operations. The client driver can detect such extensions by looking at the DMA Engine's struct device, or it can request a specific DMA Engine device by setting the dma_dev field in struct dma_slave. dmaslave interface changes since v4: * Fix checkpatch errors * Fix changelog (there are no slave descriptors anymore) dmaslave interface changes since v3: * Use dma_data_direction instead of a new enum * Submit slave transfers as scatterlists * Remove the DMA slave descriptor struct dmaslave interface changes since v2: * Add a dma_dev field to struct dma_slave. If set, the client can only be bound to the DMA controller that corresponds to this device. This allows controller-specific extensions of the dma_slave structure; if the device matches, the controller may safely assume its extensions are present. * Move reg_width into struct dma_slave as there are currently no users that need to be able to set the width on a per-transfer basis. dmaslave interface changes since v1: * Drop the set_direction and set_width descriptor hooks. Pass the direction and width to the prep function instead. * Declare a dma_slave struct with fixed information about a slave, i.e. register addresses, handshake interfaces and such. * Add pointer to a dma_slave struct to dma_client. Can be NULL if the DMA_SLAVE capability isn't requested. * Drop the set_slave device hook since the alloc_chan_resources hook now has enough information to set up the channel for slave transfers. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-08 11:59:35 -07:00
Dan Williams	e1d181efb1	dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap In some cases client code may need the dma-driver to skip the unmap of source and/or destination buffers. Setting these flags indicates to the driver to skip the unmap step. In this regard async_xor is currently broken in that it allows the destination buffer to be unmapped while an operation is still in progress, i.e. when the number of sources exceeds the hardware channel's maximum (fixed in a subsequent patch). Acked-by: Saeed Bishara <saeed@marvell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-08 11:59:12 -07:00
Haavard Skinnemoen	848c536a37	dmaengine: Add dma_client parameter to device_alloc_chan_resources A DMA controller capable of doing slave transfers may need to know a few things about the slave when preparing the channel. We don't want to add this information to struct dma_channel since the channel hasn't yet been bound to a client at this point. Instead, pass a reference to the client requesting the channel to the driver's device_alloc_chan_resources hook so that it can pick the necessary information from the dma_client struct by itself. [dan.j.williams@intel.com: fixed up fsldma and mv_xor] Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-08 11:58:58 -07:00
Dan Williams	7cc5bf9a3a	dmaengine: track the number of clients using a channel Haavard's dma-slave interface would like to test for exclusive access to a channel. The standard channel refcounting is not sufficient in that it tracks more than just client references, it is also inaccurate as reference counts are percpu until the channel is removed. This change also enables a future fix to deallocate resources when a client declines to use a capable channel. Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-07-08 11:58:21 -07:00
Harvey Harrison	238f74a227	mac80211: move QOS control helpers into ieee80211.h Also remove the WLAN_IS_QOS_DATA inline after removing the last two users. This starts moving away from using rx->fc to using the header frame_control directly. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-08 14:15:59 -04:00
Bartlomiej Zolnierkiewicz	a861beb140	ide: add __ide_default_irq() inline helper Add __ide_default_irq() inline helper and use it instead of ide_default_irq() in ide-probe.c and ns87415.c (all host drivers except IDE PCI ones always setup hwif->irq so it is enough to check only for I/O bases 0x1f0 and 0x170). This fixes post-2.6.25 regression since ide_default_irq() define could shadow ide_default_irq() inline. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-07-08 19:27:22 +02:00
Bernhard Walle	69ac9cd629	sysfs: add /sys/firmware/memmap This patch adds /sys/firmware/memmap interface that represents the BIOS (or Firmware) provided memory map. The tree looks like: /sys/firmware/memmap/0/start (hex number) end (hex number) type (string) ... /1/start end type With the following shell snippet one can print the memory map in the same form the kernel prints itself when booting on x86 (the E820 map). --------- 8< -------------------------- #!/bin/sh cd /sys/firmware/memmap for dir in * ; do start=$(cat $dir/start) end=$(cat $dir/end) type=$(cat $dir/type) printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" done --------- >8 -------------------------- That patch only provides the needed interface: 1. The sysfs interface. 2. The structure and enumeration definition. 3. The function firmware_map_add() and firmware_map_add_early() that should be called from architecture code (E820/EFI, for example) to add the contents to the interface. If the kernel is compiled without CONFIG_FIRMWARE_MEMMAP, the interface does nothing without cluttering the architecture-specific code with #ifdef's. The purpose of the new interface is kexec: While /proc/iomem represents the used memory map (e.g. modified via kernel parameters like 'memmap' and 'mem'), the /sys/firmware/memmap tree represents the unmodified memory map provided via the firmware. So kexec can: - use the original memory map for rebooting, - use the /proc/iomem for setting up the ELF core headers for kdump case that should only represent the memory of the system. The patch has been tested on i386 and x86_64. Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Greg KH <gregkh@suse.de> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: kexec@lists.infradead.org Cc: yhlu.kernel@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 17:55:41 +02:00
Niels de Vos	f3d1eb19ab	Input: serio - trivial documentation fix In include/linux/serio.h two different define-series are documented as "Serio types". However the second series contains defines for the different protocols. Signed-off-by: Niels de Vos <niels.devos@wincor-nixdorf.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-08 10:33:01 -04:00
Ron Rindjunsky	429a380571	mac80211: add block ack request capability This patch adds block ack request capability Signed-off-by: Ester Kummer <ester.kummer@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-08 10:21:34 -04:00
Yinghai Lu	3c999f1426	x86: check command line when CONFIG_X86_MPPARSE is not set, v2 if acpi=off, acpi=noirq and pci=noacpi, we need to disable apic. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:31 +02:00
Yinghai Lu	d52d53b8a5	RFC x86: try to remove arch_get_ram_range want to remove arch_get_ram_range, and use early_node_map instead. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:27 +02:00
Jeremy Fitzhardinge	a7bf0bd5e6	build: add __page_aligned_data and __page_aligned_bss Making a variable page-aligned by using __attribute__((section(".data.page_aligned"))) is fragile because if sizeof(variable) is not also a multiple of page size, it leaves variables in the remainder of the section unaligned. This patch introduces two new qualifiers, __page_aligned_data and __page_aligned_bss to set the section and the alignment of variables. This makes page-aligned variables more robust because the linker will make sure they're aligned properly. Unfortunately it requires all page-aligned data to use these macros... Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:13 +02:00
Patrick McHardy	9bb8582efb	vlan: TCI related type and naming cleanups The VLAN code contains multiple spots that use tag, id and tci as identifiers for arguments and variables incorrectly and they actually contain or are expected to contain something different. Additionally types are used inconsistently (unsigned short vs u16) and identifiers are sometimes capitalized. - consistently use u16 for storing TCI, ID or QoS values - consistently use vlan_id and vlan_tci for storing the respective values - remove capitalization - add kdoc comment to netif_hwaccel_{rx,receive_skb} Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:24:44 -07:00
Patrick McHardy	df6b6a0cf6	vlan: remove useless struct hlist_node declaration from if_vlan.h Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:24:14 -07:00
Patrick McHardy	22d1ba74bb	vlan: move struct vlan_dev_info to private header Hide struct vlan_dev_info from drivers to prevent them from growing more creative ways to use it. Provide accessors for the two drivers that currently use it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:23:57 -07:00
Patrick McHardy	7750f403cb	vlan: uninline __vlan_hwaccel_rx The function is huge and included at least once in every VLAN acceleration capable driver. Uninline it; to avoid having drivers depend on the VLAN module, the function is always built in statically when VLAN is enabled. With all VLAN acceleration capable drivers that build on x86_64 enabled, this results in: text data bss dec hex filename 6515227 854044 343968 7713239 75b1d7 vmlinux.inlined 6505637 854044 343968 7703649 758c61 vmlinux.uninlined ---------------------------------------------------------- -9590 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:23:36 -07:00
Patrick McHardy	acc81e1465	vlan: fix network_header/mac_header adjustments Lennert Buytenhek points out that the VLAN code incorrectly adjusts skb->network_header to point in the middle of the VLAN header and additionally tries to adjust skb->mac_header without checking for validity. The network_header should not be touched at all since we're only adding headers in front of it, mac_header adjustments are not necessary at all. Based on patch by Lennert Buytenhek <buytenh@wantstofly.org>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:21:27 -07:00
Richard Kennedy	2c693610fe	net: remove padding from struct socket on 64bit & increase objects/cache remove padding from struct socket reducing its size by 8 bytes. This allows more objects/cache in sock_inode_cache 12 objects/cache when cacheline size is 128 (generic x86_64) Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 03:03:01 -07:00
Ingo Molnar	2b4fa851b2	Merge branch 'x86/numa' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/e820.c arch/x86/kernel/efi_64.c arch/x86/kernel/mpparse.c arch/x86/kernel/setup.c arch/x86/kernel/setup_32.c arch/x86/mm/init_64.c include/asm-x86/proto.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:59:23 +02:00
Joonwoo Park	4ad3f26162	netfilter: fix string extension for case insensitive pattern matching The flag XT_STRING_FLAG_IGNORECASE indicates case insensitive string matching. netfilter can find cmd.exe, Cmd.exe, cMd.exe and etc easily. A new revision 1 was added, in the meantime invert of xt_string_info was moved into flags as a flag. If revision is 1, The flag XT_STRING_FLAG_INVERT indicates invert matching. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 02:38:56 -07:00
Joonwoo Park	dde77e6044	textsearch: convert kmalloc + memset to kzalloc convert kmalloc + memset to kzalloc for alloc_ts_config Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 02:38:40 -07:00
Joonwoo Park	b9c7967831	textsearch: support for case insensitive searching The function textsearch_prepare has a new flag to support case insensitive searching. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 02:37:31 -07:00
Adrian Bunk	cdf060a5d3	netfilter: cleanup netfilter_ipv6.h userspace header Kernel functions are not for userspace. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 02:36:40 -07:00
Mike Travis	3461b0af02	x86: remove static boot_cpu_pda array v2 * Remove the boot_cpu_pda array and pointer table from the data section. Allocate the pointer table and array during init. do_boot_cpu() will reallocate the pda in node local memory and if the cpu is being brought up before the bootmem array is released (after_bootmem = 0), then it will free the initial pda. This will happen for all cpus present at system startup. This removes 512k + 32k bytes from the data section. For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:25 +02:00
Ingo Molnar	3de352bbd8	Merge branch 'x86/mpparse' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/io_apic_32.c arch/x86/kernel/setup_64.c arch/x86/mm/init_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:14:58 +02:00
Yinghai Lu	b5bc6c0e55	x86, mm: use add_highpages_with_active_regions() for high pages init v2 use early_node_map to init high pages, so we can remove page_is_ram() and page_is_reserved_early() in the big loop with add_one_highpage also remove page_is_reserved_early(), it is not needed anymore. v2: fix the build of other platforms Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:37:25 +02:00
Yinghai Lu	cc1050bafe	x86: replace shrink_active_range() with remove_active_range() in case we have kva before ramdisk on a node, we still need to use those ranges. v2: reserve_early kva ram area, in case there are holes in highmem, to avoid those area could be treat as free high pages. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:29 +02:00
Yinghai Lu	d2dbf34332	x86: clean up reserve_bootmem_generic() and port it to 32-bit 1. add reserve_bootmem_generic for 32bit 2. change len to unsigned long 3. make early_res_to_bootmem to use it Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:17 +02:00
Ingo Molnar	896395c290	Merge branch 'linus' into tmp.x86.mpparse.new	2008-07-08 10:32:56 +02:00
Ingo Molnar	1b8ba39a3f	Merge branch 'x86/irq' into x86/devel Conflicts: arch/x86/kernel/i8259.c arch/x86/kernel/irqinit_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:53:57 +02:00
Ingo Molnar	58cf35228f	Merge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug', 'x86/ptrace' and 'x86/amd-iommu' into x86/devel	2008-07-08 09:46:15 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Neil Brown	0529613a19	Merge branch 'for-neil' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/md into for-next	2008-07-08 10:13:28 +10:00
Neil Brown	5b1a4bf220	Merge branch 'master' into for-next	2008-07-08 10:11:50 +10:00
Rafael J. Wysocki	337001b6c4	PCI: Simplify PCI device PM code If the offset of PCI device's PM capability in its configuration space, the mask of states that the device supports PME# from and the D1 and D2 support bits are cached in the corresponding struct pci_dev, the PCI device PM code can be simplified quite a bit. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-07 16:26:50 -07:00
Rafael J. Wysocki	404cc2d8ce	PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep Introduce functions pci_prepare_to_sleep() and pci_back_from_sleep(), to be used by the PCI drivers that want to place their devices into the lowest power state appropiate for them (PCI_D3hot, if the device is not supposed to wake up the system, or the deepest state from which the wake-up is possible, otherwise) while the system is being prepared to go into a sleeping state and to put them back into D0 during the subsequent transition to the working state. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-07 16:26:33 -07:00
Rafael J. Wysocki	eb9d0fe40e	PCI ACPI: Rework PCI handling of wake-up * Introduce function acpi_pm_device_sleep_wake() for enabling and disabling the system wake-up capability of devices that are power manageable by ACPI. * Introduce function acpi_bus_can_wakeup() allowing other (dependent) subsystems to check if ACPI is able to enable the system wake-up capability of given device. * Introduce callback .sleep_wake() in struct pci_platform_pm_ops and for the ACPI PCI 'driver' make it use acpi_pm_device_sleep_wake(). * Introduce callback .can_wakeup() in struct pci_platform_pm_ops and for the ACPI 'driver' make it use acpi_bus_can_wakeup(). * Move the PME# handlig code out of pci_enable_wake() and split it into two functions, pci_pme_capable() and pci_pme_active(), allowing the caller to check if given device is capable of generating PME# from given power state and to enable/disable the device's PME# functionality, respectively. * Modify pci_enable_wake() to use the new ACPI callbacks and the new PME#-related functions. * Drop the generic .platform_enable_wakeup() callback that is not used any more. * Introduce device_set_wakeup_capable() that will set the power.can_wakeup flag of given device. * Rework PCI device PM initialization so that, if given device is capable of generating wake-up events, either natively through the PME# mechanism, or with the help of the platform, its power.can_wakeup flag is set and its power.should_wakeup flag is unset as appropriate. * Make ACPI set the power.can_wakeup flag for devices found to be wake-up capable by it. * Make the ACPI wake-up code enable/disable GPEs for devices that have the wakeup.flags.prepared flag set (which means that their wake-up power has been enabled). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-07 16:26:28 -07:00
Greg KH	c6c4f070a6	PCI: make pci_name use dev_name Also fixes up the sparc code that was assuming this is not a constant. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-07 16:02:40 -07:00
Claudio Nieder	7342239273	Input: add driver for Tabletkiosk Sahara TouchIT-213 touchscreen Signed-off-by: Claudio Nieder <private@claudio.ch> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-07-07 12:00:30 -04:00
Dmitry Baryshkov	f024ff10b1	[ARM] 5128/1: tc6393xb: tmio-nand support Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-07 13:22:08 +01:00
Dmitry Baryshkov	aa613de676	[ARM] 5127/1: Core MFD support This patch provides a common subdevice registration system for MFD type chips, using platfrom device. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-07 13:22:06 +01:00
Dmitry Baryshkov	d6315949ac	[ARM] 5096/2: Support Toshiba TC6393XB Mobile I/O Controller. Add support for Toshiba TC6393XB companion chip. Currently only GPIO and part of IRQ features of the device are supported. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-07 13:22:01 +01:00
Ingo Molnar	d763d5edf9	Merge branch 'linus' into tracing/mmiotrace	2008-07-07 08:07:35 +02:00
Ingo Molnar	032f82786f	Merge commit 'v2.6.26-rc9' into sched/devel	2008-07-07 08:01:26 +02:00
Ingo Molnar	9982fbface	Revert "cpumask: introduce new APIs" This reverts commit `acb7669c12`. the wrappers are not needed anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-06 14:24:08 +02:00
Ingo Molnar	68083e05d7	Merge commit 'v2.6.26-rc9' into cpus4096	2008-07-06 14:23:39 +02:00
David S. Miller	ea2aca084b	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wan/hdlc_fr.c drivers/net/wireless/iwlwifi/iwl-4965.c drivers/net/wireless/iwlwifi/iwl3945-base.c	2008-07-05 23:08:07 -07:00
Patrick McHardy	70c03b49b8	vlan: Add GVRP support Add GVRP support for dynamically registering VLANs with switches. By default GVRP is disabled because we only support the applicant-only participant model, which means it should not be enabled on vlans that are members of a bridge. Since there is currently no way to cleanly determine that, the user is responsible for enabling it. The code is pretty small and low impact, its wrapped in a config option though because it depends on the GARP implementation and the STP core. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-05 21:26:57 -07:00
Patrick McHardy	eca9ebac65	net: Add GARP applicant-only participant Add an implementation of the GARP (Generic Attribute Registration Protocol) applicant-only participant. This will be used by the following patch to add GVRP support to the VLAN code. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-05 21:26:13 -07:00
David S. Miller	4ce2417bfb	Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-07-05 21:03:31 -07:00
Eduard - Gabriel Munteanu	ca31e146d5	Move _RET_IP_ and _THIS_IP_ to include/linux/kernel.h These two macros are useful beyond lock debugging. Moved definitions from include/linux/debug_locks.h to include/linux/kernel.h, so code that needs them does not have to include the former, which would have been a less intuitive choice of a header. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-05 13:10:50 -07:00
Stephen Rothwell	acb7669c12	cpumask: introduce new APIs In linux-next there is a commit ("x86: Add performance variants of cpumask operators") which, as part of the 4096 cpu support work adds some new APIs for dealing with cpu masks. Add trivial versions of these now so that subsystems can update in a timely manner and avoid conflicts in linux-next and the next merge window. Cc: Mike Travis <travis@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:09 -07:00
Andres Salomon	e08c1694d9	olpc: sdhci: add quirk for the Marvell CaFe's vdd/powerup issue This has been sitting around unloved for way too long.. The Marvell CaFe chip's SD implementation chokes during card insertion if one attempts to set the voltage and power up in the same SDHCI_POWER_CONTROL register write. This adds a quirk that does that particular dance in two steps. It also adds an entry to pci_ids.h for the CaFe chip's SD device. Signed-off-by: Andres Salomon <dilinger@debian.org> Cc: Pierre Ossman <drzeus-list@drzeus.cx> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:09 -07:00
Andrew G. Morgan	086f7316f0	security: filesystem capabilities: fix fragile setuid fixup code This commit includes a bugfix for the fragile setuid fixup code in the case that filesystem capabilities are supported (in access()). The effect of this fix is gated on filesystem capability support because changing securebits is only supported when filesystem capabilities support is configured.) [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:08 -07:00
Stephen Rothwell	93921f5c2c	Introduce rculist.h In linux-next there is a commit ("rcu: split list.h and move rcu-protected lists into rculist.h") that moved the rcu related list iterators from list.h to rculist.h. Add a trivial version of the file now so that various subsystem trees can start using it now for -next changes and so reduce the build errors caused by adding uses of the moved functions. Cc: Franck Bui-Huu <fbuihuu@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@kernel.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:07 -07:00
Miguel Ojeda	450c622e9f	Miguel Ojeda has moved Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:05 -07:00
James Bottomley	69d44a1835	firmware: fix the request_firmware() dummy > the build (.config attached) failed, make ends with : > ... > UPD include/linux/compile.h > CC init/version.o > LD init/built-in.o > LD vmlinux > drivers/built-in.o: In function `sas_request_addr': > (.text+0x33bab): undefined reference to `request_firmware' > drivers/built-in.o: In function `sas_request_addr': > (.text+0x33c3f): undefined reference to `release_firmware' > make: *** [vmlinux] Error 1 There's a slight fault in the stub logic. It fails for FW_LOADER=m and the user =y. This should fix it. This patch fixes the following 2.6.26-rc regression: http://bugzilla.kernel.org/show_bug.cgi?id=10730 Reviewed-by: Toralf Foerster <toralf.foerster@gmx.de> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:04 -07:00
Christoph Lameter	cde5353599	Christoph has moved Remove all clameter@sgi.com addresses from the kernel tree since they will become invalid on June 27th. Change my maintainer email address for the slab allocators to cl@linux-foundation.org (which will be the new email address for the future). Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-04 10:40:04 -07:00
Linus Torvalds	3ea9eed493	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slub: Do not use 192 byte sized cache if minimum alignment is 128 byte	2008-07-04 09:48:21 -07:00
Krzysztof Halasa	198191c4a7	WAN: convert drivers to use built-in netdev_stats There is no point in using separate net_device_stats structs when the one in struct net_device is present. Compiles. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-04 08:47:41 -04:00
Heiko Carstens	ba8dd03ac0	generic-ipi: fix s390 build bug forgot to remove #include <linux/spinlock.h> from linux/smp.h while fixing the original s390 build bug. Patch below fixes this build bug caused by header inclusion dependencies: CC kernel/timer.o In file included from include/linux/spinlock.h:87, from include/linux/smp.h:11, from include/linux/kernel_stat.h:4, from kernel/timer.c:22: include/asm/spinlock.h: In function '__raw_spin_lock': include/asm/spinlock.h:69: error: implicit declaration of function 'smp_processor_id' Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-04 11:26:40 +02:00
FUJITA Tomonori	27f8221af4	block: add blk_queue_update_dma_pad This adds blk_queue_update_dma_pad to prevent LLDs from overwriting the dma pad mask wrongly (we added blk_queue_update_dma_alignment due to the same reason). This also converts libata to use blk_queue_update_dma_pad instead of blk_queue_dma_pad. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-04 09:52:13 +02:00
J. Bruce Fields	e86322f611	Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 into for-2.6.27	2008-07-03 16:24:06 -04:00
Christoph Lameter	41d54d3bf8	slub: Do not use 192 byte sized cache if minimum alignment is 128 byte The 192 byte cache is not necessary if we have a basic alignment of 128 byte. If it would be used then the 192 would be aligned to the next 128 byte boundary which would result in another 256 byte cache. Two 256 kmalloc caches cause sysfs to complain about a duplicate entry. MIPS needs 128 byte aligned kmalloc caches and spits out warnings on boot without this patch. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-07-03 19:01:55 +03:00
Philipp Zabel	3b73125af6	[ARM] 5044/1: pwm_bl: add init/notify/exit callbacks This allows platform code to manipulate GPIOs and brightness level as needed. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-03 13:25:05 +01:00
eric miao	42796d37da	[ARM] pxa: add generic PWM backlight driver Patch mostly from Eric Miao, with minor edits by rmk to convert Eric's driver to a generic PWM-based backlight driver. Signed-off-by: eric miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-07-03 13:25:00 +01:00
Jens Axboe	e48ec69005	block: extend queue_flag bitops Add test_and_clear and test_and_set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:15 +02:00
Alasdair G Kergon	cc371e66e3	Add bvec_merge_data to handle stacked devices and ->merge_bvec() When devices are stacked, one device's merge_bvec_fn may need to perform the mapping and then call one or more functions for its underlying devices. The following bio fields are used: bio->bi_sector bio->bi_bdev bio->bi_size bio->bi_rw using bio_data_dir() This patch creates a new struct bvec_merge_data holding a copy of those fields to avoid having to change them directly in the struct bio when going down the stack only to have to change them back again on the way back up. (And then when the bio gets mapped for real, the whole exercise gets repeated, but that's a problem for another day...) Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: Milan Broz <mbroz@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:15 +02:00
Jens Axboe	b24498d477	block: integrity flags can't use bit ops on unsigned short Just use normal open coded bit operations instead, they need not be atomic. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:15 +02:00
Adel Gadllah	0b07de85a7	allow userspace to modify scsi command filter on per device basis This patch exports the per-gendisk command filter to user space through sysfs, so it can be changed by the system administrator. All users of the old cmd filter have been converted to use the new one. Original patch from Peter Jones. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:14 +02:00
Jens Axboe	6e2401ad6f	block: integrity cleanups - No need to check for NULL bio, we'll get an immediate oops anyway. - Make bio_integrity() a proper function. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:14 +02:00
Jens Axboe	da9cbc8739	block: blkdev.h cleanup, move iocontext stuff to iocontext.h Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:14 +02:00
Martin K. Petersen	7ba1ba12ee	block: Block layer data integrity support Some block devices support verifying the integrity of requests by way of checksums or other protection information that is submitted along with the I/O. This patch implements support for generating and verifying integrity metadata, as well as correctly merging, splitting and cloning bios and requests that have this extra information attached. See Documentation/block/data-integrity.txt for more information. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Martin K. Petersen	51d654e1d8	block: Globalize bio_set and bio_vec_slab Move struct bio_set and biovec_slab definitions to bio.h so they can be used outside of bio.c. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Jens Axboe	244b4d56f8	block: kill request_queue_t Everything was moved to struct request_queue a few kernel revisions ago, maintaining the deprecated typedef to avoid breaking things. Now the time has come to get rid of that typedef. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:12 +02:00
Alan D. Brunelle	02c62304e6	Added in user-injected messages into blk traces This allows a user to annotate the blk trace stream: writing a suitable message to {/sys/kernel/debug}/block/<dsf>/msg will have it propagated into the trace stream. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:12 +02:00
Rusty Russell	f43798c276	tun: Allow GSO using virtio_net_hdr Add a IFF_VNET_HDR flag. This uses the same ABI as virtio_net (ie. prepending struct virtio_net_hdr to packets) to indicate GSO and checksum information. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:48:02 -07:00
Rusty Russell	5228ddc98f	tun: TUNSETFEATURES to set gso features. ethtool is useful for setting (some) device fields, but it's root-only. Finer feature control is available through a tun-specific ioctl. (Includes Mark McLoughlin <markmc@redhat.com>'s fix to hold rtnl sem). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:46:16 -07:00
Rusty Russell	07240fd090	tun: Interface to query tun/tap features. The problem with introducing checksum offload and gso to tun is they need to set dev->features to enable GSO and/or checksumming, which is supposed to be done before register_netdevice(), ie. as part of TUNSETIFF. Unfortunately, TUNSETIFF has always just ignored flags it doesn't understand, so there's no good way of detecting whether the kernel supports new IFF_ flags. This patch implements a TUNGETFEATURES ioctl which returns all the valid IFF flags. It could be extended later to include other features. Here's an example program which uses it: #include <linux/if_tun.h> #include <sys/types.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <fcntl.h> #include <err.h> #include <stdio.h> static struct { unsigned int flag; const char *name; } known_flags[] = { { IFF_TUN, "TUN" }, { IFF_TAP, "TAP" }, { IFF_NO_PI, "NO_PI" }, { IFF_ONE_QUEUE, "ONE_QUEUE" }, }; int main() { unsigned int features, i; int netfd = open("/dev/net/tun", O_RDWR); if (netfd < 0) err(1, "Opening /dev/net/tun"); if (ioctl(netfd, TUNGETFEATURES, &features) != 0) { printf("Kernel does not support TUNGETFEATURES, guessing\n"); features = (IFF_TUN\|IFF_TAP\|IFF_NO_PI\|IFF_ONE_QUEUE); } printf("Available features are: "); for (i = 0; i < sizeof(known_flags)/sizeof(known_flags[0]); i++) { if (features & known_flags[i].flag) { features &= ~known_flags[i].flag; printf("%s ", known_flags[i].name); } } if (features) printf("(UNKNOWN %#x)", features); printf("\n"); return 0; } Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:45:32 -07:00
YOSHIFUJI Hideaki	e0835f8fa5	ipv4,ipv6 mroute: Add some helper inline functions to remove ugly ifdefs. ip{,v6}_mroute_{set,get}sockopt() should not matter by optimization but it would be better not to depend on optimization semantically. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-07-03 17:51:57 +09:00
Wang Chen	03d2f897e9	ipv4: Do cleanup for ip_mr_init Same as ip6_mr_init(), make ip_mr_init() return errno if fails. But do not do error handling in inet_init(), just print a msg. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-07-03 17:51:57 +09:00
Wang Chen	623d1a1af7	ipv6: Do cleanup for ip6_mr_init. If do not do it, we will get following issues: 1. Leaving junks after inet6_init failing halfway. 2. Leaving proc and notifier junks after ipv6 modules unloading. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-07-03 17:51:56 +09:00
YOSHIFUJI Hideaki	1b34be74cb	ipv6 addrconf: add accept_dad sysctl to control DAD operation. - If 0, disable DAD. - If 1, perform DAD (default). - If >1, perform DAD and disable IPv6 operation if DAD for MAC-based link-local address has been failed (RFC4862 5.4.5). We do not follow RFC4862 by default. Refer to the netdev thread entitled "Linux IPv6 DAD not full conform to RFC 4862 ?" http://www.spinics.net/lists/netdev/msg52027.html Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-07-03 17:51:56 +09:00
YOSHIFUJI Hideaki	778d80be52	ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-07-03 17:51:55 +09:00
Linus Torvalds	0e77a07ff9	Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block: Properly notify block layer of sync writes block: Fix the starving writes bug in the anticipatory IO scheduler	2008-07-02 19:25:36 -07:00
Linus Torvalds	f7572da502	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Fix bad hint about irqs in i2c.h i2c: Documentation: fix device matching description	2008-07-02 19:00:29 -07:00
Linus Torvalds	821b03ffac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits) net: fib_rules: fix error code for unsupported families netdevice: Fix wrong string handle in kernel command line parsing net: Tyop of sk_filter() comment netlink: Unneeded local variable net-sched: fix filter destruction in atm/hfsc qdisc destruction net-sched: change tcf_destroy_chain() to clear start of filter list ipv4: fix sysctl documentation of time related values mac80211: don't accept WEP keys other than WEP40 and WEP104 hostap: fix sparse warnings hostap: don't report useless WDS frames by default textsearch: fix Boyer-Moore text search bug netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags. netlabel: Fix a problem when dumping the default IPv6 static labels net/inet_lro: remove setting skb->ip_summed when not LRO-able inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild CONNECTOR: add a proc entry to list connectors netlink: Fix some doc comments in net/netlink/attr.c tcp: /proc/net/tcp rto,ato values not scaled properly (v2) include/linux/netdevice.h: don't export MAX_HEADER to userspace ...	2008-07-02 18:43:16 -07:00
Andi Kleen	9465efc9e9	Remove BKL from remote_llseek v2 - Replace remote_llseek with generic_file_llseek_unlocked (to force compilation failures in all users) - Change all users to either use generic_file_llseek_unlocked directly or take the BKL around. I changed the file systems who don't use the BKL for anything (CIFS, GFS) to call it directly. NCPFS and SMBFS and NFS take the BKL, but explicitely in their own source now. I moved them all over in a single patch to avoid unbisectable sections. Open problem: 32bit kernels can corrupt fpos because its modification is not atomic, but they can do that anyways because there's other paths who modify it without BKL. Do we need a special lock for the pos/f_version = 0 checks? Trond says the NFS BKL is likely not needed, but keep it for now until his full audit. v2: Use generic_file_llseek_unlocked instead of remote_llseek_unlocked and factor duplicated code (suggested by hch) Cc: Trond.Myklebust@netapp.com Cc: swhiteho@redhat.com Cc: sfrench@samba.org Cc: vandrove@vc.cvut.cz Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-07-02 15:06:27 -06:00
Tom Tucker	8948896c9e	svcrdma: Change WR context get/put to use the kmem cache Change the WR context pool to be shared across mount points. This reduces the RDMA transport memory footprint significantly since idle mounts don't consume WR context memory. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:02:02 -05:00
Tom Tucker	779a48577b	svcrdma: Remove unused wait q from svcrdma_xprt structure The sc_read_wait queue head is no longer used. Remove it. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:58 -05:00
Tom Tucker	87295b6c5c	svcrdma: Add dma map count and WARN_ON Add a dma map count in order to verify that all DMA mapping resources have been freed when the transport is closed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:56 -05:00
Tom Tucker	f820c57ebf	svcrdma: Use reply and chunk map for RDMA_READ processing Modify the RDMA_READ processing to use the reply and chunk list mapping data types. Also add a special purpose 'hdr_count' field in in the context to hold the header page count instead of overloading the SGE length field and corrupting the DMA map length. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:55 -05:00
Tom Tucker	ab96dddbed	svcrdma: Add a type for keeping NFS RPC mapping Create a new data structure to hold the remote client address space to local server address space mapping. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:53 -05:00
Santwona Behera	0853ad66b1	netdev: Add support for rx flow hash configuration, using ethtool. Added new interfaces to ethtool to configure receive network flow distribution across multiple rx rings using hashing. Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-02 03:47:41 -07:00
Wolfram Sang	8e29da9ee8	i2c: Fix bad hint about irqs in i2c.h i2c.h mentions -1 as a not-issued irq. This false hint was taken by of_i2c and caused crashes. Don't give any advice as 'no irq' is not consistent across all architectures yet and it is not needed internally by the i2c-core. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-07-01 22:38:18 +02:00
Jens Axboe	18ce3751cc	Properly notify block layer of sync writes fsync_buffers_list() and sync_dirty_buffer() both issue async writes and then immediately wait on them. Conceptually, that makes them sync writes and we should treat them as such so that the IO schedulers can handle them appropriately. This patch fixes a write starvation issue that Lin Ming reported, where xx is stuck for more than 2 minutes because of a large number of synchronous IO in the system: INFO: task kjournald:20558 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff810010820978 6712 20558 2 ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2 ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb 0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537 Call Trace: [<ffffffff803ba6f2>] kobject_get+0x12/0x17 [<ffffffff80247537>] getnstimeofday+0x2f/0x83 [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d195>] io_schedule+0x5d/0x9f [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff80243909>] wake_bit_function+0x0/0x23 [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6 [<ffffffff8023a676>] lock_timer_base+0x26/0x4b [<ffffffff8030300a>] kjournald+0xc1/0x1fb [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e [<ffffffff80302f49>] kjournald+0x0/0x1fb [<ffffffff802437bb>] kthread+0x47/0x74 [<ffffffff8022de51>] schedule_tail+0x28/0x5d [<ffffffff8020cac8>] child_rip+0xa/0x12 [<ffffffff80243774>] kthread+0x0/0x74 [<ffffffff8020cabe>] child_rip+0x0/0x12 Lin Ming confirms that this patch fixes the issue. I've run tests with it for the past week and no ill effects have been observed, so I'm proposing it for inclusion into 2.6.26. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-01 09:07:34 +02:00
Michael Neuling	f3e909c275	powerpc: Update for VSX core file and ptrace This correctly hooks the VSX dump into Roland McGrath core file infrastructure. It adds the VSX dump information as an additional elf note in the core file (after talking more to the tool chain/gdb guys). This also ensures the formats are consistent between signals, ptrace and core files. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-01 14:47:09 +10:00
Adrian Bunk	9b4a8dd2e9	drivers/macintosh: Various cleanups This contains the following cleanups: - make the following needlessly global code static: - adb.c: adb_controller - adb.c: adb_init() - adbhid.c: adb_to_linux_keycodes[] (also make it const) - via-pmu68k.c: backlight_level - via-pmu68k.c: backlight_enabled - remove the following unused code: - via-pmu68k.c: sleep_notifier_list Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-01 11:28:06 +10:00
Dan Williams	b5470dc5fc	md: resolve external metadata handling deadlock in md_allow_write md_allow_write() marks the metadata dirty while holding mddev->lock and then waits for the write to complete. For externally managed metadata this causes a deadlock as userspace needs to take the lock to communicate that the metadata update has completed. Change md_allow_write() in the 'external' case to start the 'mark active' operation and then return -EAGAIN. The expected side effects while waiting for userspace to write 'active' to 'array_state' are holding off reshape (code currently handles -ENOMEM), cause some 'stripe_cache_size' change requests to fail, cause some GET_BITMAP_FILE ioctl requests to fall back to GFP_NOIO, and cause updates to 'raid_disks' to fail. Except for 'stripe_cache_size' changes these failures can be mitigated by coordinating with mdmon. md_write_start() still prevents writes from occurring until the metadata handler has had a chance to take action as it unconditionally waits for MD_CHANGE_CLEAN to be cleared. [neilb@suse.de: return -EAGAIN, try GFP_NOIO] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-06-30 17:18:19 -07:00
Randy Dunlap	80be038593	PCI: add stub for pci_set_consistent_dma_mask() When CONFIG_PCI=n, there is no stub for pci_set_consistent_dma_mask(), so add one like other similar stubs. Otherwise there can be build errors, as here: linux-next-20080630/drivers/ssb/main.c:1175: error: implicit declaration of function 'pci_set_consistent_dma_mask' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-06-30 11:41:36 -07:00
Linus Torvalds	e1441b9a41	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fix locking in force-feedback core Input: add KEY_MEDIA_REPEAT definition	2008-06-30 08:58:09 -07:00
Richard Lemon	3cadd2d989	Input: Add driver for iNexio serial touchscreen. Signed-off-by: Richard Lemon <richard@codelemon.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-06-30 09:37:32 -04:00
Bastien Nocera	4bbff7e408	Input: add KEY_MEDIA_REPEAT definition This patch adds the Repeat key to the input layer. The usage in the HUT is 0xBC (listed under "15.7 Transport Controls"). Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-06-30 09:25:12 -04:00
Paul Mackerras	e9a4b6a3f6	Merge branch 'linux-2.6'	2008-06-30 10:16:50 +10:00
Linus Torvalds	0acbbee440	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled. ACPI: don't walk tables if ACPI was disabled thermal: Create CONFIG_THERMAL_HWMON=n	2008-06-29 12:22:30 -07:00
Linus Torvalds	535e49f48e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: kbuild: fix a.out.h export to userspace with O= build.	2008-06-29 12:21:56 -07:00
Linus Torvalds	a4480ac4f9	Merge branch 'audit.b52' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b52' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] remove useless argument type in audit_filter_user() [PATCH] audit: fix kernel-doc parameter notation [PATCH] kernel/audit.c: nlh->nlmsg_type is gotten more than once	2008-06-29 12:15:10 -07:00
Linus Torvalds	4f46accee4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [patch 2/3] vfs: dcache cleanups [patch 1/3] vfs: dcache sparse fixes [patch 3/3] vfs: make d_path() consistent across mount operations [patch 4/4] flock: remove unused fields from file_lock_operations [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink [patch 2/4] fs: make struct file arg to d_path const [patch 1/4] vfs: path_{get,put}() cleanups [patch for 2.6.26 4/4] vfs: utimensat(): fix write access check for futimens() [patch for 2.6.26 3/4] vfs: utimensat(): fix error checking for {UTIME_NOW,UTIME_OMIT} case [patch for 2.6.26 1/4] vfs: utimensat(): ignore tv_sec if tv_nsec == UTIME_OMIT or UTIME_NOW [patch for 2.6.26 2/4] vfs: utimensat(): be consistent with utime() for immutable and append-only files [PATCH] fix cgroup-inflicted breakage in block_dev.c	2008-06-29 12:14:37 -07:00
David S. Miller	28f49d8fec	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2008-06-28 22:57:58 -07:00
David S. Miller	332e4af80d	Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-06-28 21:28:46 -07:00
Jussi Kivilinna	818727badc	rndis_host: pass buffer length to rndis_command Pass buffer length to rndis_command so that rndis_command can read full response buffer from device instead of max CONTROL_BUFFER_SIZE bytes. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-06-28 10:23:34 -04:00
David S. Miller	1b63ba8a86	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl4965-base.c	2008-06-28 01:19:40 -07:00
Eli Cohen	251a4b320f	net/inet_lro: remove setting skb->ip_summed when not LRO-able When an SKB cannot be chained to a session, the current code attempts to "restore" its ip_summed field from lro_mgr->ip_summed. However, lro_mgr->ip_summed does not hold the original value; in fact, we'd better not touch skb->ip_summed since it is not modified by the code in the path leading to a failure to chain it. Also use a cleaer comment to the describe the ip_summed field of struct net_lro_mgr. Issue raised by Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-27 20:09:00 -07:00
Adrian Bunk	c88e6f51c2	include/linux/netdevice.h: don't export MAX_HEADER to userspace Due to the CONFIG_'s the value is anyway not correct in userspace. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-27 19:54:54 -07:00
Dan Williams	d8ee0728b5	md: replace R5_WantPrexor with R5_WantDrain, add 'prexor' reconstruct_states From: Dan Williams <dan.j.williams@intel.com> Currently ops_run_biodrain and other locations have extra logic to determine which blocks are processed in the prexor and non-prexor cases. This can be eliminated if handle_write_operations5 flags the blocks to be processed in all cases via R5_Wantdrain. The presence of the prexor operation is tracked in sh->reconstruct_state. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:32:06 +10:00
Dan Williams	600aa10993	md: replace STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} with 'reconstruct_states' From: Dan Williams <dan.j.williams@intel.com> Track the state of reconstruct operations (recalculating the parity block usually due to incoming writes, or as part of array expansion) Reduces the scope of the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags to only tracking whether a reconstruct operation has been requested via the ops_request field of struct stripe_head_state. This is the final step in the removal of ops.{pending,ack,complete,count}, i.e. the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags only request an operation and do not track the state of the operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:32:05 +10:00
Dan Williams	ecc65c9b3f	md: replace STRIPE_OP_CHECK with 'check_states' From: Dan Williams <dan.j.williams@intel.com> The STRIPE_OP_* flags record the state of stripe operations which are performed outside the stripe lock. Their use in indicating which operations need to be run is straightforward; however, interpolating what the next state of the stripe should be based on a given combination of these flags is not straightforward, and has led to bugs. An easier to read implementation with minimal degrees of freedom is needed. Towards this goal, this patch introduces explicit states to replace what was previously interpolated from the STRIPE_OP_* flags. For now this only converts the handle_parity_checks5 path, removing a user of the ops.{pending,ack,complete,count} fields of struct stripe_operations. This conversion also found a remaining issue with the current code. There is a small window for a drive to fail between when we schedule a repair and when the parity calculation for that repair completes. When this happens we will writeback to 'failed_num' when we really want to write back to 'pd_idx'. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:57 +10:00
Dan Williams	2b7497f0e0	md: kill STRIPE_OP_IO flag From: Dan Williams <dan.j.williams@intel.com> The R5_Want{Read,Write} flags already gate i/o. So, this flag is superfluous and we can unconditionally call ops_run_io(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:52 +10:00
Dan Williams	b203886edb	md: kill STRIPE_OP_MOD_DMA in raid5 offload From: Dan Williams <dan.j.williams@intel.com> This micro-optimization allowed the raid code to skip a re-read of the parity block after checking parity. It took advantage of the fact that xor-offload-engines have their own internal result buffer and can check parity without writing to memory. Remove it for the following reasons: 1/ It is a layering violation for MD to need to manage the DMA and non-DMA paths within async_xor_zero_sum 2/ Bad precedent to toggle the 'ops' flags outside the lock 3/ Hard to realize a performance gain as reads will not need an updated parity block and writes will dirty it anyways. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:50 +10:00
Neil Brown	526647320e	Make sure all changes to md/dev-XX/state are notified The important state change happens during an interrupt in md_error. So just set a flag there and call sysfs_notify later in process context. Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:44 +10:00
Neil Brown	72a23c211e	Make sure all changes to md/sync_action are notified. When the 'resync' thread starts or stops, when we explicitly set sync_action, or when we determine that there is definitely nothing to do, we notify sync_action. To stop "sync_action" from occasionally showing the wrong value, we introduce a new flags - MD_RECOVERY_RECOVER - to say that a recovery is probably needed or happening, and we make sure that we set MD_RECOVERY_RUNNING before clearing MD_RECOVERY_NEEDED. Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:41 +10:00
Neil Brown	5e96ee65c8	Allow setting start point for requested check/repair This makes it possible to just resync a small part of an array. e.g. if a drive reports that it has questionable sectors, a 'repair' of just the region covering those sectors will cause them to be read and, if there is an error, re-written with correct data. Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:24 +10:00
Neil Brown	a0da84f35b	Improve setting of "events_cleared" for write-intent bitmaps. When an array is degraded, bits in the write-intent bitmap are not cleared, so that if the missing device is re-added, it can be synced by only updated those parts of the device that have changed since it was removed. The enable this a 'events_cleared' value is stored. It is the event counter for the array the last time that any bits were cleared. Sometimes - if a device disappears from an array while it is 'clean' - the events_cleared value gets updated incorrectly (there are subtle ordering issues between updateing events in the main metadata and the bitmap metadata) resulting in the missing device appearing to require a full resync when it is re-added. With this patch, we update events_cleared precisely when we are about to clear a bit in the bitmap. We record events_cleared when we clear the bit internally, and copy that to the superblock which is written out before the bit on storage. This makes it more "obviously correct". We also need to update events_cleared when the event_count is going backwards (as happens on a dirty->clean transition of a non-degraded array). Thanks to Mike Snitzer for identifying this problem and testing early "fixes". Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:22 +10:00
David Woodhouse	b660398101	kbuild: fix a.out.h export to userspace with O= build. We need to check for existence of the a.out.h header in the source tree, not the object tree, if we want it to get the right answer with O=. Signed-off-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-06-27 23:13:54 +02:00
Wang Chen	9433f6dd3a	PCI: Fix comment of pci_dynids struct pci_driver has no field of driver_data. It's in pci_device_id. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-06-27 13:06:54 -07:00
Luis R. Rodriguez	ffd7891dc9	mac80211: Let drivers have access to TKIP key offets for TX and RX MIC Some drivers may want to to use the TKIP key offsets for TX and RX MIC so lets move this out. Lets also clear up a bit how this is used internally in mac80211. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-27 09:09:17 -04:00
Michael Buesch	f225763a7d	ssb, b43, b43legacy, b44: Rewrite SSB DMA API This is a rewrite of the DMA API for SSB devices. This is needed, because the old (non-existing) "API" made too many bad assumptions on the API of the host-bus (PCI). This introduces an almost complete SSB-DMA-API that maps to the lowlevel bus-API based on the bustype. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-27 09:09:15 -04:00
Peter Zijlstra	2398f2c6d3	sched: update shares on wakeup We found that the affine wakeup code needs rather accurate load figures to be effective. The trouble is that updating the load figures is fairly expensive with group scheduling. Therefore ratelimit the updating. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-27 14:31:45 +02:00
Peter Zijlstra	b6a86c746f	sched: fix sched_domain aggregation Keeping the aggregate on the first cpu of the sched domain has two problems: - it could collide between different sched domains on different cpus - it could slow things down because of the remote accesses Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-27 14:31:32 +02:00
Peter Zijlstra	c09595f63b	sched: revert revert of: fair-group: SMP-nice for group scheduling Try again.. Initial commit: `18d95a2832` Revert: `6363ca57c7` Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-27 14:31:29 +02:00
Steven Whitehouse	b2cad26cfc	[GFS2] Remove obsolete conversion deadlock avoidance code This is only used by GFS1 so can be removed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-06-27 09:39:47 +01:00
Steven Whitehouse	1bdad60633	[GFS2] Remove remote lock dropping code There are several reasons why this is undesirable: 1. It never happens during normal operation anyway 2. If it does happen it causes performance to be very, very poor 3. It isn't likely to solve the original problem (memory shortage on remote DLM node) it was supposed to solve 4. It uses a bunch of arbitrary constants which are unlikely to be correct for any particular situation and for which the tuning seems to be a black art. 5. In an N node cluster, only 1/N of the dropped locked will actually contribute to solving the problem on average. So all in all we are better off without it. This also makes merging the lock_dlm module into GFS2 a bit easier. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-06-27 09:39:44 +01:00
Assaf Krauss	b662348662	mac80211: 11h - Handling measurement request This patch handles the 11h measurement request information element. This is minimal requested implementation - refuse measurement. Signed-off-by: Assaf Krauss <assaf.krauss@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 16:49:14 -04:00
Assaf Krauss	f2df38596a	mac80211: 11h Infrastructure - Parsing This patch introduces parsing of 11h and 11d related elements from incoming management frames. Signed-off-by: Assaf Krauss <assaf.krauss@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 16:49:14 -04:00
Henrique de Moraes Holschuh	5005657cbd	rfkill: rename the rfkill_state states and add block-locked state The current naming of rfkill_state causes a lot of confusion: not only the "kill" in rfkill suggests negative logic, but also the fact that rfkill cannot turn anything on (it can just force something off or stop forcing something off) is often forgotten. Rename RFKILL_STATE_OFF to RFKILL_STATE_SOFT_BLOCKED (transmitter is blocked and will not operate; state can be changed by a toggle_radio request), and RFKILL_STATE_ON to RFKILL_STATE_UNBLOCKED (transmitter is not blocked, and may operate). Also, add a new third state, RFKILL_STATE_HARD_BLOCKED (transmitter is blocked and will not operate; state cannot be changed through a toggle_radio request), which is used by drivers to indicate a wireless transmiter was blocked by a hardware rfkill line that accepts no overrides. Keep the old names as #defines, but document them as deprecated. This way, drivers can be converted to the new names and verified to actually use rfkill correctly one by one. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 14:21:22 -04:00
Henrique de Moraes Holschuh	79399a8d19	rfkill: add notifier chains support Add a notifier chain for use by the rfkill class. This notifier chain signals the following events (more to be added when needed): 1. rfkill: rfkill device state has changed A pointer to the rfkill struct will be passed as a parameter. The notifier message types have been added to include/linux/rfkill.h instead of to include/linux/notifier.h in order to avoid the madness of modifying a header used globally (and that triggers an almost full tree rebuild every time it is touched) with information that is of interest only to code that includes the rfkill.h header. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 14:21:21 -04:00
Henrique de Moraes Holschuh	477576a073	rfkill: add the WWAN radio type Unfortunately, instead of adding a generic Wireless WAN type, a technology- specific type (WiMAX) was added. That's useless for other WWAN devices, such as EDGE, UMTS, X-RTT and other such radios. Add a WWAN rfkill type for generic wireless WAN devices. No keys are added as most devices really want to use KEY_WLAN for WWAN control (in a cycle of none, WLAN, WWAN, WLAN+WWAN) and need no specific keycode added. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Iñaky Pérez-González <inaky.perez-gonzalez@intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh	801e49af4c	rfkill: add read-write rfkill switch support Currently, rfkill support for read/write rfkill switches is hacked through a round-trip over the input layer and rfkill-input to let a driver sync rfkill->state to hardware changes. This is buggy and sub-optimal. It causes real problems. It is best to think of the rfkill class as supporting only write-only switches at the moment. In order to implement the read/write functionality properly: Add a get_state() hook that is called by the class every time it needs to fetch the current state of the switch. Add a call to this hook every time the current state of the radio plays a role in a decision. Also add a force_state() method that can be used to forcefully syncronize the class' idea of the current state of the switch. This allows for a faster implementation of the read/write functionality, as a driver which get events on switch changes can avoid the need for a get_state() hook. If the get_state() hook is left as NULL, current behaviour is maintained, so this change is fully backwards compatible with the current rfkill drivers. For hardware that issues events when the rfkill state changes, leave get_state() NULL in the rfkill struct, set the initial state properly before registering with the rfkill class, and use the force_state() method in the driver to keep the rfkill interface up-to-date. get_state() can be called by the class from atomic context. It must not sleep. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh	f3146aff7f	rfkill: clarify meaning of rfkill states rfkill really should have been named rfswitch. As it is, one can get confused whether RFKILL_STATE_ON means the KILL switch is on (and therefore, the radio is being blocked from operating), or whether it means the RADIO rf output is on. Clearly state that RFKILL_STATE_ON means the radio is unblocked from operating (i.e. there is no rf killing going on). Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-06-26 14:21:19 -04:00
Jens Axboe	15c8b6c1aa	on_each_cpu(): kill unused 'retry' parameter It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:38 +02:00
Jens Axboe	8691e5a8f6	smp_call_function: get rid of the unused nonatomic/retry argument It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:35 +02:00
Jens Axboe	3d44223327	Add generic helpers for arch IPI function calls This adds kernel/smp.c which contains helpers for IPI function calls. In addition to supporting the existing smp_call_function() in a more efficient manner, it also adds a more scalable variant called smp_call_function_single() for calling a given function on a single CPU only. The core of this is based on the x86-64 patch from Nick Piggin, lots of changes since then. "Alan D. Brunelle" <Alan.Brunelle@hp.com> has contributed lots of fixes and suggestions as well. Also thanks to Paul E. McKenney <paulmck@linux.vnet.ibm.com> for reviewing RCU usage and getting rid of the data allocation fallback deadlock. Acked-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:21:34 +02:00
Ingo Molnar	9a13150109	Merge commit 'v2.6.26-rc8' into core/rcu	2008-06-26 09:24:23 +02:00
Rene Herman	16d7523973	thermal: Create CONFIG_THERMAL_HWMON=n A bug in libsensors <= 2.10.6 is exposed when this new hwmon I/F is enabled. Create CONFIG_THERMAL_HWMON=n until some time after libsensors 2.10.7 ships so those users can run the latest kernel. libsensors 3.x is already fixed -- those users can use CONFIG_THERMAL_HWMON=y now. Signed-off-by: Rene Herman <rene.herman@gmail.com> Acked-by: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-06-25 19:25:42 -04:00
John W. Linville	1839cea91e	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/wireless-2.6	2008-06-25 15:17:58 -04:00
Ingo Molnar	c7e745c6de	Merge commit 'v2.6.26-rc8' into core/softlockup	2008-06-25 17:49:08 +02:00
Ingo Molnar	cbd6712406	Merge branch 'linus' into x86/irq	2008-06-25 12:30:49 +02:00
Ingo Molnar	28f73e51d0	Merge branch 'linus' into x86/delay Conflicts: arch/x86/kernel/tsc_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 12:30:10 +02:00
Ingo Molnar	97e6722b8d	Merge branch 'linus' into tracing/ftrace	2008-06-25 12:27:56 +02:00
Ingo Molnar	773dc8eaca	Merge branch 'linus' into sched/new-API-sched_setscheduler	2008-06-25 12:27:05 +02:00
Ingo Molnar	f57aec5a87	Merge branch 'linus' into sched/devel Conflicts: kernel/sched_rt.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 12:26:59 +02:00
Ingo Molnar	ace7f1b796	Merge branch 'linus' into core/softirq	2008-06-25 12:23:59 +02:00
Ingo Molnar	d02859ecb3	Merge commit 'v2.6.26-rc8' into x86/xen Conflicts: arch/x86/xen/enlighten.c arch/x86/xen/mmu.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 12:16:51 +02:00
Peng Haitao	d8de72473e	[PATCH] remove useless argument type in audit_filter_user() The second argument "type" is not used in audit_filter_user(), so I think that type can be removed. If I'm wrong, please tell me. Signed-off-by: Peng Haitao <penght@cn.fujitsu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-06-24 23:36:35 -04:00
Ben Dooks	f8dd0ecbb7	DM9000: Allow the use of the NSR register to get link status. The DM9000's internal PHY reports a copy of the link status in the NSR register of the chip. Reading the status when polling for link status is faster as it eliminates the need to sleep, but does not print as much information. Add an platform flag to force this behaviour, and a Kconfig option to allow it to be forced to the faster method always. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-06-24 22:58:07 -04:00
Alok Kataria	f3f3149f35	x86: use cpu_khz for loops_per_jiffy calculation, cleanup As suggested by Ingo, remove all references to tsc from init/calibrate.c TSC is x86 specific, and using tsc in variable names in a generic file should be avoided. lpj_tsc is now called lpj_fine, since it is related to fine tuning of lpj value. Also tsc_rate_* is called timer_rate_* Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Daniel Hecht <dhecht@vmware.com> Cc: Tim Mann <mann@vmware.com> Cc: Zach Amsden <zach@vmware.com> Cc: Sahil Rihan <srihan@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:53:46 +02:00
Li Zefan	a033c332e0	lockdep: remove duplicate definition of STATIC_LOCKDEP_MAP_INIT STATIC_LOCKDEP_MAP_INIT is defined twice in lockdep.h. I guess it's a copy & paste. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 13:43:07 +02:00
Marcelo Tosatti	06e0564566	KVM: close timer injection race window in __vcpu_run If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-24 12:16:59 +03:00
Eilon Greenstein	34f80b04f3	bnx2x: Add support for BCM57711 HW Supporting the 57711 and 57711E - refers to in the code as E1H. The 57710 is referred to as E1. To support the new members in the family, the bnx2x structure was divided to 3 parts: common, port and function. These changes caused some rearrangement in the bnx2x.h file. A set of accessories macros were added to make access to the bnx2x structure more readable Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-23 20:33:01 -07:00
Rusty Russell	961ccddd59	sched: add new API sched_setscheduler_nocheck: add a flag to control access checks Hidehiro Kawai noticed that sched_setscheduler() can fail in stop_machine: it calls sched_setscheduler() from insmod, which can have CAP_SYS_MODULE without CAP_SYS_NICE. Two cases could have failed, so are changed to sched_setscheduler_nocheck: kernel/softirq.c:cpu_callback() - CPU hotplug callback kernel/stop_machine.c:__stop_machine_run() - Called from various places, including modprobe() Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Cc: sugita <yumiko.sugita.yf@hitachi.com> Cc: Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-23 22:57:56 +02:00
Alok Kataria	3da757daf8	x86: use cpu_khz for loops_per_jiffy calculation On the x86 platform we can use the value of tsc_khz computed during tsc calibration to calculate the loops_per_jiffy value. Its very important to keep the error in lpj values to minimum as any error in that may result in kernel panic in check_timer. In virtualization environment, On a highly overloaded host the guest delay calibration may sometimes result in errors beyond the ~50% that timer_irq_works can handle, resulting in the guest panicking. Does some formating changes to lpj_setup code to now have a single printk to print the bogomips value. We do this only for the boot processor because the AP's can have different base frequencies or the BIOS might boot a AP at a different frequency. Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Daniel Hecht <dhecht@vmware.com> Cc: Tim Mann <mann@vmware.com> Cc: Zach Amsden <zach@vmware.com> Cc: Sahil Rihan <srihan@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-23 22:51:33 +02:00
Abhishek Sagar	ecea656d1d	ftrace: freeze kprobe'd records Let records identified as being kprobe'd be marked as "frozen". The trouble with records which have a kprobe installed on their mcount call-site is that they don't get updated. So if such a function which is currently being traced gets its tracing disabled due to a new filter rule (or because it was added to the notrace list) then it won't be updated and continue being traced. This patch allows scanning of all frozen records during tracing to check if they should be traced. Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-23 22:10:58 +02:00
Abhishek Sagar	785656a41f	kprobes: enable clean usage of get_kprobe Allow clean use of get_kprobe() outside of core kprobe code. Ftrace makes use of get_kprobe to identify probes installed on mcount call-sites. Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: jkenisto@us.ibm.com Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-23 22:10:57 +02:00
Abhishek Sagar	395a59d0f8	ftrace: store mcount address in rec->ip Record the address of the mcount call-site. Currently all archs except sparc64 record the address of the instruction following the mcount call-site. Some general cleanups are entailed. Storing mcount addresses in rec->ip enables looking them up in the kprobe hash table later on to check if they're kprobe'd. Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com> Cc: davem@davemloft.net Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-23 22:10:56 +02:00
Kevin Coffman	d00953a53e	gss_krb5: create a define for token header size and clean up ptr location cleanup: Document token header size with a #define instead of open-coding it. Don't needlessly increment "ptr" past the beginning of the header which makes the values passed to functions more understandable and eliminates the need for extra "krb5_hdr" pointer. Clean up some intersecting white-space issues flagged by checkpatch.pl. This leaves the checksum length hard-coded at 8 for DES. A later patch cleans that up. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:47:25 -04:00
Alan Cox	36c7343b4e	tty_driver: Update required method documentation Some of the requirement rules are now more relaxed. Also correct a contradiction in the previous update Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-23 10:36:47 -07:00
Miklos Szeredi	8837abcab3	nfsd: rename MAY_ flags Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it clear, that these are not used outside nfsd, and to avoid name and number space conflicts with the VFS. [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:50 -04:00
Jeff Layton	a75c5d01e4	sunrpc: remove sv_kill_signal field from svc_serv struct Since we no longer make any distinction between shutdown signals with nfsd, then it becomes easier to just standardize on a particular signal to use to bring it down (SIGINT, in this case). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Jeff Layton	9867d76ca1	knfsd: convert knfsd to kthread API This patch is rather large, but I couldn't figure out a way to break it up that would remain bisectable. It does several things: - change svc_thread_fn typedef to better match what kthread_create expects - change svc_pool_map_set_cpumask to be more kthread friendly. Make it take a task arg and and get rid of the "oldmask" - have svc_set_num_threads call kthread_create directly - eliminate __svc_create_thread Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Neil Brown	bedbdd8bad	knfsd: Replace lock_kernel with a mutex for nfsd thread startup/shutdown locking. This removes the BKL from the RPC service creation codepath. The BKL really isn't adequate for this job since some of this info needs protection across sleeps. Also, add some comments to try and clarify how the locking should work and to make it clear that the BKL isn't necessary as long as there is adequate locking between tasks when touching the svc_serv fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Benny Halevy	0d169ca136	nfsd: eliminate unused nfs4_callback.cb_stat The cb_stat member of struct nfs4_callback is unused since commit `ff7d9756` nfsd: use static memory for callback program and stats Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:48 -04:00
Benny Halevy	a5e561fee6	nfsd: eliminate unused nfs4_callback.cb_program The cb_program member of struct nfs4_callback unused since commit `ff7d9756` nfsd: use static memory for callback program and stats Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:48 -04:00
J. Bruce Fields	7c11337d9d	nfsd: remove three unused NFS4_ACE_* defines These flag bits aren't used by either the protocol or our implementation, so I don't know why they were here. Thanks to Johann Dahm for running across these. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Johann Dahm <jdahm@umich.edu>	2008-06-23 13:02:48 -04:00
Denis V. Lunev	f9f48ec72b	[patch 4/4] flock: remove unused fields from file_lock_operations fl_insert and fl_remove are not used right now in the kernel. Remove them. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-06-23 11:52:30 -04:00
Jan Engelhardt	20d4fdc1a7	[patch 2/4] fs: make struct file arg to d_path const Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-06-23 11:52:30 -04:00
Ingo Molnar	1de8644cc7	Merge branch 'linus' into sched/devel	2008-06-23 11:30:23 +02:00
Ingo Molnar	1e74f9cbbb	Merge branch 'linus' into core/rcu	2008-06-23 11:29:11 +02:00
Ingo Molnar	f34bfb1bee	Merge branch 'linus' into tracing/ftrace	2008-06-23 11:11:42 +02:00

... 16 17 18 19 20 ...

12658 Commits